Integrated Intelligent Energy


Dynamic optimization modeling of multi-energy complementary energy systems based on multi-agent reinforcement learning

CHEN Feng, LU Xiaomin, HU Ke, SHEN Bing, WANG Junpeng

  1. College of Applied Engineering, Henan University of Science and Technology, Henan 472000, China
  2. R&D Department, Zhengzhou Inspur Data Technology Co., Ltd., Henan 450000, China
  3. School of Electrical Engineering, North China University of Water Resources and Electric Power, Henan 450018, China
  4. School of Electrical Engineering, Zhengzhou University of Aeronautics, Henan 450018, China
  • Received: 2025-07-18 Revised: 2025-12-05
  • Supported by:
    National Natural Science Foundation of China (72342104)

Abstract: Multi-energy complementary energy systems with a high share of renewable energy are difficult to optimize dynamically and cooperatively, and traditional centralized methods are limited both in coordinating the interests of multiple stakeholders and in responding in real time. To address these problems, a dynamic optimization model is developed. A three-layer multi-agent reinforcement learning framework (physical layer, decision layer, coordination layer) is constructed, in which energy producers, consumers and the system dispatcher are modeled as independent agents. Based on an improved proximal policy optimization (PPO) algorithm, a dynamic reward function integrating economy, environmental protection and stability is designed, and distributed decision-making with global coordination is achieved through centralized training with decentralized execution. A case study on a typical park-level multi-energy complementary system shows that the proposed model raises the renewable energy accommodation rate to 95.3% and reduces the cost per kilowatt-hour by 18.7%. Under a 50% sudden load change, the time for the system to recover stability is shortened to 90 s, which is 90% less than with the traditional mixed-integer programming method. With ±20% wind and solar forecast errors, the load satisfaction rate remains at 98.7%. The dynamic optimization model effectively addresses multi-agent coordination and adaptation to uncertainty in multi-energy complementary systems, and provides technical support for real-time optimal scheduling of energy systems with high renewable penetration.

Key words: multi-energy complementary energy system, multi-agent reinforcement learning, dynamic optimization modeling, source-grid-load-storage coordination, renewable energy accommodation, collaborative scheduling
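The abstract mentions a dynamic reward function that combines economy, environmental protection and stability, but does not give its form. As a rough illustration only, the following Python sketch shows one way such a reward could be assembled as a weighted sum whose weights shift toward stability when the supply-demand imbalance grows. All names (`StepMetrics`, `dynamic_weights`, `reward`), the base weights (0.5, 0.3, 0.2) and the normalization scales are assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class StepMetrics:
    """Per-step quantities an agent might observe (hypothetical names)."""
    cost: float             # operating cost in this step
    emissions: float        # emissions produced in this step
    power_imbalance: float  # |supply - demand| in this step


def dynamic_weights(imbalance_ratio: float) -> tuple[float, float, float]:
    """Return (economy, environment, stability) weights that sum to 1.

    imbalance_ratio is |supply - demand| / demand, clipped to [0, 1].
    The stability weight grows from 0.2 to 0.8 as the ratio rises;
    the rest is split between economy and environment in a 5:3 ratio.
    These numbers are illustrative assumptions.
    """
    r = min(max(imbalance_ratio, 0.0), 1.0)
    w_stab = 0.2 + 0.6 * r
    remaining = 1.0 - w_stab
    return remaining * 0.625, remaining * 0.375, w_stab


def reward(m: StepMetrics, demand: float,
           cost_scale: float = 100.0, emis_scale: float = 50.0) -> float:
    """Negative weighted sum of normalized cost, emissions and imbalance."""
    ratio = m.power_imbalance / demand if demand > 0 else 0.0
    w_econ, w_env, w_stab = dynamic_weights(ratio)
    return -(w_econ * m.cost / cost_scale
             + w_env * m.emissions / emis_scale
             + w_stab * min(ratio, 1.0))
```

In a centralized training with decentralized execution setup, a reward of this kind would typically be computed per agent or per system step and fed to the PPO update; how the paper actually shares or shapes rewards across agents is not stated in the abstract.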