综合智慧能源 ›› 2025, Vol. 47 ›› Issue (7): 44-54. DOI: 10.3969/j.issn.2097-0706.2025.07.005

• Game Theory and Power Market Decision-Making •

Coordinated optimization scheduling of integrated energy system based on PER-MADDPG algorithm with two-layer network

CHEN Liang, LIU Guiying*, SU Shiping, TANG Changjiu, WANG Chenhao, GUO Sitong

  1. School of Electrical and Information Engineering, Changsha University of Science and Technology, Changsha 410114, China
  • Received: 2025-02-11  Revised: 2025-03-17  Published: 2025-07-25
  • Corresponding author: LIU Guiying (b. 1964), female, associate professor, M.Sc.; research interests: power system operation and control, grid integration of renewable power generation. Email: 3281437417@qq.com
  • About the authors: CHEN Liang (b. 2000), male, master's student; research interest: power system operation and control. Email: 1130714459@qq.com
    SU Shiping (b. 1963), male, professor, Ph.D.; research interests: energy routers, power quality, smart grids, grid integration of renewable power generation. Email: sushiping@126.com
  • Supported by: Natural Science Foundation of Hunan Province of China (2023JJ40053)


Abstract:

To ensure the economic operation of an integrated energy system (IES), a coordinated optimization scheduling method based on energy routers is proposed to address issues of traditional model-driven scheduling methods, such as the difficulty of solving the optimization scheduling model, slow convergence, and unsatisfactory performance. The IES was divided into three regions by three energy routers (electrical, thermal, and cooling). The energy devices were modeled, and a Markov cooperative game model for IES optimization scheduling was established, forming a framework of centralized training and distributed execution. A multi-agent deep deterministic policy gradient (MADDPG) algorithm based on an improved two-layer Actor-Critic network was adopted, in which the two-layer Critic network evaluated action values to avoid overestimation. In addition, a prioritized experience replay (PER) mechanism was incorporated to improve the utilization of the data in the experience replay pool without reducing its diversity, yielding the proposed PER-MADDPG algorithm. Simulation results showed that, compared with the unimproved algorithm, PER-MADDPG shortened the computation time by 10.13 s and reduced the average daily scheduling cost by 1 638.13 yuan, achieving coordinated optimization scheduling of the IES while ensuring its economic efficiency.
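The abstract does not specify the network internals, but a common realization of a two-layer Critic that curbs overestimation is clipped double-Q learning in the style of TD3: two independent Critic heads are trained, and the Bellman target takes the smaller of the two estimates. The PyTorch sketch below illustrates this reading only; the class name TwinCritic, the hidden sizes, and td_target are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a twin-Critic target, assuming a TD3-style clipped
# double-Q reading of the paper's "two-layer Critic network". Taking the
# minimum of two independent estimates damps action-value overestimation.
import torch
import torch.nn as nn


class TwinCritic(nn.Module):
    """Two independent Critic heads over the joint (state, action) input."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()

        def head() -> nn.Sequential:
            return nn.Sequential(
                nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )

        self.q1, self.q2 = head(), head()

    def forward(self, state: torch.Tensor, action: torch.Tensor):
        x = torch.cat([state, action], dim=-1)
        return self.q1(x), self.q2(x)


@torch.no_grad()
def td_target(target_critic: TwinCritic, reward: torch.Tensor,
              next_state: torch.Tensor, next_action: torch.Tensor,
              done: torch.Tensor, gamma: float = 0.99) -> torch.Tensor:
    """Bellman target built from the smaller of the two target estimates."""
    q1, q2 = target_critic(next_state, next_action)
    return reward + gamma * (1.0 - done) * torch.min(q1, q2)
```

In MADDPG's centralized-training setting, state and action would be the joint observation and joint action of all agents; each agent's Critic regresses onto this target during training, while the Actors are executed in a distributed fashion.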

Key words: integrated energy system, coordinated optimization scheduling, Markov game, energy router, two-layer Actor-Critic network, prioritized experience replay mechanism
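For the prioritized experience replay mechanism named in the keywords, the standard proportional scheme samples transitions with probability proportional to their temporal-difference (TD) error and corrects the induced bias with importance-sampling weights. A self-contained sketch follows; alpha, beta, and the buffer layout are conventional defaults from the PER literature, not values taken from the paper.

```python
# Minimal proportional prioritized experience replay (PER) sketch: sampling
# probability grows with a transition's TD error, raising the reuse of
# informative samples while the pool itself keeps its full diversity.
import numpy as np


class PrioritizedReplay:
    def __init__(self, capacity: int, alpha: float = 0.6, beta: float = 0.4):
        self.capacity, self.alpha, self.beta = capacity, alpha, beta
        self.data: list = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition) -> None:
        # New transitions enter at the current maximum priority so each is
        # replayed at least once before being down-weighted.
        p = self.priorities.max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size: int):
        p = self.priorities[: len(self.data)] ** self.alpha
        probs = p / p.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        # Importance-sampling weights compensate for the non-uniform draw.
        weights = (len(self.data) * probs[idx]) ** (-self.beta)
        weights /= weights.max()
        return [self.data[i] for i in idx], idx, weights

    def update_priorities(self, idx, td_errors) -> None:
        # Store |TD error| plus a small floor so no priority collapses to 0.
        self.priorities[idx] = np.abs(td_errors) + 1e-6
```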
