综合智慧能源 ›› 2025, Vol. 47 ›› Issue (7): 44-54. DOI: 10.3969/j.issn.2097-0706.2025.07.005

• Game Theory and Power Market Decision-Making •

Coordinated optimization scheduling of integrated energy system based on PER-MADDPG algorithm with two-layer network

CHEN Liang, LIU Guiying*, SU Shiping, TANG Changjiu, WANG Chenhao, GUO Sitong

  1. School of Electrical and Information Engineering, Changsha University of Science and Technology, Changsha 410114, China
  • Received: 2025-02-11  Revised: 2025-03-17  Published: 2025-07-25
  • Corresponding author: LIU Guiying (b. 1964), female, associate professor, M.Sc.; research interests: power system operation and control, grid integration of renewable power generation. Email: 3281437417@qq.com
  • About the authors: CHEN Liang (b. 2000), male, master's student; research interest: power system operation and control. Email: 1130714459@qq.com
    SU Shiping (b. 1963), male, professor, Ph.D.; research interests: energy routers, power quality, smart grids, grid integration of renewable power generation. Email: sushiping@126.com
  • Supported by: Natural Science Foundation of Hunan Province of China (2023JJ40053)


Abstract:

To ensure the economic operation of an integrated energy system (IES), a coordinated optimization scheduling method based on energy routers is proposed to address issues of traditional model-driven scheduling methods, such as the difficulty of solving the optimization scheduling model, slow convergence, and unsatisfactory performance. The IES was divided into three regions by three energy routers (electrical, thermal, and cooling). The energy devices were modeled, and a Markov cooperative game model for IES optimization scheduling was established, forming a framework of centralized training and distributed execution. A multi-agent deep deterministic policy gradient (MADDPG) algorithm based on an improved two-layer Actor-Critic network was adopted, in which the two-layer Critic network evaluated action values to avoid overestimation. In addition, a prioritized experience replay (PER) mechanism was incorporated to improve the utilization of the data in the experience replay pool without reducing its diversity, yielding the proposed PER-MADDPG algorithm. Simulation results showed that, compared with the unimproved algorithm, PER-MADDPG shortened the computation time by 10.13 s and reduced the average daily scheduling cost by 1 638.13 yuan, achieving coordinated optimization scheduling of the IES while ensuring its economic efficiency.
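The abstract does not specify the network internals, but a common realization of a two-layer Critic that curbs overestimation is clipped double-Q learning in the style of TD3: two independent Critic heads are trained, and the Bellman target takes the smaller of the two estimates. The PyTorch sketch below illustrates this reading only; the class name TwinCritic, the hidden sizes, and td_target are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a twin-Critic target, assuming a TD3-style clipped
# double-Q reading of the paper's "two-layer Critic network". Taking the
# minimum of two independent estimates damps action-value overestimation.
import torch
import torch.nn as nn


class TwinCritic(nn.Module):
    """Two independent Critic heads over the joint (state, action) input."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()

        def head() -> nn.Sequential:
            return nn.Sequential(
                nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )

        self.q1, self.q2 = head(), head()

    def forward(self, state: torch.Tensor, action: torch.Tensor):
        x = torch.cat([state, action], dim=-1)
        return self.q1(x), self.q2(x)


@torch.no_grad()
def td_target(target_critic: TwinCritic, reward: torch.Tensor,
              next_state: torch.Tensor, next_action: torch.Tensor,
              done: torch.Tensor, gamma: float = 0.99) -> torch.Tensor:
    """Bellman target built from the smaller of the two target estimates."""
    q1, q2 = target_critic(next_state, next_action)
    return reward + gamma * (1.0 - done) * torch.min(q1, q2)
```

In MADDPG's centralized-training setting, state and action would be the joint observation and joint action of all agents; each agent's Critic regresses onto this target during training, while the Actors are executed in a distributed fashion.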

Key words: integrated energy system, coordinated optimization scheduling, Markov game, energy router, two-layer Actor-Critic network, prioritized experience replay mechanism
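For the prioritized experience replay mechanism named in the keywords, the standard proportional scheme samples transitions with probability proportional to their temporal-difference (TD) error and corrects the induced bias with importance-sampling weights. A self-contained sketch follows; alpha, beta, and the buffer layout are conventional defaults from the PER literature, not values taken from the paper.

```python
# Minimal proportional prioritized experience replay (PER) sketch: sampling
# probability grows with a transition's TD error, raising the reuse of
# informative samples while the pool itself keeps its full diversity.
import numpy as np


class PrioritizedReplay:
    def __init__(self, capacity: int, alpha: float = 0.6, beta: float = 0.4):
        self.capacity, self.alpha, self.beta = capacity, alpha, beta
        self.data: list = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition) -> None:
        # New transitions enter at the current maximum priority so each is
        # replayed at least once before being down-weighted.
        p = self.priorities.max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size: int):
        p = self.priorities[: len(self.data)] ** self.alpha
        probs = p / p.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        # Importance-sampling weights compensate for the non-uniform draw.
        weights = (len(self.data) * probs[idx]) ** (-self.beta)
        weights /= weights.max()
        return [self.data[i] for i in idx], idx, weights

    def update_priorities(self, idx, td_errors) -> None:
        # Store |TD error| plus a small floor so no priority collapses to 0.
        self.priorities[idx] = np.abs(td_errors) + 1e-6
```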
