Integrated Intelligent Energy ›› 2026, Vol. 48 ›› Issue (3): 15-26.doi: 10.3969/j.issn.2097-0706.2026.03.002

• Power System Modeling and Control •

Optimized operation of a multi-energy complementary system based on multi-agent reinforcement learning

CHEN Feng1,2, LU Xiaomin3, LI Mengyang4, ZHANG Tao1,2, YANG Fan1,2

  1. School of Applied Engineering, Henan University of Science and Technology, Sanmenxia 472100, China
    2. Henan Nonferrous Metals New Materials Intelligent Manufacturing Application Engineering Research Center, Sanmenxia Polytechnic, Sanmenxia 472099, China
    3. Inspur Data Technology Company Limited, Zhengzhou 450003, China
    4. School of Electrical Engineering and Automation, Luoyang Normal University, Luoyang 471942, China
  • Received: 2025-07-18; Revised: 2025-12-08; Published: 2026-03-25
  • Supported by:
    National Natural Science Foundation of China(62401240); Key Scientific Research Project Plan of Colleges and Universities in Henan Province in 2026(26B480004)

Abstract:

To address the dynamic coordinated optimization challenges of multi-energy complementary systems under high renewable energy penetration, and the limitations of traditional centralized methods in coordinating multi-agent interests and responding in real time, a dynamic optimization modeling study was conducted. A three-layer multi-agent reinforcement learning (MARL) framework, consisting of a physical layer, a decision layer, and a coordination layer, was developed, with energy producers, consumers, and system schedulers modeled as independent agents. Based on an improved proximal policy optimization (PPO) algorithm, a dynamic reward function integrating economic efficiency, environmental friendliness, and stability was designed, and distributed decision-making with global coordination was achieved through a centralized-training, decentralized-execution mechanism. A typical park-level multi-energy complementary system was used as a case study. The results showed that the proposed MARL model increased the renewable energy consumption rate to 92.3% and reduced the unit electricity cost by 28.9% compared with the traditional mixed-integer programming (MIP) method. Under a 50% abrupt load-change scenario, the system recovery time was shortened to 90 s, 900% faster than the MIP method. Even with ±20% wind and solar forecasting errors, the load satisfaction rate remained at 98.7%. The proposed dynamic optimization model effectively addresses the multi-agent coordination and uncertainty adaptation challenges of multi-energy complementary systems, providing technical support for the real-time optimized scheduling of high-penetration renewable energy systems.
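The abstract describes a dynamic reward function weighting economic efficiency, environmental friendliness, and stability. As a minimal sketch of how such a multi-objective reward might be scalarized for a PPO agent, assuming a simple weighted-sum form (the weights, inputs, and function name below are illustrative assumptions, not the paper's actual formulation):

```python
def dynamic_reward(cost, emissions, freq_dev,
                   w_econ=0.5, w_env=0.3, w_stab=0.2):
    """Illustrative scalar reward for one scheduling step.

    cost      -- operating cost of the dispatch decision (economic term)
    emissions -- CO2-equivalent emissions (environmental term)
    freq_dev  -- frequency/voltage deviation magnitude (stability term)

    Lower cost, emissions, and deviation all yield a higher reward,
    so a PPO agent maximizing this reward trades off the three goals
    according to the weights.
    """
    return -(w_econ * cost + w_env * emissions + w_stab * abs(freq_dev))


# A dispatch with lower cost at equal emissions and stability
# should receive a strictly higher reward.
r_cheap = dynamic_reward(cost=5.0, emissions=5.0, freq_dev=0.1)
r_costly = dynamic_reward(cost=10.0, emissions=5.0, freq_dev=0.1)
```

In a "dynamic" variant, the weights themselves could be functions of system state (e.g. raising `w_stab` during a load transient), which is one plausible reading of the abstract's description.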

Key words: multi-energy complementary system, multi-agent reinforcement learning, dynamic optimization modeling, source-grid-load-storage, renewable energy consumption, coordinated scheduling
