2025, 4, 21-29
Optimal Scheduling of Integrated Energy System Based on Deep Reinforcement Learning
Foundation: Youth Project of the Hebei Natural Science Foundation, "Research on the Interaction Mechanism of Multiple Entities in the Shared Energy Storage Market under the New Power System" (E2024502048)
DOI: 10.19929/j.cnki.nmgdljs.2025.0045
Abstract:

To reduce the number of training rounds required for agent convergence, improve the utilization efficiency of experience samples, and optimize energy scheduling in the integrated energy system (IES), this paper introduces deep reinforcement learning (DRL) and proposes an improved deep deterministic policy gradient (DDPG) algorithm based on multiple environment instances and feature-score experience sampling. First, multiple environment instances enable extensive interaction between the agent and the environment, yielding informative guiding experience. Second, feature quantization is applied to the different types of data, and experience is sampled according to its feature score, which raises sample utilization efficiency. Finally, the improved DDPG algorithm is compared with the classical soft actor-critic (SAC) and twin delayed deep deterministic policy gradient (TD3) algorithms; the experiments confirm that the proposed method improves convergence speed and sample utilization efficiency. Case simulations further verify the performance gain of the model after incremental learning.
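The two mechanisms the abstract describes — several environment instances feeding one replay buffer, and replay sampling weighted by a per-transition "feature score" — can be illustrated with a minimal Python sketch. All names here (`FeatureScoredBuffer`, `feature_score`, the reward-magnitude scoring rule, the round-robin collector) are illustrative assumptions, not the paper's actual implementation, which quantizes several data features rather than reward alone.

```python
# Minimal sketch (assumed, not the paper's code) of:
# (1) a replay buffer whose sampling probability follows a feature score, and
# (2) experience collection over multiple environment instances.
import random
from collections import deque

class FeatureScoredBuffer:
    def __init__(self, capacity=10000):
        self.data = deque(maxlen=capacity)  # stores (transition, score) pairs

    def feature_score(self, transition):
        # Placeholder quantization: score a transition by reward magnitude.
        # The paper derives scores from several quantized data features.
        state, action, reward, next_state = transition
        return abs(reward) + 1e-3  # keep every score strictly positive

    def add(self, transition):
        self.data.append((transition, self.feature_score(transition)))

    def sample(self, batch_size):
        # Draw transitions with probability proportional to their score,
        # so "informative" experience is replayed more often.
        transitions, scores = zip(*self.data)
        k = min(batch_size, len(transitions))
        return random.choices(transitions, weights=scores, k=k)

def collect(envs, policy, buffer, steps=100):
    # Round-robin over several environment instances so the agent gathers
    # more diverse experience per training round than a single instance.
    states = [env.reset() for env in envs]
    for _ in range(steps):
        for i, env in enumerate(envs):
            action = policy(states[i])
            next_state, reward, done = env.step(action)
            buffer.add((states[i], action, reward, next_state))
            states[i] = env.reset() if done else next_state
```

In a full DDPG loop, `policy` would be the actor network plus exploration noise, and each sampled batch would drive one critic/actor update; the score-weighted draw plays the role of the feature-score experience sampling mechanism.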

References

[1] YAN Gangui, CHEN Baihui, WANG Jiaqi, et al. Master-Slave Game Optimal Scheduling Strategy of Integrated Energy System Considering Integrated Demand Response[J]. Journal of Northeast Dianli University, 2024, 44(4): 46-55, 85.

[2] LUO Qingju, ZHU Jizhong. Optimal Dispatch of Integrated Electricity and Gas System Based on Modified Alternating Direction Method of Multipliers[J]. Transactions of China Electrotechnical Society, 2024, 39(9): 2797-2809.

[3] HUSILE, WANG Yuan, YU Yuan, et al. Optimal Dispatching of Integrated Energy System in Industrial Parks Considering Wind and Solar Uncertainty and Flexibility Indicators[J]. Inner Mongolia Electric Power, 2024, 42(1): 15-21.

[4] QIN Xiaodong, SONG Ruijun, WANG An, et al. Research on Adaptive SMPC Scheduling Strategy for Wind-Photovoltaic-Thermal-Storage System[J/OL]. Inner Mongolia Electric Power, 1-8[2025-06-23]. https://link.cnki.net/urlid/15.1200.TM.20250428.1514.002.

[5] JI Xingquan, LIU Jian, YE Pingfeng, et al. Optimal Scheduling of Integrated Energy System Considering Flexibility and Reliability[J]. Automation of Electric Power Systems, 2023, 47(8): 132-144.

[6] Kober J, Peters J. Reinforcement learning in robotics: a survey[J]. International Journal of Robotics Research, 2013, 32(11): 1238-1274.

[7] Ghesu F C, Krubasik E, Georgescu B. Marginal space deep learning: efficient architecture for volumetric image parsing[J]. IEEE Transactions on Medical Imaging, 2016, 35(5): 1217-1228.

[8] Silver D, Huang A, Maddison C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529(7587): 484-489.

[9] HU Hao, ZHAO Changjun, LIU Jing, et al. Network Defense Strategy Optimization Based on Stochastic Gaming and A3C Deep Reinforcement Learning[J]. Journal of Command and Control, 2024, 10(1): 47-58.

[10] WANG Pengcheng. Research on Machine Game with Imperfect Information Based on Deep Reinforcement Learning[D]. Harbin: Harbin Institute of Technology, 2016.

[11] LI Baoshuai, YE Chunming. Job Shop Scheduling Problem Based on Deep Reinforcement Learning[J]. Computer Engineering and Applications, 2021, 57(23): 248-254.

[12] Ipek E, Mutlu O, Martinez J F, et al. Self-optimizing memory controllers: a reinforcement learning approach[J]. ACM SIGARCH Computer Architecture News, 2008, 36(3): 39-50.

[13] TANG Zhentao, SHAO Kun, ZHAO Dongbin, et al. Recent Progress of Deep Reinforcement Learning: From AlphaGo to AlphaGo Zero[J]. Control Theory & Applications, 2017, 34(12): 1529-1546.

[14] CHEN Shi, GUO Zhengwei, ZHOU Buxiang, et al. Coordinated Optimization Scheduling for Active Distribution Networks Considering Schedulable Capacity of Energy Storage for 5G Base Stations[J]. Power System Technology, 2023, 47(12): 5225-5241.

[15] ZHANG Zidong, QIU Caiming, ZHANG Dongxia, et al. A Coordinated Control Method for Hybrid Energy Storage System in Microgrid Based on Deep Reinforcement Learning[J]. Power System Technology, 2019, 43(6): 1914-1921.

[16] WANG Xin, ZHANG Liang, REN Xiaolong, et al. Optimal Scheduling of Integrated Energy Systems by Fusing a Graph Neural Network Model and Reinforcement Learning[J]. Power System Protection and Control, 2023, 51(24): 102-110.

[17] ZHANG Wei, WANG Junyu, YANG Mao, et al. Multi-Time-Scale Optimal Scheduling for Regional Integrated Energy System Based on Distributed Bi-Layer Reinforcement Learning[J]. Transactions of China Electrotechnical Society, 2025, 40(11): 3529-3544.

[18] QIAO Ji, WANG Xinying, ZHANG Qing, et al. Optimal Dispatch of Integrated Electricity-Gas System with Soft Actor-Critic Deep Reinforcement Learning[J]. Proceedings of the CSEE, 2021, 41(3): 819-833.

[19] DONG Lei, LIU Yu, QIAO Ji, et al. Optimal Dispatch of Combined Heat and Power System Based on Multi-Agent Deep Reinforcement Learning[J]. Power System Technology, 2021, 45(12): 4729-4738.

[20] QIU Bin, LIU Yishen, WANG Kai, et al. Master-Slave Game Optimization of Supply and Demand in Community Integrated Energy System Considering Dynamic Carbon Factor and Green Certificate-Carbon Trading[J/OL]. Power Generation Technology, 1-13[2025-06-03]. https://link.cnki.net/urlid/33.1405.TK.20250722.1517.004.

[21] LIU Junfeng, CHEN Jianlong, WANG Xiaosheng, et al. Energy Management and Optimization of Multi-Energy Grid Based on Deep Reinforcement Learning[J]. Power System Technology, 2020, 44(10): 3794-3803.

[22] LI Pu, WANG Yi, ZHANG Bo. Fault Identification Method for Inter-Turn Short Circuit of Power Transformer Based on Deep Learning[J]. Transformer, 2025, 62(6): 41-50.

[23] DENG Ziqi. Reliability Evaluation of Integrated Energy Cyber-Physical System Considering Hydrogen Storage Operational Strategies[J]. Northeast Electric Power Technology, 2025, 46(3): 33-41, 47.

Citation:

[1] LIANG Haifeng, YAN Feng, SHANG Jun, et al. Optimal Scheduling of Integrated Energy System Based on Deep Reinforcement Learning[J]. 2025, 43(4): 21-29. DOI: 10.19929/j.cnki.nmgdljs.2025.0045.
