Offline DRL for Price-Based Demand Response: Learning From Suboptimal Data and Beyond

IEEE Transactions on Smart Grid (2024)

Abstract
Demand response providers (DRPs) play a crucial role in retail electricity markets, bridging the gap between the distribution system operator (DSO) and end participants. The DRPs' primary objective is to devise a pricing strategy that maximizes their profits without access to private customer preferences. The vanilla deep reinforcement learning (DRL) paradigm is inapplicable in this context, as it requires iterative data collection through interaction with the environment. To address this challenge, we propose an offline DRL-based approach that enables DRPs to learn pricing strategies from static suboptimal data, without any online interaction. The proposed approach updates the DRPs' Q-values by incorporating an in-distribution behavior decoder and regularization terms that prevent the overestimation caused by out-of-distribution experiences. It is designed to extract better policies from suboptimal data generated by DRPs' rule-based strategies. Case studies demonstrate that the proposed approach significantly improves DRPs' profits and outperforms imitation learning, off-policy DRL, and Bayesian DRL across various environment settings. The approach is also shown to handle different levels of uncertainty in load demands and electricity prices. Finally, it provides better pre-trained weights when transferring from offline to online learning to attain near-optimal strategies.
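The abstract does not spell out the exact update rule, but it describes an offline Q-update on a static batch that combines an in-distribution behavior decoder with a conservative regularizer against out-of-distribution overestimation. Below is a minimal sketch of such an update in the spirit of BCQ-style action filtering plus CQL-style regularization; all names, network sizes, and hyperparameters (PriceQNetwork dimensions, N_ACTIONS, ALPHA, TAU, the batch fields) are illustrative assumptions, not the paper's implementation.

# Hypothetical offline, regularized Q-update for a DRP with discrete price levels.
# All constants and module names below are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_ACTIONS = 10   # discrete retail price levels the DRP may post (assumed)
STATE_DIM = 8    # e.g. aggregate load, wholesale price, time features (assumed)
GAMMA = 0.99
ALPHA = 1.0      # weight of the conservative regularizer (assumed)
TAU = 0.3        # relative-probability threshold of the behavior filter (assumed)

q_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
q_target = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
q_target.load_state_dict(q_net.state_dict())

# Behavior decoder: models the rule-based logging policy pi_beta(a|s),
# fit by maximum likelihood on the static dataset beforehand.
behavior = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))

optim_q = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def offline_q_update(s, a, r, s_next, done):
    """One Q-update from a static (s, a, r, s', done) batch; no environment calls."""
    with torch.no_grad():
        # Mask next-state actions the behavior decoder deems out-of-distribution.
        probs = F.softmax(behavior(s_next), dim=-1)
        in_dist = (probs / probs.max(dim=-1, keepdim=True).values) >= TAU
        q_next = q_target(s_next).masked_fill(~in_dist, float('-inf')).max(dim=-1).values
        target = r + GAMMA * (1.0 - done) * q_next

    q_all = q_net(s)
    q_sa = q_all.gather(1, a.unsqueeze(1)).squeeze(1)
    td_loss = F.mse_loss(q_sa, target)

    # Conservative term: push Q down on unseen actions and up on dataset actions,
    # so out-of-distribution price actions are not overestimated.
    conservative = (torch.logsumexp(q_all, dim=-1) - q_sa).mean()

    loss = td_loss + ALPHA * conservative
    optim_q.zero_grad()
    loss.backward()
    optim_q.step()
    return loss.item()

Repeating this update over minibatches sampled from the logged rule-based data, with periodic target-network syncs, yields a policy learned entirely offline; the resulting weights can then initialize an online fine-tuning phase, as the abstract's offline-to-online transfer result suggests.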
Keywords
Demand response, deep reinforcement learning, offline learning, suboptimal data, uncertainty