Augmented-Ensemble TD3: Overcoming the Shackles of Constant Action Delay
2023 23rd International Conference on Control, Automation and Systems (ICCAS), 2023
Abstract
Reinforcement learning (RL) has advanced significantly in various domains. However, delayed feedback in RL environments poses challenges because it violates the Markov property. In this paper, we propose an approach to address the issues of Markov Decision Processes (MDPs) with delayed feedback. The proposed approach, called “Augmented-Ensemble Twin-Delayed Deep Deterministic Policy Gradient (TD3),” aims to mitigate the performance degradation caused by delayed feedback.
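To illustrate why a constant action delay breaks the Markov property and how state augmentation can restore it, here is a minimal sketch of the standard augmented-state construction: the observation is concatenated with the actions that have been issued but not yet applied. This is not the paper's implementation. The names `ConstantActionDelay` and `IntegratorEnv`, the simplified `reset`/`step` interface, and the zero-filled initial action buffer are all assumptions made for illustration, and the ensemble and TD3 components are not shown.

```python
from collections import deque

import numpy as np


class IntegratorEnv:
    """Hypothetical toy environment: the state x accumulates the applied action."""

    def reset(self):
        self.x = 0.0
        return np.array([self.x], dtype=np.float32)

    def step(self, action):
        self.x += float(action[0])
        reward = -abs(self.x)  # reward is highest when x stays near zero
        return np.array([self.x], dtype=np.float32), reward, False


class ConstantActionDelay:
    """Delays every action by `delay` steps and returns an augmented state.

    The augmented state is [observation, pending actions], which is Markov
    again because it contains every action that will still affect the
    environment. Assumes env.reset() -> obs and env.step(a) -> (obs, r, done).
    """

    def __init__(self, env, delay, action_dim):
        self.env = env
        self.delay = delay
        self.action_dim = action_dim
        self.pending = deque()

    def _augment(self, obs):
        obs = np.asarray(obs, dtype=np.float32).ravel()
        return np.concatenate([obs, *self.pending])

    def reset(self):
        # Assumption: the buffer starts with zero actions (a no-op here).
        self.pending = deque(
            np.zeros(self.action_dim, dtype=np.float32)
            for _ in range(self.delay)
        )
        return self._augment(self.env.reset())

    def step(self, action):
        # Queue the new action and apply the one issued `delay` steps ago.
        self.pending.append(np.asarray(action, dtype=np.float32).ravel())
        applied = self.pending.popleft()
        obs, reward, done = self.env.step(applied)
        return self._augment(obs), reward, done
```

With a delay of `d` steps, the augmented state has dimension `obs_dim + d * action_dim`, so an actor-critic method such as TD3 can in principle be trained on it unchanged, at the cost of a larger input space.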
Key words
Reinforcement learning, Environment with delayed feedback, Augmented approach, Ensemble, TD3