Density estimation based soft actor-critic: deep reinforcement learning for static output feedback control with measurement noise

ADVANCED ROBOTICS (2024)

Abstract
State-of-the-art deep reinforcement learning (DRL) methods, including Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3), Proximal Policy Optimization (PPO), and Soft Actor-Critic (SAC), demonstrate significant capability in solving the optimal static state feedback control (SSFC) problem, which can be modeled as a fully observed Markov decision process (MDP). However, the optimal static output feedback control (SOFC) problem with measurement noise is a typical partially observable MDP (POMDP), which is difficult to solve, especially in high-dimensional continuous state-action-observation spaces. This paper proposes a two-stage framework to address this challenge. In the laboratory stage, both the states and the noisy outputs are observable; the SOFC policy is converted to a constrained stochastic SSFC policy whose probability density function generally has no analytical form. To this end, a density estimation based SAC algorithm is proposed to explore the optimal SOFC policy by learning the optimal constrained stochastic SSFC policy. Consequently, in the real-world stage, only the noisy outputs and the learned SOFC policy are required to solve the optimal SOFC problem. Numerical simulations and corresponding experiments with robotic arms are provided to illustrate the effectiveness of our method. The code is available at https://github.com/RanKyoto/DE-SAC.
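Since the constrained stochastic SSFC policy's density has no analytical form, SAC's entropy-regularized objective needs an estimated log-probability. The following is a minimal, hypothetical sketch (not the authors' implementation; see the repository above) of one way to supply that term, using a Gaussian kernel density estimate over actions resampled at the same state:

```python
# Hypothetical sketch: approximate log pi(a | x) with a Gaussian kernel
# density estimate (KDE) when the policy density is not analytical.
# Function names, shapes, and the fixed bandwidth are illustrative
# assumptions, not the paper's exact algorithm.
import torch

def kde_log_prob(action, sampled_actions, bandwidth=0.1):
    """Estimate log pi(a | x) from actions sampled at the same state x.

    action:          (action_dim,)    query action
    sampled_actions: (n, action_dim)  actions drawn by replaying the
                                      output-feedback policy on noisy
                                      measurements of the same state
    """
    n, d = sampled_actions.shape
    diff = (action - sampled_actions) / bandwidth            # (n, d)
    log_kernels = -0.5 * (diff ** 2).sum(dim=1)              # Gaussian kernels
    log_norm = d * torch.log(torch.tensor(bandwidth)) \
               + 0.5 * d * torch.log(torch.tensor(2 * torch.pi))
    # log of the averaged kernel values via log-sum-exp
    return (torch.logsumexp(log_kernels, dim=0)
            - torch.log(torch.tensor(float(n))) - log_norm)
```

In such a scheme, the estimated log-density would replace the closed-form Gaussian log-probability that standard SAC uses in its entropy term.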
Keywords
Deep reinforcement learning, static output feedback control, density estimation, soft actor-critic, robotic arm