Scaling Population-Based Reinforcement Learning with GPU Accelerated Simulation
arXiv (2024)
Abstract
In recent years, deep reinforcement learning (RL) has shown its effectiveness
in solving complex continuous control tasks like locomotion and dexterous
manipulation. However, this comes at the cost of an enormous amount of
experience required for training, exacerbated by the sensitivity of learning
efficiency and the policy performance to hyperparameter selection, which often
requires numerous trials of time-consuming experiments. This work introduces a
Population-Based Reinforcement Learning (PBRL) approach that exploits a
GPU-accelerated physics simulator to enhance the exploration capabilities of RL
by concurrently training multiple policies in parallel. The PBRL framework is
applied to three state-of-the-art RL algorithms -- PPO, SAC, and DDPG --
dynamically adjusting hyperparameters based on the performance of learning
agents. The experiments are performed on four challenging tasks in Isaac Gym --
Anymal Terrain, Shadow Hand, Humanoid, and Franka Nut Pick -- by analyzing the
effect of population size and mutation mechanisms for hyperparameters. The
results show that PBRL agents achieve superior performance, in terms of
cumulative reward, compared to non-evolutionary baseline agents. The trained
agents are finally deployed in the real world for a Franka Nut Pick task,
demonstrating successful sim-to-real transfer. Code and videos of the learned
policies are available on our project website.
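
The idea of adjusting hyperparameters based on agent performance can be illustrated with a minimal sketch of one generic population-based exploit/explore step. This is not the authors' implementation: the `Agent` class, the `pbt_step` function, the bottom fraction of 25%, and the perturbation factors 0.8 and 1.2 are all assumptions chosen for illustration, and the abstract does not specify which mutation mechanisms the paper actually uses.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical stand-in for one policy in the population."""
    hyperparams: dict      # e.g. {"lr": 3e-4}; names are illustrative
    reward: float = 0.0    # cumulative reward from the latest evaluation


def pbt_step(population, rng, bottom_frac=0.25, perturb=(0.8, 1.2)):
    """One exploit/explore step over the population (assumed scheme).

    The worst-performing agents copy the hyperparameters of a randomly
    chosen top agent (exploit) and scale each value by a random factor
    (explore). A full implementation would also copy network weights.
    """
    if len(population) < 2:
        return population
    ranked = sorted(population, key=lambda a: a.reward, reverse=True)
    # Cap k at half the population so top and bottom never overlap.
    k = min(max(1, int(len(ranked) * bottom_frac)), len(ranked) // 2)
    top, bottom = ranked[:k], ranked[-k:]
    for agent in bottom:
        parent = rng.choice(top)
        agent.hyperparams = {
            name: value * rng.choice(perturb)
            for name, value in parent.hyperparams.items()
        }
    return population
```

In a GPU-accelerated setting, a step like this would presumably run periodically between rollouts, with every agent in the population training in parallel on its own share of simulated environments.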