Reinforcement Learning With Converging Goal Space And Binary Reward Function

2020 IEEE 16th International Conference on Automation Science and Engineering (CASE)

Abstract
Usage of a sparse and binary reward function remains one of the most challenging problems in reinforcement learning. In particular, when the environments in which robotic agents learn are sufficiently vast, tasks become much harder to learn because the probability of reaching the goal is minimal. The Hindsight Experience Replay (HER) algorithm was proposed to overcome these difficulties; however, learning is still slowed or delayed when the agent cannot receive proper rewards at the beginning of training. In this paper, we present a simple method called Converging Goal Space and Binary Reward Function, which helps agents learn tasks easily and efficiently in large environments while still providing a binary reward. At an early stage of training, a larger goal-space margin relaxes the reward function, enabling more rapid policy learning. As the number of successes increases, the goal space is gradually reduced to the size used at test time. We apply this reward function to two different task experiments, sliding and throwing, which require exploration over a range wider than the reach of the robotic arm, and compare the learning efficiency with that of experiments employing only a sparse and binary reward function. We show that the proposed reward function performs better in large environments in physics simulation, and we demonstrate that it is applicable to real-world robotic arms.
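The abstract describes the core mechanism: a binary reward judged against a goal-space margin that starts large and converges toward the test-time size as successes accumulate. Below is a minimal Python sketch of that idea, assuming a distance-based success test in the HER style; the class name, margin values, and the success-count shrink schedule are illustrative assumptions, not the paper's actual parameters.

```python
import numpy as np

class ConvergingGoalReward:
    """Binary reward with a goal-space margin that shrinks as successes
    accumulate (sketch of the converging-goal-space idea; hyperparameters
    below are assumed for illustration)."""

    def __init__(self, initial_margin=0.30, final_margin=0.05,
                 shrink_factor=0.9, successes_per_shrink=50):
        self.margin = initial_margin            # current goal-space radius
        self.final_margin = final_margin        # radius used at test time
        self.shrink_factor = shrink_factor      # multiplicative decay per step
        self.successes_per_shrink = successes_per_shrink
        self.success_count = 0

    def reward(self, achieved_goal, desired_goal):
        """Return 0 on success, -1 otherwise (the usual HER convention)."""
        dist = np.linalg.norm(np.asarray(achieved_goal) -
                              np.asarray(desired_goal))
        success = dist <= self.margin
        if success:
            self.success_count += 1
            # Contract the goal space toward the test-time size after
            # every block of successes.
            if self.success_count % self.successes_per_shrink == 0:
                self.margin = max(self.final_margin,
                                  self.margin * self.shrink_factor)
        return 0.0 if success else -1.0
```

The multiplicative decay clamped at `final_margin` is one plausible contraction schedule; the paper may use a different rule for reducing the goal space, but any monotone schedule that ends at the test-time margin fits the description in the abstract.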
Keywords
Hindsight Experience Replay algorithm, learning speed, delayed learning, learning agent, rapid policy learning, goal-space margin, converging goal space, robotic agents, sparse reward function, binary reward function, reinforcement learning