
Hierarchical reinforcement learning with unlimited option scheduling for sparse rewards in continuous spaces

Expert Systems with Applications (2024)

Abstract
The fundamental concept behind option-based hierarchical reinforcement learning (O-HRL) is to obtain temporally coarse-grained actions and to abstract complex situations. Although O-HRL is intended for sparse rewards, extending it to sparse-reward problems in continuous spaces remains difficult. In this paper, we provide a fresh perspective on the option mechanism, interpreting different options in terms of knowledge representation, and propose hierarchical reinforcement learning with unlimited option scheduling (UOS). Unlike conventional O-HRL algorithms, which use a limited set of options with specific meanings, UOS encourages an unlimited number of options that correlate with trajectories while remaining correlated with one another, thereby representing richer knowledge. These unlimited options can guide infinite, diverse trajectories that cover fine-grained state spaces. Further, a composite scheduling mode is proposed to generate arbitrary-length trajectories with intrinsic characteristics, providing both flexibility and concentration for the unlimited options and significantly improving the performance and robustness of UOS. Finally, a new comprehensive experimental system is developed; the experimental results demonstrate the notable success of UOS on sparse-reward tasks in continuous spaces and identify the root cause of its superiority from the perspective of knowledge representation.
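To make the two ideas in the abstract concrete, the sketch below illustrates (a) options drawn from a continuous latent space rather than a small discrete set, and (b) a scheduler that lets each option persist for a variable number of steps, producing arbitrary-length trajectory segments. This is a minimal toy illustration, not the paper's UOS algorithm: the environment, the linear low-level controller `W_low`, and the uniform segment-length sampling are all hypothetical stand-ins for learned components.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, ACTION_DIM, OPTION_DIM = 4, 2, 3

# Hypothetical fixed parameters standing in for a learned low-level network.
W_low = rng.normal(size=(ACTION_DIM, STATE_DIM + OPTION_DIM))

def high_level_policy(state):
    """Sample an option vector from a continuous latent space
    ("unlimited options", as opposed to a fixed discrete set)."""
    return rng.normal(size=OPTION_DIM)

def low_level_policy(state, option):
    """Primitive action conditioned on both the state and the active option."""
    return np.tanh(W_low @ np.concatenate([state, option]))

def env_step(state, action):
    """Toy point-mass dynamics with a sparse reward near the origin."""
    next_state = state.copy()
    next_state[:ACTION_DIM] += 0.1 * action
    reward = 1.0 if np.linalg.norm(next_state[:ACTION_DIM]) < 0.05 else 0.0
    return next_state, reward

def rollout(horizon=50, min_len=3, max_len=10):
    """Variable-length scheduling: each sampled option persists for a
    sampled number of steps before a new option is drawn."""
    state = rng.normal(size=STATE_DIM)
    total_reward, t = 0.0, 0
    while t < horizon:
        option = high_level_policy(state)
        length = int(rng.integers(min_len, max_len + 1))  # segment length
        for _ in range(min(length, horizon - t)):
            action = low_level_policy(state, option)
            state, r = env_step(state, action)
            total_reward += r
            t += 1
    return total_reward
```

In a trained agent both policies would be neural networks optimized end-to-end; here the point is only the control flow: the high level commits to a continuous option, the low level acts under it for a variable-length segment, and the sparse reward arrives only when the goal region is reached.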
Keywords
Hierarchical reinforcement learning, Temporal abstraction, Sparse reward, Option scheduling, Knowledge representation