Hedge Your Actions: Flexible Reinforcement Learning for Complex Action Spaces

Norio Kosaka, Ayush Jain, Xinhu Li, Kyung-Min Kim, Joseph J Lim

ICLR 2023 (2023)

Abstract

Real-world decision-making often involves large and complex action representations, which may even be ill-suited to the task. For instance, the items in recommender systems have generic representations that apply to each user differently, and the actuators of a household robot can be high-dimensional and noisy. Prior works in discrete and continuous action space reinforcement learning (RL) define a retrieval-selection framework to deal with problems of scale. The retrieval agent outputs a point in the space of action representations, which is used to retrieve a few action samples for a selection critic to evaluate. However, learning such retrieval actors becomes increasingly inefficient as the complexity of the action space rises. Thus, we propose to treat the retrieval task as one of listwise RL: the agent proposes a list of action samples that enables the selection phase to maximize the environment reward. By hedging its action proposals, we show that our agent is more flexible and sample efficient than conventional approaches while learning under a complex action space. Results are also presented at https://sites.google.com/view/complexaction.
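
The retrieval-selection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `retrieve_then_select`, the nearest-neighbor retrieval rule, the parameter `k`, and the toy critic `q_fn` are all assumptions made for illustration. Passing one query mimics a single retrieval actor; passing several queries mimics a list of hedged proposals.

```python
import numpy as np

def retrieve_then_select(queries, action_embeddings, q_fn, k=3):
    """Hypothetical retrieval-selection step (illustrative only).

    queries:           (d,) or (m, d) array of points proposed in the
                       action-representation space.
    action_embeddings: (n, d) array, one row per available action.
    q_fn:              stand-in critic mapping an embedding to a score.
    Returns (index of the chosen action, sorted candidate indices).
    """
    queries = np.atleast_2d(queries)
    candidates = set()
    for query in queries:
        # Retrieval: keep the k actions closest to this query.
        dists = np.linalg.norm(action_embeddings - query, axis=1)
        candidates.update(np.argsort(dists)[:k].tolist())
    candidates = sorted(candidates)
    # Selection: the critic scores each candidate and the best one wins.
    scores = [q_fn(action_embeddings[i]) for i in candidates]
    return candidates[int(np.argmax(scores))], candidates

# Toy setup (assumed): four actions in 2-D, critic prefers actions near (5, 5).
embeddings = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
critic = lambda a: -float(np.sum((a - np.array([5.0, 5.0])) ** 2))

single = retrieve_then_select(np.array([0.2, 0.1]), embeddings, critic, k=2)
hedged = retrieve_then_select(
    np.array([[0.2, 0.1], [4.0, 4.5]]), embeddings, critic, k=1
)
```

In this toy case the single query only surfaces actions near the origin, whereas the two-query list also surfaces the action the critic prefers, which is the intuition behind hedging proposals. How the list is actually learned is not specified in the abstract and is not modeled here.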

Keywords

Efficient Reinforcement Learning, Large Action Space, Listwise Action Retrieval
