Recurrent Action Transformer with Memory
arXiv (2023)
Abstract
Recently, the use of transformers in offline reinforcement learning has
become a rapidly developing area. This is due to their ability to treat the
agent's trajectory in the environment as a sequence, thereby reducing the
policy learning problem to sequence modeling. In environments where the agent's
decisions depend on past events, it is essential to capture both the event
itself and the decision point in the context of the model. However, the
quadratic complexity of the attention mechanism limits the potential for
context expansion. One solution to this problem is to enhance transformers with
memory mechanisms. In this paper, we propose the Recurrent Action Transformer
with Memory (RATE), a model that incorporates recurrent memory. To evaluate
our model, we conducted extensive experiments on memory-intensive
environments (VizDoom-Two-Color, T-Maze) as well as on classic Atari games and
MuJoCo control environments. The results show that the use of memory can significantly
improve performance in memory-intensive environments while maintaining or
improving results in classic environments. We hope that our findings will
stimulate research on memory mechanisms for transformers applicable to offline
reinforcement learning.