Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
arXiv (2024)
Abstract
While Transformers have enabled tremendous progress in various application
settings, such architectures still lag behind traditional symbolic planners for
solving complex decision making tasks. In this work, we demonstrate how to
train Transformers to solve complex planning tasks and present Searchformer, a
Transformer model that optimally solves previously unseen Sokoban puzzles 93.7%
of the time, while using up to 26.8% fewer search steps than standard A*
search. Searchformer is an encoder-decoder Transformer model trained to predict
the search dynamics of A*. This model is then fine-tuned via expert
iterations to perform fewer search steps than A* search while still
generating an optimal plan. In our training method, A*'s search dynamics are
expressed as a token sequence outlining when task states are added to and
removed from the search tree during symbolic planning. In our ablation studies on maze
navigation, we find that Searchformer significantly outperforms baselines that
predict the optimal plan directly with a 5-10× smaller model size and a
10× smaller training dataset. We also demonstrate how Searchformer
scales to larger and more complex decision making tasks like Sokoban with
improved percentage of solved tasks and shortened search dynamics.
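
The core ingredient is the token-level encoding of A*'s execution trace. The sketch below is a minimal illustration of this idea, assuming a 4-connected grid world: a standard A* loop that also emits a trace token whenever a state enters the frontier ("create") or is expanded ("close"), together with its cost terms. The exact token format shown here is an assumption for illustration, not necessarily the paper's vocabulary.

```python
import heapq

# Minimal sketch (assumed token format, not necessarily the paper's exact
# vocabulary): run A* on a 0/1 occupancy grid and log its search dynamics
# as a token sequence. "create" marks a state entering the frontier,
# "close" marks an expansion; c0/c1 carry the cost-from-start g and the
# heuristic h, mirroring the abstract's description of recording when
# states are added to and removed from the search tree.

def astar_with_trace(grid, start, goal):
    """Return (optimal_path, trace) for a grid of 0 = free, 1 = wall."""
    def h(p):  # Manhattan distance: admissible on 4-connected grids
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    frontier = [(h(start), 0, start)]          # entries are (g + h, g, state)
    parent, best_g = {start: None}, {start: 0}
    closed = set()
    trace = [f"create {start} c0 0 c1 {h(start)}"]

    while frontier:
        _, g, cur = heapq.heappop(frontier)
        if cur in closed:                      # stale queue entry, skip
            continue
        closed.add(cur)
        trace.append(f"close {cur} c0 {g} c1 {h(cur)}")
        if cur == goal:                        # reconstruct the optimal plan
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1], trace
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dx, cur[1] + dy)
            if (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
                    and grid[nxt[0]][nxt[1]] == 0
                    and g + 1 < best_g.get(nxt, float("inf"))):
                best_g[nxt], parent[nxt] = g + 1, cur
                trace.append(f"create {nxt} c0 {g + 1} c1 {h(nxt)}")
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt))
    return None, trace                         # unreachable goal
```

In this encoding, the trace followed by the plan forms the decoder target. Expert iteration, as the abstract describes it, then amounts to sampling sequences from the trained model, keeping those whose trace is shorter than A*'s but whose plan is still optimal, and fine-tuning on that filtered set.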