EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents
CoRR (2024)
Abstract
Recent SOTA approaches for embodied learning via interaction directly employ
large language models (LLMs) as agents to determine the next steps in an
environment. Due to their world knowledge and reasoning capabilities, LLM
agents achieve stronger performance than previous smaller agents based on
reinforcement learning (RL); however, frequently calling LLMs is slow and
expensive. Instead of directly employing LLMs as agents, can we use LLMs'
reasoning capabilities to adaptively create training environments to help
smaller embodied RL agents learn useful skills that they are weak at? We
propose EnvGen, a novel framework to address this question. First, we prompt an
LLM to generate training environments that allow agents to quickly learn
different tasks in parallel. Concretely, the LLM is given the task description
and simulator objectives that the agents should learn and is then asked to
generate a set of environment configurations (e.g., different terrains, items
given to agents, etc.). Next, we train a small RL agent in a mixture of the
original and LLM-generated environments. Then, we enable the LLM to
continuously adapt the generated environments to progressively improve the
skills that the agent is weak at, by providing feedback to the LLM in the form
of the agent's performance. We demonstrate the usefulness of EnvGen with
comprehensive experiments in Crafter and Heist environments. We find that a
small RL agent trained with EnvGen can outperform SOTA methods, including a
GPT-4 agent, and learns long-horizon tasks significantly faster. We show
qualitatively how the LLM adapts training environments to help improve RL
agents' weaker skills over time. Additionally, EnvGen is substantially more
efficient as it only uses a small number of LLM calls (e.g., 4 in total),
whereas LLM agents require thousands of LLM calls. Lastly, we present detailed
ablation studies for our design choices.
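The abstract describes an iterative loop: the LLM is called a small number of times (e.g., 4 in total) to generate environment configurations, a small RL agent is trained in a mixture of original and generated environments, and the agent's per-skill performance is fed back so the LLM can target weaker skills. Below is a minimal, self-contained sketch of that loop; the skill names, config fields, and the stubbed `llm_generate_configs` / `train_and_evaluate` functions are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical skill set for a Crafter-like simulator (names are
# illustrative, not the paper's exact achievement list).
SKILLS = ["collect_wood", "make_pickaxe", "defeat_zombie", "collect_diamond"]


def llm_generate_configs(feedback, n_envs=4):
    """Stub for the LLM call: given per-skill success rates, return
    environment configs biased toward the agent's weakest skills."""
    weakest = sorted(feedback, key=feedback.get)[:2]
    return [{"focus_skill": s, "seed": i}
            for i, s in enumerate(weakest * (n_envs // 2))]


def train_and_evaluate(configs, feedback):
    """Stub for RL training in a mixture of original and LLM-generated
    environments; here we only simulate faster improvement on the
    skills the generated environments focus on."""
    for cfg in configs:
        s = cfg["focus_skill"]
        feedback[s] = min(1.0, feedback[s] + 0.2)
    # Small uniform gain from continued training in the original environment.
    return {s: min(1.0, v + 0.05) for s, v in feedback.items()}


def envgen_loop(n_cycles=4):
    """Run the EnvGen-style cycle: one LLM call per cycle, so the total
    LLM-call budget equals n_cycles (e.g., 4, per the abstract)."""
    feedback = {s: 0.1 for s in SKILLS}  # initially weak on all skills
    llm_calls = 0
    for _ in range(n_cycles):
        configs = llm_generate_configs(feedback)  # one LLM call per cycle
        llm_calls += 1
        feedback = train_and_evaluate(configs, feedback)
    return llm_calls, feedback
```

Note the contrast the abstract draws: an LLM *agent* would issue an LLM call at every environment step (thousands of calls), whereas this loop issues one call per training cycle, independent of episode length.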