AnySkill: Learning Open-Vocabulary Physical Skills for Interactive Agents
CVPR 2024
Abstract
Traditional approaches in physics-based motion generation, centered around
imitation learning and reward shaping, often struggle to adapt to new
scenarios. To tackle this limitation, we propose AnySkill, a novel hierarchical
method that learns physically plausible interactions following open-vocabulary
instructions. Our approach begins by developing a set of atomic actions via a
low-level controller trained with imitation learning. Upon receiving an
open-vocabulary textual instruction, AnySkill employs a high-level policy that
selects and integrates these atomic actions to maximize the CLIP similarity
between the agent's rendered images and the text. An important feature of our
method is the use of image-based rewards for the high-level policy, which
allows the agent to learn interactions with objects without manual reward
engineering. We demonstrate AnySkill's capability to generate realistic and
natural motion sequences in response to unseen instructions of varying lengths,
marking it the first method capable of open-vocabulary physical skill learning
for interactive humanoid agents.
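The image-based reward at the core of the high-level policy can be pictured as a CLIP cosine similarity between the agent's rendered frame and the open-vocabulary instruction. The sketch below is an illustrative approximation rather than the authors' released code; the choice of the OpenAI `clip` package and the ViT-B/32 checkpoint are assumptions for the example.

```python
# Minimal sketch (assumed setup, not the AnySkill implementation) of a
# CLIP-similarity reward: cosine similarity between a rendered agent frame
# and the textual instruction. Requires `pip install torch clip pillow`.
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

@torch.no_grad()
def clip_reward(rendered_frame: Image.Image, instruction: str) -> float:
    """Return the CLIP cosine similarity used as an image-based reward."""
    image = preprocess(rendered_frame).unsqueeze(0).to(device)
    text = clip.tokenize([instruction]).to(device)
    img_feat = model.encode_image(image)
    txt_feat = model.encode_text(text)
    # Normalize both embeddings so the dot product is a cosine similarity.
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
    return (img_feat @ txt_feat.T).item()
```

In a training loop, such a scalar would be queried on frames rendered from the simulator and used as the reward signal for the high-level policy, avoiding hand-designed rewards for each object interaction.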