EquivAct: SIM(3)-Equivariant Visuomotor Policies beyond Rigid Object Manipulation
arXiv (2023)
Abstract
If a robot masters folding a kitchen towel, we would expect it to master
folding a large beach towel. However, existing policy learning methods that
rely on data augmentation still don't guarantee such generalization. Our
insight is to add equivariance to both the visual object representation and
policy architecture. We propose EquivAct, which utilizes SIM(3)-equivariant
network structures that guarantee generalization across all possible object
translations, 3D rotations, and scales by construction. EquivAct is trained in
two phases. We first pre-train a SIM(3)-equivariant visual representation on
simulated scene point clouds. Then, we learn a SIM(3)-equivariant visuomotor
policy using a small amount of source task demonstrations. We show that the
learned policy directly transfers to objects that substantially differ from
demonstrations in scale, position, and orientation. We evaluate our method in
three manipulation tasks involving deformable and articulated objects, going
beyond typical rigid object manipulation tasks considered in prior work. We
conduct experiments both in simulation and in reality. For real robot
experiments, our method uses 20 human demonstrations of a tabletop task and
transfers zero-shot to a mobile manipulation task in a much larger setup.
Experiments confirm that our contrastive pre-training procedure and equivariant
architecture offer significant improvements over prior work. Project website:
https://equivact.github.io
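
The equivariance property the abstract relies on can be illustrated with a toy example. The sketch below is not the EquivAct network: it is a hand-written map on point clouds, and every name in it (toy_policy, sim3, equivariance_gap) is hypothetical. It only shows what "equivariant by construction" means: applying a similarity transform (scale, rotation, translation) to the input cloud and then running the map gives the same result as running the map first and transforming its output.

```python
import math

def rotate(p, yaw, roll):
    """Rotate a 3D point about the z-axis by yaw, then about the x-axis by roll."""
    x, y, z = p
    c, s = math.cos(yaw), math.sin(yaw)
    x, y = c * x - s * y, s * x + c * y
    c, s = math.cos(roll), math.sin(roll)
    y, z = c * y - s * z, s * y + c * z
    return (x, y, z)

def sim3(p, scale, yaw, roll, t):
    """Apply a similarity transform: scale * R * p + t."""
    x, y, z = rotate(p, yaw, roll)
    return (scale * x + t[0], scale * y + t[1], scale * z + t[2])

def centroid(cloud):
    """Mean of all points in the cloud."""
    n = len(cloud)
    return tuple(sum(p[i] for p in cloud) / n for i in range(3))

def toy_policy(cloud):
    """Toy equivariant map (an assumption for illustration only):
    return the point of the cloud that lies farthest from its centroid."""
    c = centroid(cloud)
    return max(cloud, key=lambda p: math.dist(p, c))

def equivariance_gap(cloud, scale, yaw, roll, t):
    """Distance between policy(transform(cloud)) and transform(policy(cloud))."""
    moved = [sim3(p, scale, yaw, roll, t) for p in cloud]
    a = toy_policy(moved)
    b = sim3(toy_policy(cloud), scale, yaw, roll, t)
    return math.dist(a, b)
```

For a cloud with a unique farthest point, equivariance_gap is zero up to floating-point error for any positive scale, rotation, and translation, because similarity transforms scale all distances to the centroid by the same factor. A learned policy built from equivariant layers aims for the same property without needing augmented data to approximate it.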