CoFRIDA: Self-Supervised Fine-Tuning for Human-Robot Co-Painting
CoRR (2024)
Abstract
Prior robot painting and drawing work, such as FRIDA, has focused on
decreasing the sim-to-real gap and expanding input modalities for users, but
the interaction with these systems generally exists only in the input stages.
To support interactive, human-robot collaborative painting, we introduce the
Collaborative FRIDA (CoFRIDA) robot painting framework, which can co-paint by
modifying and engaging with content already painted by a human collaborator. To
improve text-image alignment, FRIDA's major weakness, our system uses
pre-trained text-to-image models; however, pre-trained models in the context of
real-world co-painting do not perform well because they (1) do not understand
the constraints and abilities of the robot and (2) cannot perform co-painting
without making unrealistic edits to the canvas and overwriting content. We
propose a self-supervised fine-tuning procedure that can tackle both issues,
allowing the use of pre-trained state-of-the-art text-image alignment models
with robots to enable co-painting in the physical world. Our open-source
approach, CoFRIDA, creates paintings and drawings that match the input text
prompt more clearly than FRIDA, both from a blank canvas and from one with
human-created work. More generally, our fine-tuning procedure successfully encodes
the robot's constraints and abilities into a foundation model, showcasing
promising results as an effective method for reducing sim-to-real gaps.