DiffCAD: Weakly-Supervised Probabilistic CAD Model Retrieval and Alignment from an RGB Image
CoRR (2023)
Abstract
Perceiving 3D structures from RGB images based on CAD model primitives can
enable an effective, efficient 3D object-based representation of scenes.
However, current approaches rely on supervision from expensive annotations of
CAD models associated with real images, and encounter challenges due to the
inherent ambiguities in the task: both depth-scale ambiguity in monocular
perception and inexact matches of CAD database models to real
observations. We thus propose DiffCAD, the first weakly-supervised
probabilistic approach to CAD retrieval and alignment from an RGB image. We
formulate this as a conditional generative task, leveraging diffusion to learn
implicit probabilistic models capturing the shape, pose, and scale of CAD
objects in an image. This enables multi-hypothesis generation of different
plausible CAD reconstructions, requiring only a few hypotheses to characterize
ambiguities in depth/scale and inexact shape matches. Our approach is trained
only on synthetic data, leveraging monocular depth and mask estimates to enable
robust zero-shot adaptation to various real target domains. Despite being
trained solely on synthetic data, our multi-hypothesis approach can even
surpass the supervised state-of-the-art on the Scan2CAD dataset by 5.9% with 8
hypotheses.
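
The depth-scale ambiguity mentioned above can be illustrated with a minimal sketch. This is not DiffCAD's method, only a demonstration of why a single RGB image cannot fix object scale: under a pinhole camera, an object that is made s times larger and moved s times farther away projects to the same pixels. The function names, the camera intrinsics (focal, cx, cy), and the object coordinates are all assumptions chosen for illustration.

```python
def project(point, focal=500.0, cx=320.0, cy=240.0):
    """Pinhole projection of a camera-space 3D point (x, y, z) to pixel coordinates.

    The intrinsics are hypothetical values, not taken from the paper.
    """
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)


def scale_about_camera(points, s):
    """Scale every point by s about the camera center (changes both size and depth)."""
    return [(s * x, s * y, s * z) for (x, y, z) in points]


# Corners of a hypothetical 0.5 m wide square placed 2 m in front of the camera.
small_near = [
    (-0.25, -0.25, 2.0),
    (0.25, -0.25, 2.0),
    (0.25, 0.25, 2.0),
    (-0.25, 0.25, 2.0),
]

# The same square made twice as large and pushed twice as far away.
large_far = scale_about_camera(small_near, 2.0)

pixels_near = [project(p) for p in small_near]
pixels_far = [project(p) for p in large_far]

# Both configurations land on the same pixels, so the image alone
# cannot tell them apart.
for near, far in zip(pixels_near, pixels_far):
    print(near, far)
```

Because both configurations explain the image equally well, a method that outputs several scale and pose hypotheses, as the abstract describes, can cover this ambiguity where a single deterministic estimate cannot.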