Gaga: Group Any Gaussians via 3D-aware Memory Bank
arXiv (2024)
Abstract
We introduce Gaga, a framework that reconstructs and segments open-world 3D
scenes by leveraging inconsistent 2D masks predicted by zero-shot segmentation
models. In contrast to prior 3D scene segmentation approaches that heavily rely
on video object tracking, Gaga utilizes spatial information and effectively
associates object masks across diverse camera poses. Because it does not assume
continuous view changes in the training images, Gaga is robust to variations in
camera pose, which particularly benefits sparsely sampled images and ensures
precise mask label consistency. Furthermore, Gaga
accommodates 2D segmentation masks from diverse sources and demonstrates robust
performance with different open-world zero-shot segmentation models, enhancing
its versatility. Extensive qualitative and quantitative evaluations demonstrate
that Gaga performs favorably against state-of-the-art methods, emphasizing its
potential for real-world applications such as scene understanding and
manipulation.