Amodal Ground Truth and Completion in the Wild
CoRR (2023)
Abstract
The problem we study in this paper is amodal image segmentation: predicting
entire object segmentation masks including both visible and invisible
(occluded) parts. In previous work, the amodal segmentation ground truth on
real images is usually produced by manual annotation and is thus subjective. In
contrast, we use 3D data to establish an automatic pipeline to determine
authentic ground truth amodal masks for partially occluded objects in real
images. This pipeline is used to construct an amodal completion evaluation
benchmark, MP3D-Amodal, consisting of a variety of object categories and
labels. To better handle the amodal completion task in the wild, we explore two
architecture variants: a two-stage model that first infers the occluder,
followed by amodal mask completion; and a one-stage model that exploits the
representation power of Stable Diffusion for amodal segmentation across many
categories. Without bells and whistles, our method achieves new
state-of-the-art performance on amodal segmentation datasets that cover a large
variety of objects, including COCOA and our new MP3D-Amodal dataset. The
dataset, model, and code are available at
https://www.robots.ox.ac.uk/~vgg/research/amodal/.
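
To make the idea of deriving amodal ground truth from 3D data concrete, here is a minimal sketch of one plausible approach; it is not necessarily the pipeline used in the paper. It assumes two depth maps rendered from the same camera: one of the object alone and one of the full scene. The function names (`masks_from_depth`, `occlusion_rate`, `mask_iou`), the tolerance `eps`, and the toy data are all hypothetical. IoU is included only as a common way to compare masks; the abstract does not name its evaluation metric.

```python
import numpy as np


def masks_from_depth(object_depth, scene_depth, eps=1e-3):
    """Derive amodal and visible masks from two depth maps (assumed inputs).

    object_depth: depth of the object rendered alone, np.inf where absent.
    scene_depth:  depth of the full scene from the same camera.
    """
    # The amodal mask is the object's full projected footprint.
    amodal = np.isfinite(object_depth)
    # A pixel is visible if nothing in the scene lies in front of the object.
    visible = amodal & (object_depth <= scene_depth + eps)
    return amodal, visible


def occlusion_rate(amodal, visible):
    """Fraction of the amodal mask that is hidden by occluders."""
    area = amodal.sum()
    return 0.0 if area == 0 else 1.0 - visible.sum() / area


def mask_iou(pred, gt):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(pred, gt).sum()
    return 1.0 if union == 0 else np.logical_and(pred, gt).sum() / union


# Toy 4x4 scene: a 2x3 object at depth 2.0, background at depth 5.0,
# and an occluder at depth 1.0 covering the right-most object column.
object_depth = np.full((4, 4), np.inf)
object_depth[1:3, 0:3] = 2.0

scene_depth = np.full((4, 4), 5.0)
scene_depth[1:3, 0:3] = 2.0
scene_depth[:, 2] = 1.0

amodal, visible = masks_from_depth(object_depth, scene_depth)
rate = occlusion_rate(amodal, visible)
iou_visible_vs_amodal = mask_iou(visible, amodal)
```

In this toy scene the amodal mask covers 6 pixels and the visible mask 4, so one third of the object is occluded. An amodal completion model would take the image and the visible mask as input, and its prediction would be scored against the amodal mask.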