CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph Diffusion
NeurIPS (2023)
Abstract
Controllable scene synthesis aims to create interactive environments for
various industrial use cases. Scene graphs provide a highly suitable interface
to facilitate these applications by abstracting the scene context in a compact
manner. Existing methods, reliant on retrieval from extensive databases or
pre-trained shape embeddings, often overlook scene-object and object-object
relationships, leading to inconsistent results due to their limited generation
capacity. To address this issue, we present CommonScenes, a fully generative
model that converts scene graphs into corresponding controllable 3D scenes,
which are semantically realistic and conform to commonsense. Our pipeline
consists of two branches, one predicting the overall scene layout via a
variational auto-encoder and the other generating compatible shapes via latent
diffusion, capturing global scene-object and local inter-object relationships
in the scene graph while preserving shape diversity. The generated scenes can
be manipulated by editing the input scene graph and sampling the noise in the
diffusion model. Because no existing scene graph dataset offers high-quality
object-level meshes with relations, we also construct SG-FRONT, which enriches
the off-the-shelf indoor dataset 3D-FRONT with additional scene graph labels.
Extensive experiments are conducted on SG-FRONT where CommonScenes shows clear
advantages over other methods regarding generation consistency, quality, and
diversity. The code and the dataset will be released upon acceptance.
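
To make the input interface concrete, the following is a minimal sketch of a scene graph as a data structure: nodes are object categories and edges are (subject, relation, object) triples. The class name `SceneGraph`, its methods, and the relation labels used here are hypothetical illustrations, not the CommonScenes code or the SG-FRONT label set. The sketch only shows how editing the graph changes the specification that a generative model would consume; it does not implement the layout or shape branches.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Toy scene graph (hypothetical): nodes are object categories,
    edges are (subject index, relation label, object index) triples."""

    nodes: list[str] = field(default_factory=list)
    edges: list[tuple[int, str, int]] = field(default_factory=list)

    def add_object(self, category: str) -> int:
        """Append a node and return its index."""
        self.nodes.append(category)
        return len(self.nodes) - 1

    def relate(self, subj: int, relation: str, obj: int) -> None:
        """Add a directed, labeled edge between two existing nodes."""
        if not (0 <= subj < len(self.nodes) and 0 <= obj < len(self.nodes)):
            raise IndexError("edge endpoints must be existing nodes")
        self.edges.append((subj, relation, obj))

    def triples(self) -> list[tuple[str, str, str]]:
        """Return edges with node indices resolved to category names."""
        return [(self.nodes[s], r, self.nodes[o]) for s, r, o in self.edges]


# Build a small bedroom graph (relation labels are illustrative assumptions).
graph = SceneGraph()
bed = graph.add_object("bed")
nightstand = graph.add_object("nightstand")
graph.relate(nightstand, "left of", bed)

# Editing the graph: add a lamp and constrain it relative to the nightstand.
lamp = graph.add_object("lamp")
graph.relate(lamp, "standing on", nightstand)

for subject, relation, obj in graph.triples():
    print(subject, relation, obj)
```

In a pipeline of the kind described above, an edit like the added lamp would change the conditioning for both the layout and the shape generation, while resampling the diffusion noise would vary the shapes under an unchanged graph.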
Keywords
commonscenes, indoor