QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding
CoRR (2024)
Abstract
Understanding the structural organisation of 3D indoor scenes in terms of
rooms is often accomplished via floorplan extraction. Robotic tasks such as
planning and navigation require a semantic understanding of the scene as well.
This is typically achieved via object-level semantic segmentation. However,
such methods struggle to segment out topological regions like "kitchen" in the
scene. In this work, we introduce a two-step pipeline. First, we extract a
topological map, i.e., floorplan of the indoor scene using a novel
multi-channel occupancy representation. Then, we generate CLIP-aligned features
and semantic labels for every room instance based on the objects it contains
using a self-attention transformer. Our language-topology alignment supports
natural language querying, e.g., a "place to cook" locates the "kitchen". We
outperform the current state-of-the-art on room segmentation by 20% and on room
classification by 12%, and
provide insights into the problem of joint structural and semantic 3D scene
understanding.
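
The natural language querying mentioned in the abstract can be illustrated with a minimal sketch. It assumes each room instance already has a feature vector in a shared text-aligned embedding space, and that a query such as "place to cook" has been embedded into the same space. The function names, room identifiers, and the toy 3-dimensional vectors below are hypothetical stand-ins for illustration, not the authors' implementation or real CLIP features; retrieval here is plain cosine-similarity ranking.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def query_rooms(query_embedding, room_embeddings):
    """Return (room_id, score) pairs sorted by descending similarity."""
    scored = [
        (room_id, cosine_similarity(query_embedding, embedding))
        for room_id, embedding in room_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors standing in for per-room, text-aligned features.
room_embeddings = {
    "room_0": [0.9, 0.1, 0.0],  # imagine: a room containing a stove and a sink
    "room_1": [0.1, 0.9, 0.1],  # imagine: a room containing a bed
    "room_2": [0.0, 0.2, 0.9],  # imagine: a room containing a bathtub
}

# Hypothetical stand-in for the text embedding of "place to cook".
query_embedding = [0.8, 0.2, 0.1]

ranking = query_rooms(query_embedding, room_embeddings)
best_room = ranking[0][0]
print(best_room)
```

In a real system the toy vectors would be replaced by the outputs of a text encoder and of the room-level feature model; the ranking step itself would stay the same.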