FreeReg: Image-to-Point Cloud Registration Leveraging Pretrained Diffusion Models and Monocular Depth Estimators
arXiv (2023)
Abstract
Matching cross-modality features between images and point clouds is a
fundamental problem for image-to-point cloud registration. However, due to the
modality difference between images and points, it is difficult to learn robust
and discriminative cross-modality features by existing metric learning methods
for feature matching. Instead of applying metric learning to cross-modality
data, we propose to first unify the modality of images and point clouds using
pretrained large-scale models, and then establish robust correspondences
within the same modality. We show that the intermediate features, called
diffusion features, extracted by depth-to-image diffusion models are
semantically consistent between images and point clouds, which enables the
building of coarse but robust cross-modality correspondences. We further
extract geometric features on depth maps produced by the monocular depth
estimator. By matching such geometric features, we significantly improve the
accuracy of the coarse correspondences produced by diffusion features.
Extensive experiments demonstrate that without any task-specific training,
direct utilization of both features produces accurate image-to-point cloud
registration. On three public indoor and outdoor benchmarks, the proposed
method achieves, on average, a 20.6 percent improvement in Inlier Ratio, a
three-fold higher Inlier Number, and a 48.6 percent improvement in Registration
Recall over existing state-of-the-art methods.
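
Once both inputs are described by features in the same modality, establishing correspondences reduces to a nearest-neighbor search in feature space. The sketch below is a minimal illustration of that idea, not the paper's implementation: the abstract does not specify the matching rule, so mutual nearest neighbors under cosine similarity is an assumption here, and the function name `mutual_nearest_neighbors` and the random arrays standing in for per-pixel or per-point descriptors are hypothetical.

```python
import numpy as np


def mutual_nearest_neighbors(feat_a, feat_b):
    """Return index pairs (i, j) where feat_a[i] and feat_b[j] are each
    other's nearest neighbor under cosine similarity.

    Assumption: mutual nearest-neighbor matching is used here only for
    illustration; the source abstract does not state the matching rule.
    """
    # L2-normalize rows so the dot product equals cosine similarity.
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    sim = a @ b.T                      # (Na, Nb) similarity matrix

    nn_ab = sim.argmax(axis=1)         # best match in b for each row of a
    nn_ba = sim.argmax(axis=0)         # best match in a for each row of b

    idx_a = np.arange(a.shape[0])
    keep = nn_ba[nn_ab] == idx_a       # keep only pairs that agree both ways
    return np.stack([idx_a[keep], nn_ab[keep]], axis=1)


# Toy data: the "point cloud" descriptors are a shuffled, slightly noisy
# copy of the "image" descriptors (hypothetical stand-ins for real features).
rng = np.random.default_rng(0)
image_feats = rng.normal(size=(5, 16))
perm = np.array([2, 0, 4, 1, 3])
cloud_feats = image_feats[perm] + 0.01 * rng.normal(size=(5, 16))

matches = mutual_nearest_neighbors(image_feats, cloud_feats)
print(matches)  # each row is (image index, point cloud index)
```

In a registration pipeline, the resulting 2D-3D index pairs would typically be passed to a robust pose solver; the mutual check is a cheap way to discard one-sided matches before that step.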