Unsupervised Learning for 3D Reconstruction and Blocks World Representation

Semantic Scholar (2019)

Abstract
Recovering the dense 3D structure of a scene from its images has been a long-standing goal in computer vision. Recent years have seen attempts to encode richer priors into geometry-based pipelines through the introduction of learning-based methods. We argue that the form of 3D supervision required by such methods is too onerous and not naturally available, and that it is therefore of both practical and scientific interest to pursue solutions that do not rely on such 3D supervision. In this thesis, we attempt to bridge the worlds of geometric modeling and deep learning: how to use geometric constraints to obtain a supervisory signal for reconstructing and representing the 3D world efficiently. We first present an unsupervised, learning-based approach for 3D reconstruction, built on a novel robust photometric consistency objective, whose output is a 3D point cloud. When trained with our proposed learning objective, deep multi-view stereo models produce significantly better 3D reconstructions. The proposed objective implicitly handles lighting changes and occlusions across multiple views. To represent the reconstructions efficiently, we draw inspiration from Larry Roberts' famous Blocks World of 1965. We introduce a deep learning framework that represents 3D point clouds as an assembly of blocks, yielding a lightweight representation that reduces memory usage by several orders of magnitude. We describe how geometric relationships between points and surfaces, along with physical priors, can be used to provide a supervisory signal for training deep models. We also present a synthetic-to-real transfer learning setup with a differentiable matching loss that facilitates supervised learning of such blocks world representations.