RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D
CVPR 2024 (2023)
Abstract
Lifting 2D diffusion for 3D generation is a challenging problem due to the
lack of geometric prior and the complex entanglement of materials and lighting
in natural images. Existing methods have shown promise by first creating the
geometry through score-distillation sampling (SDS) applied to rendered surface
normals, followed by appearance modeling. However, relying on a 2D RGB
diffusion model to optimize surface normals is suboptimal due to the
distribution discrepancy between natural images and normal maps, leading to
instability in optimization. In this paper, recognizing that normal and
depth information effectively describe scene geometry and can be automatically
estimated from images, we propose to learn a generalizable Normal-Depth
diffusion model for 3D generation. We achieve this by training on the
large-scale LAION dataset together with the generalizable image-to-depth and
normal prior models. In an attempt to alleviate the mixed illumination effects
in the generated materials, we introduce an albedo diffusion model to impose
data-driven constraints on the albedo component. Our experiments show that when
integrated into existing text-to-3D pipelines, our models significantly enhance
the detail richness, achieving state-of-the-art results. Our project page is
https://lingtengqiu.github.io/RichDreamer/.