EG4D: Explicit Generation of 4D Object without Score Distillation
CoRR(2024)
Abstract
In recent years, the increasing demand for dynamic 3D assets in design and
gaming applications has given rise to powerful generative pipelines capable of
synthesizing high-quality 4D objects. Previous methods generally rely on the score
distillation sampling (SDS) algorithm to infer the unseen views and motion of
4D objects, which leads to unsatisfactory results with defects such as
over-saturation and the Janus problem. Therefore, inspired by recent progress in
video diffusion models, we propose to optimize a 4D representation by
explicitly generating multi-view videos from one input image. However, it is
far from trivial to handle practical challenges faced by such a pipeline,
including dramatic temporal inconsistency, inter-frame geometry and texture
diversity, and semantic defects brought by video generation results. To address
these issues, we propose EG4D, a novel multi-stage framework that generates
high-quality and consistent 4D assets without score distillation. Specifically,
collaborative techniques and solutions are developed, including an attention
injection strategy to synthesize temporally consistent multi-view videos, a
robust and efficient dynamic reconstruction method based on Gaussian Splatting,
and a refinement stage with diffusion prior for semantic restoration. The
qualitative results and user preference study demonstrate that our framework
outperforms the baselines in generation quality by a considerable margin. Code
will be released at .