ToddlerDiffusion: Flash Interpretable Controllable Diffusion Model
CoRR (2023)
Abstract
Diffusion-based generative models excel in perceptually impressive synthesis
but face challenges in interpretability. This paper introduces
ToddlerDiffusion, an interpretable 2D diffusion image-synthesis framework
inspired by the human generative process. Unlike traditional diffusion models
with opaque denoising steps, our approach decomposes the generation process
into simpler, interpretable stages: generating contours, a palette, and a
detailed colored image. This decomposition not only enhances overall
performance but also enables robust editing and interaction capabilities. Each
stage is meticulously formulated for efficiency and accuracy, surpassing
Stable Diffusion (LDM).
Extensive experiments on the LSUN-Churches and COCO datasets validate our
approach, which consistently outperforms existing methods. ToddlerDiffusion is
notably efficient, matching LDM performance on LSUN-Churches while running
three times faster with a 3.76 times smaller architecture. Our source code is
provided in the supplementary material and will be publicly accessible.
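
The staged decomposition described in the abstract can be pictured with a
short sketch. The code below is a hypothetical illustration under our own
assumptions, not the authors' implementation: the StageDenoiser module, the
channel counts, and the simplified refinement loop (no noise schedule,
untrained weights) are all invented for exposition. It shows only the cascade
structure, with each stage (contours, then palette, then detailed image)
conditioned on the previous stage's output.

import torch
import torch.nn as nn

class StageDenoiser(nn.Module):
    """Toy stand-in for one stage's denoising network (illustrative only)."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, out_ch, 3, padding=1),
        )

    def forward(self, x, cond=None):
        # Condition on the previous stage via channel concatenation.
        if cond is not None:
            x = torch.cat([x, cond], dim=1)
        return self.net(x)

@torch.no_grad()
def sample_stage(denoiser, shape, cond=None, steps=10):
    # Heavily simplified iterative refinement; a real diffusion sampler
    # would follow a proper noise schedule with trained weights.
    x = torch.randn(shape)
    for _ in range(steps):
        x = denoiser(x, cond)
    return x

# Cascade: contours (1 channel) -> palette (3) -> detailed image (3).
contour_net = StageDenoiser(in_ch=1, out_ch=1)
palette_net = StageDenoiser(in_ch=3 + 1, out_ch=3)
detail_net  = StageDenoiser(in_ch=3 + 3 + 1, out_ch=3)

contours = sample_stage(contour_net, (1, 1, 64, 64))
palette  = sample_stage(palette_net, (1, 3, 64, 64), cond=contours)
image    = sample_stage(detail_net, (1, 3, 64, 64),
                        cond=torch.cat([palette, contours], dim=1))

Because each intermediate output (contours, palette) is an explicit
image-like tensor, a user could in principle edit it before the next stage
runs, which is the kind of interaction capability the abstract highlights.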