Variational Diffusion Auto-encoder: Latent Space Extraction from Pre-trained Diffusion Models

arXiv (Cornell University), 2023

Abstract
As a widely recognized approach to deep generative modeling, Variational Auto-Encoders (VAEs) still face challenges with the quality of generated images, which often exhibit noticeable blurriness. This issue stems from the unrealistic assumption that the conditional data distribution, $p(\textbf{x} | \textbf{z})$, is an isotropic Gaussian. In this paper, we propose a novel solution to this problem. We illustrate how one can extract a latent space from a pre-existing diffusion model by optimizing an encoder to maximize the marginal data log-likelihood. Furthermore, we demonstrate that a decoder can be derived analytically after encoder training by employing Bayes' rule for scores. This yields a VAE-like deep latent variable model that dispenses with the Gaussian assumption on $p(\textbf{x} | \textbf{z})$ and with the training of a separate decoder network. Our method, which capitalizes on the strengths of pre-trained diffusion models and equips them with latent spaces, results in a significant improvement over the performance of VAEs.
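The decoder that the abstract says can be derived analytically rests on the standard identity often called Bayes' rule for scores. As an illustrative sketch only (the encoder notation $q_\phi(\textbf{z} | \textbf{x})$ and the score-network symbol $s_\theta$ are assumptions introduced here, not taken from the paper), applying $\nabla_{\textbf{x}} \log$ to $p(\textbf{x} | \textbf{z}) \propto p(\textbf{x}) \, q_\phi(\textbf{z} | \textbf{x})$ gives

$\nabla_{\textbf{x}} \log p(\textbf{x} | \textbf{z}) = \nabla_{\textbf{x}} \log p(\textbf{x}) + \nabla_{\textbf{x}} \log q_\phi(\textbf{z} | \textbf{x}) \approx s_\theta(\textbf{x}) + \nabla_{\textbf{x}} \log q_\phi(\textbf{z} | \textbf{x}).$

Under these assumptions, a conditional score, and hence a sampler for $p(\textbf{x} | \textbf{z})$, is obtained by adding an encoder gradient to the pre-trained diffusion score, in the spirit of classifier guidance, so no separate decoder network needs to be trained.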
Keywords
latent space extraction, diffusion models, auto-encoder, pre-trained