E^2GAN: Efficient Training of Efficient GANs for Image-to-Image Translation
CoRR (2024)
Abstract
One highly promising direction for enabling flexible real-time on-device
image editing is data distillation, in which large-scale text-to-image
diffusion models, such as Stable Diffusion, generate paired datasets that
are then used to train generative adversarial networks (GANs). This
approach notably alleviates the need for high-end commercial GPUs when
performing image editing with diffusion models.
However, unlike text-to-image diffusion models, each distilled GAN is
specialized for a specific image editing task, necessitating costly training
efforts to obtain models for various concepts. In this work, we introduce and
address a novel research direction: can the process of distilling GANs from
diffusion models be made significantly more efficient? To achieve this goal, we
propose a series of innovative techniques. First, we construct a base GAN model
with generalized features, adaptable to different concepts through fine-tuning,
eliminating the need for training from scratch. Second, we identify crucial
layers within the base GAN model and employ Low-Rank Adaptation (LoRA) with a
simple yet effective rank search process, rather than fine-tuning the entire
base model. Third, we investigate the minimal amount of data necessary for
fine-tuning, further reducing the overall training time. Extensive experiments
show that we can efficiently empower GANs with the ability to perform real-time
high-quality image editing on mobile devices with remarkably reduced training
cost and storage for each concept.
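
The second technique above, Low-Rank Adaptation, can be illustrated with a
minimal sketch. It is not the paper's implementation: it assumes a plain linear
layer rather than the layers of a GAN, uses pure Python lists instead of a
deep-learning framework, and the names LoRALinear, rank, and alpha are
hypothetical. It only shows the general idea of keeping the base weight W
frozen and storing a small trainable update B·A per concept.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Frozen weight W (d_out x d_in) plus a trainable low-rank update B @ A.

    Illustrative only: a real setup would wrap framework layers and train A and B.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        self.weight = weight          # frozen base-model weights, shared by all concepts
        self.rank = rank
        self.scale = alpha / rank     # common LoRA scaling convention
        d_out, d_in = len(weight), len(weight[0])
        # A starts with small random values and B with zeros, so the adapter
        # initially leaves the base layer's output unchanged.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def effective_weight(self):
        """Return W + scale * (B @ A)."""
        delta = matmul(self.B, self.A)
        return [
            [w + self.scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(self.weight, delta)
        ]

    def forward(self, x):
        """Apply the adapted layer to one input vector x of length d_in."""
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.effective_weight()]

    def trainable_params(self):
        """Number of values stored per concept: rank * (d_in + d_out)."""
        d_out, d_in = len(self.weight), len(self.weight[0])
        return self.rank * (d_in + d_out)

    def full_params(self):
        """Number of values a full fine-tune of this layer would store."""
        return len(self.weight) * len(self.weight[0])


# Hypothetical 4x4 base layer (identity weights) adapted with rank 1.
base = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
layer = LoRALinear(base, rank=1)
```

For a d_out x d_in layer, the adapter stores rank * (d_in + d_out) values per
concept instead of d_out * d_in, which is why a smaller rank reduces both
training cost and per-concept storage.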