Paste and Harmonize via Denoising: Subject-Driven Image Editing with Frozen Pre-Trained Diffusion Model

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)(2024)

Abstract
Text-to-image generative models have shown a remarkable ability to produce high-quality images. However, existing methods still struggle with exemplar-guided image editing: they tend to lose the identity of the objects given in the exemplar image. To address this problem, we propose a new framework called Paste and Harmonize via Denoising, which leverages frozen pre-trained diffusion models to transfer objects from an exemplar image to the edited image under text guidance while preserving their appearance and characteristics. The framework consists of two main steps: paste, and harmonize via denoising. In the paste step, an off-the-shelf text-driven model localizes the objects in the exemplar image, and the resulting object patches are pasted into the edited image. This naturally turns the editing task into an image harmonization task. In the harmonize-via-denoising step, we introduce an image harmonization module based on pre-trained diffusion models that blends the inserted object with the target image, producing a coherent and realistic result without compromising synthesis quality, while retaining the ability to perform text-driven style transfer. In the experiments, qualitative comparisons with baselines show that our method performs exemplar-based image editing with high fidelity on both training and in-the-wild images. More qualitative and quantitative results can be found at our website.
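The paste step described above can be illustrated with a minimal sketch. It assumes the object has already been localized, so a binary mask over the exemplar patch is available; the function name `paste_object`, its arguments, and the nested-list image representation are hypothetical choices made to keep the example dependency-free, not the authors' implementation. The harmonization step, which needs a pre-trained diffusion model, is not shown.

```python
from copy import deepcopy


def paste_object(target, patch, mask, top, left):
    """Copy the masked pixels of `patch` into a copy of `target`.

    target: H x W nested list of pixels (e.g. RGB tuples)
    patch:  h x w nested list of pixels cropped from the exemplar image
    mask:   h x w nested list of booleans, True where the object is
    top, left: position of the patch's upper-left corner in the target

    Returns the composite image and a full-size mask of the pasted
    region, which a later harmonization stage could use to know where
    the inserted object lies. (Hypothetical helper, for illustration.)
    """
    h, w = len(mask), len(mask[0])
    height, width = len(target), len(target[0])
    if top < 0 or left < 0 or top + h > height or left + w > width:
        raise ValueError("object patch does not fit inside the target image")

    composite = deepcopy(target)  # leave the input image untouched
    paste_mask = [[False] * width for _ in range(height)]
    for i in range(h):
        for j in range(w):
            if mask[i][j]:
                composite[top + i][left + j] = patch[i][j]
                paste_mask[top + i][left + j] = True
    return composite, paste_mask
```

The raw composite produced this way typically has visible seams and mismatched lighting, which is why the abstract treats the remaining work as a harmonization problem.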
Keywords
Deep Generative Model, Diffusion Model, Subject-driven Image Editing