Swin-UNIT: Transformer-based GAN for High-resolution Unpaired Image Translation

Yifan Li, Yaochen Li, Wenneng Tang, Zhifeng Zhu, Jinhuo Yang, Yuehu Liu

MM '23: Proceedings of the 31st ACM International Conference on Multimedia (2023)

Abstract
Transformer models have achieved considerable success in a variety of computer vision tasks owing to their capacity for modeling long-range dependencies. However, their application to high-resolution unpaired image translation with GANs has been limited, because the cost of self-attention grows quadratically with the spatial resolution of the input features. In this paper, we propose Swin-UNIT, a novel transformer-based GAN for high-resolution unpaired image translation. We design a two-stage generator consisting of a global style translation (GST) module and a recurrent detail supplement (RDS) module. The GST module uses self-attention to translate low-resolution global features. The RDS module uses cross-attention to propagate information quickly from the global features to the high-resolution detail features. Moreover, we customize a dual-branch discriminator to guide the generator. Extensive experiments demonstrate that our model achieves state-of-the-art results on unpaired image translation tasks.
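To make the quadratic-complexity argument concrete, the following minimal sketch counts query-key pairs for full self-attention over a feature map and for cross-attention in which high-resolution tokens attend to a low-resolution feature map. The function names and the resolutions used are illustrative assumptions, not values from the paper, and the sketch says nothing about how the GST or RDS modules are actually implemented.

```python
def self_attention_pairs(height: int, width: int) -> int:
    """Query-key pairs for full self-attention over an H x W feature map.

    Every token attends to every token, so the count is (H * W) ** 2.
    """
    tokens = height * width
    return tokens * tokens


def cross_attention_pairs(hi_height: int, hi_width: int,
                          lo_height: int, lo_width: int) -> int:
    """Query-key pairs when high-resolution tokens (queries) attend to a
    low-resolution feature map (keys).

    The count is linear in the number of high-resolution tokens for a
    fixed low-resolution map.
    """
    return (hi_height * hi_width) * (lo_height * lo_width)


if __name__ == "__main__":
    # Illustrative (assumed) sizes: a fixed 32 x 32 low-resolution map.
    low = 32
    for side in (32, 64, 128, 256):
        full = self_attention_pairs(side, side)
        cross = cross_attention_pairs(side, side, low, low)
        print(f"{side}x{side}: self={full}, cross-to-{low}x{low}={cross}")
```

Doubling the side length of the feature map multiplies the self-attention count by 16, whereas the cross-attention count against a fixed low-resolution map grows only by 4, which is the kind of saving the abstract's two-stage design appears to target.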