CIPS-3D++: End-to-End Real-Time High-Resolution 3D-Aware GANs for GAN Inversion and Stylization.

IEEE transactions on pattern analysis and machine intelligence(2023)

引用 1|浏览72
暂无评分
摘要
Style-based GANs achieve state-of-the-art results for generating high-quality images, but lack explicit and precise control over camera poses. Recently proposed NeRF-based GANs have made great progress towards 3D-aware image generation. However, the methods either rely on convolution operators which are not rotationally invariant, or utilize complex yet suboptimal training procedures to integrate both NeRF and CNN sub-structures, yielding un-robust, low-quality images with a large computational burden. On top of our open-source CIPS-3D framework  https://github.com/PeterouZh/CIPS-3D, this paper presents an upgraded version called CIPS-3D++, aiming at high-robust, high-resolution and high-efficiency 3D-aware GANs. On the one hand, our basic model CIPS-3D, encapsulated in a style-based architecture, features a shallow NeRF-based 3D shape encoder as well as a deep MLP-based 2D image decoder, achieving robust image generation/editing with rotation-invariance. On the other hand, our proposed CIPS-3D++, inheriting the rotational invariance of CIPS-3D, together with geometric regularization and upsampling operations, encourages high-resolution high-quality image generation/editing with great computational efficiency. Trained on raw single-view images, without any bells and whistles, CIPS-3D++ sets new records for 3D-aware image synthesis, with an impressive FID of 3.2 on FFHQ at the 1024×1024 resolution. In the meantime, CIPS-3D++ runs efficiently and enjoys a low GPU memory footprint so that it can be trained end-to-end on high-resolution images directly, in contrast to previous alternate/progressive methods. Based on the infrastructure of CIPS-3D++, we propose a 3D-aware GAN inversion algorithm named FlipInversion, which can reconstruct the 3D object from a single-view image. We also provide a 3D-aware stylization method for real images based on CIPS-3D++ and FlipInversion. In addition, we analyze the problem of mirror symmetry suffered in training, and solve it by introducing an auxiliary discriminator for the NeRF network. Overall, CIPS-3D++ provides a strong base model that can serve as a testbed for transferring GAN-based image editing methods from 2D to 3D. Our open-source project as well as accompanying demo videos can be found online  https://github.com/PeterouZh/CIPS-3Dplusplus.
更多
查看译文
关键词
gans inversion,end-to-end,real-time,high-resolution,d-aware
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要