HIGSA: Human image generation with self-attention

Advanced Engineering Informatics (2023)

Abstract
The goal of human image generation (HIG) is to synthesize an image of a person in a novel pose. HIG can potentially benefit various computer vision applications and engineering tasks. Recently developed CNN-based approaches apply attention architectures to vision tasks. However, owing to the locality of CNNs, extracting and maintaining long-range pixel interactions in input images is difficult, so existing human image generation methods suffer from limited content representation. In this paper, we propose a novel human image generation framework called HIGSA that utilizes the position information in the input source image. HIGSA contains two complementary self-attention blocks for generating photo-realistic human images: the stripe self-attention block (SSAB) and the content attention block (CAB). In SSAB, we establish global dependencies across the human image and compute the attention map for each pixel based on its spatial position relative to other pixels. In CAB, we introduce an effective feature extraction module that interactively enhances the representations of both the person's appearance and shape features. As a result, the HIGSA framework better preserves appearance consistency and shape consistency, with sharper details. Extensive experiments on mainstream datasets demonstrate that HIGSA achieves state-of-the-art (SOTA) results.
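To make the idea of position-aware attention concrete, the following is a minimal sketch of generic self-attention over a pixel grid with an additive relative-position bias. It is not the paper's SSAB or CAB: the function name `relative_position_attention`, the `rel_bias` lookup table, and the use of the raw features as queries, keys, and values (no learned projections) are all assumptions made for illustration.

```python
import math


def relative_position_attention(features, height, width, rel_bias):
    """Generic self-attention over a flattened height x width grid.

    features: list of height*width feature vectors (lists of floats),
              stored in row-major order.
    rel_bias: dict mapping a relative offset (dy, dx) to a scalar bias.
              This is a hypothetical stand-in for a learned bias table;
              offsets missing from the dict contribute 0.
    Returns (outputs, attention), where attention[i][j] is the weight
    that pixel i assigns to pixel j.
    """
    n = height * width
    dim = len(features[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, attention = [], []

    for i in range(n):
        yi, xi = divmod(i, width)
        scores = []
        for j in range(n):
            yj, xj = divmod(j, width)
            # Content term: scaled dot product between pixel features.
            dot = sum(a * b for a, b in zip(features[i], features[j]))
            # Position term: bias looked up by the offset from pixel i to j.
            bias = rel_bias.get((yj - yi, xj - xi), 0.0)
            scores.append(dot * scale + bias)

        # Numerically stable softmax over all pixels in the grid.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]

        attention.append(weights)
        # Output for pixel i is the attention-weighted sum of all features.
        outputs.append(
            [sum(w * features[j][d] for j, w in enumerate(weights))
             for d in range(dim)]
        )

    return outputs, attention
```

Because every pixel attends to every other pixel, the attention map captures long-range interactions that a single convolution with a small kernel cannot, at a cost that grows quadratically with the number of pixels.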
Keywords
Deep learning, GAN, Human image generation, Attention