HRNeXt: High-Resolution Context Network for Crowd Pose Estimation.

IEEE Trans. Multim. (2023)

Abstract
Occlusion handling in crowded scenes is an intractable challenge for human pose estimation. To address this problem, we propose two novel feed-forward network structures, the Global Feed-Forward Network (GFFN) and the Dynamic Feed-Forward Network (DFFN), which are specifically designed for image-based tasks. They capture both local and global contextual information within intermediate features and update feature representations in a way that adapts well to occlusions. By exploiting the context modeling ability of GFFN and DFFN, we present a novel backbone network, the High-Resolution Context Network (HRNeXt), which learns high-resolution representations with abundant contextual information to better estimate the poses of occluded human bodies. Compared to state-of-the-art pose estimation networks, HRNeXt combines the advantages of the convolution operation and the attention mechanism, and it is more efficient in terms of training data size, network parameters, and computational cost. Experimental results show that HRNeXt significantly outperforms state-of-the-art backbone networks on challenging pose estimation datasets with a high occurrence of crowds and occlusions.
Keywords
context, estimation, high-resolution