DeepActsNet: A deep ensemble framework combining features from face, hands, and body for action recognition.

Pattern Recognit.(2023)

引用 2|浏览7
暂无评分
摘要
Human action recognition from videos has gained substantial focus due to its wide applications in the field of video understanding. Most of the existing approaches extract human skeleton data from videos to encode actions because of the invariance nature of the skeleton information with respect to lightning con-ditions and background changes. Despite their success in achieving high recognition accuracy, methods based on limited body joints fail to capture the nuances of subtle body parts which are highly relevant for discriminating similar actions. In this paper, we overcome this limitation by presenting a holistic frame-work for combining spatial and motion features from the body, face, and hands to develop a novel data representation termed "Deep Actions Stamps (DeepActs)" for video-based action recognition. Compared to the skeleton sequences based on limited body joints, DeepActs encode more effective spatio-temporal features that provide robustness against pose estimation noises and improve action recognition accuracy. We also present "DeepActsNet", a deep learning based ensemble model which learns convolutional and structural features from Deep Action Stamps for highly accurate action recognition. Experiments on three challenging action recognition datasets (NTU60, NTU120, and SYSU) show that the proposed model pro-duces significant improvements in the action recognition accuracy with less computational cost compared to the state-of-the-art methods.(c) 2023 Elsevier Ltd. All rights reserved.
更多
查看译文
关键词
deepactsnet ensemble framework,action,hands,recognition
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要