Glimpse and Zoom: Spatio-Temporal Focused Dynamic Network for Skeleton-based Action Recognition

IEEE Transactions on Circuits and Systems for Video Technology (2024)

Abstract
GCN-based methods have achieved remarkable performance in skeleton-based action recognition. However, existing methods have not explicitly attempted to remove temporal and spatial redundancy, which can introduce additional computational cost. Inspired by the fact that humans tend to glimpse at the overall motion and then zoom in on the most important spatio-temporal regions, we propose a Spatio-Temporal Focused Dynamic Network (STFD-Net), trained with reinforcement learning, for skeleton-based action recognition. Specifically, we first propose a global extractor with a Skeleton Pooling Module (SPM) that enables the network to focus on overall motion information using a refined skeleton structure. Then, a local extractor, consisting of a pair-wise part partition, a tubelet proposal network, and a Partition-Grouped Module (PGM), is proposed to extract local motion details that complement the overall motion information. Finally, a dynamic classifier uses a recurrent neural network to terminate inference dynamically once the network is sufficiently confident. Extensive experiments demonstrate that the proposed network achieves state-of-the-art-level performance at lower computational cost on the NTU 60 and NTU 120 datasets.
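The early-termination idea in the final step can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the stopping rule, so the fixed confidence threshold, the running mean of logits (used here as a simple stand-in for the recurrent state), and the function name `classify_with_early_exit` are all assumptions made for illustration. The first entry of `step_logits` plays the role of the global "glimpse", and later entries play the role of local "zoom" steps.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_early_exit(step_logits, threshold=0.9):
    """Return (predicted_class, confidence, steps_used).

    step_logits: one list of class logits per processing step.
    threshold:   hypothetical confidence level at which inference stops.
    """
    if not step_logits:
        raise ValueError("need at least one step of logits")
    running = [0.0] * len(step_logits[0])
    for step, logits in enumerate(step_logits, start=1):
        # Assumption: a running mean of logits replaces the RNN state.
        running = [r + (x - r) / step for r, x in zip(running, logits)]
        probs = softmax(running)
        confidence = max(probs)
        if confidence >= threshold:
            break  # confident enough: skip the remaining, costlier steps
    return probs.index(confidence), confidence, step


# An "easy" sample stops after the glimpse; a "hard" one uses every step.
easy = classify_with_early_exit([[10.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
hard = classify_with_early_exit([[0.2, 0.1, 0.0], [3.0, 0.0, 0.0], [9.0, 0.0, 0.0]])
```

Under these assumptions, samples that are recognizable from the global view alone consume fewer steps, which is the source of the computational saving the abstract describes.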
Key words
Action Recognition, Skeleton Data, Dynamic Network, Reinforcement Learning