
Spatiotemporal Orthogonal Projection Capsule Network for Incremental Few-Shot Action Recognition

IEEE Transactions on Multimedia (2024)

Abstract
In this paper, we propose a new task named incremental few-shot action recognition (IFSAR), which aims to learn new action classes incrementally from limited samples. Existing few-shot class-incremental learning methods are mainly designed for image datasets and cannot be directly applied to action recognition because of the complicated temporal evolution and spatial structure in videos. Moreover, the incremental and few-shot setting further intensifies the catastrophic forgetting and overfitting problems in the video domain. To address these issues, we propose a spatiotemporal orthogonal projection capsule network (STOP), which employs a spatiotemporal attention routing mechanism and an orthogonal projection capsule layer for effective IFSAR. The former encodes spatial and temporal transformation information and explores action part-whole relationships to prevent catastrophic forgetting, while the latter is designed to maintain a sufficient distance between the prototypes of old and novel classes, using spatial-temporal features, to avoid overfitting. Extensive experimental results demonstrate that the proposed method outperforms a series of state-of-the-art approaches on the UCF-101, Kinetics-100, and HMDB-51 datasets.
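The abstract does not say how the orthogonal projection capsule layer is built. As a generic illustration only, and not the STOP layer itself, the idea of keeping a novel-class prototype away from old-class prototypes can be sketched by removing the component of the new prototype that lies in the span of the old ones. The function names, the use of Gram-Schmidt, and the toy vectors below are all assumptions made for this sketch.

```python
# Illustrative sketch only: generic orthogonal projection of a new prototype
# away from the subspace spanned by old prototypes. This is NOT the paper's
# capsule layer; all names and values here are hypothetical.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def orthonormal_basis(vectors, eps=1e-12):
    """Gram-Schmidt: orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > eps:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis

def project_out(old_protos, new_proto):
    """Return new_proto with its component in span(old_protos) removed."""
    w = list(new_proto)
    for b in orthonormal_basis(old_protos):
        c = dot(w, b)
        w = [wi - c * bi for wi, bi in zip(w, b)]
    return w

# Toy example: two old-class prototypes and one novel-class prototype.
old_protos = [[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
new_proto = [2.0, 3.0, 4.0, 5.0]
projected = project_out(old_protos, new_proto)
print(projected)
```

After projection, the new prototype has zero inner product with every old prototype, which is one simple way to read "maintaining a sufficient distance" between old and novel classes; the paper's actual layer may differ substantially.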
Keywords
Incremental few-shot action recognition, capsule network, class-incremental learning