Modeling Action Spatiotemporal Relationships using Graph-based Class-level Attention Network for Long-term Action Detection

Yuankai Wu, Xin Su, Driton Salihu, Hao Xing, Marsil Zakour, Constantin Patsch

2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023

Abstract
In recent years, action detection has become an active research topic in fields such as human-robot interaction and assistive robots. Most previous methods in this field focus on temporal processing of the action representation and do not consider the dependencies among action classes. However, the actions that occur in a video are frequently related, and this correlation can offer effective clues for detection. In this work, we propose to exploit the information of related action classes with a graph neural network in conjunction with temporal modeling. We introduce the attention-based temporal class (ATC) module, which models the inherent action dependencies on a graph and learns action-specific features along the temporal dimension with a dual-branch attention mechanism. Further, we present the Graph-based Class-level Attention Network (GCAN), which is built from ATC modules with increasing temporal receptive fields to handle action instances in complex untrimmed videos. Our network is evaluated on two challenging, densely annotated benchmark datasets: Charades and MultiTHUMOS. Experimental results show that our approach achieves highly competitive results with significantly reduced model complexity.
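The abstract does not give the internals of the ATC module, so the following is only a toy illustration of the general idea it names: mixing per-class features along a class graph, then attending over time for each class. Everything here is an assumption made for illustration, including the function names, the row-normalized adjacency with self-loops, the identity (unlearned) attention projections, and the sequential ordering of the two steps. It is not the paper's dual-branch design and has no learned weights.

```python
import math

def row_normalize_with_self_loops(adj):
    """Add self-loops to a C x C adjacency matrix and make each row sum to 1."""
    n = len(adj)
    out = []
    for i in range(n):
        row = [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        total = sum(row)
        out.append([v / total for v in row])
    return out

def graph_mix(class_feats, adj_hat):
    """Mix class vectors along graph edges: out[i] = sum_j adj_hat[i][j] * class_feats[j]."""
    n, dim = len(class_feats), len(class_feats[0])
    return [
        [sum(adj_hat[i][j] * class_feats[j][d] for j in range(n)) for d in range(dim)]
        for i in range(n)
    ]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def temporal_self_attention(seq):
    """Scaled dot-product self-attention over T vectors (identity projections)."""
    dim = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, seq)) for d in range(dim)])
    return out

def toy_class_temporal_block(feats, adj):
    """feats[t][c] is a D-dim vector for class c at time t; output has the same shape."""
    adj_hat = row_normalize_with_self_loops(adj)
    # Class step (hypothetical): share information between related classes per frame.
    mixed = [graph_mix(frame, adj_hat) for frame in feats]
    num_steps, num_classes = len(feats), len(feats[0])
    out = [[None] * num_classes for _ in range(num_steps)]
    # Temporal step (hypothetical): each class attends over its own timeline.
    for c in range(num_classes):
        attended = temporal_self_attention([mixed[t][c] for t in range(num_steps)])
        for t in range(num_steps):
            out[t][c] = attended[t]
    return out
```

In this sketch, two classes connected by an edge end up with blended features before the temporal step, which is one simple way class correlations could inform per-class temporal modeling.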