Unsupervised Video-Based Action Recognition With Imagining Motion and Perceiving Appearance

IEEE Transactions on Circuits and Systems for Video Technology (2023)

Abstract
Video-based action recognition is a challenging task that demands careful consideration of the temporal properties of videos in addition to their appearance attributes. In particular, the temporal domain of raw videos usually contains significantly more redundant or irrelevant information than still images. To address this, this paper proposes an unsupervised video-based action recognition approach with imagining motion and perceiving appearance, called IMPA, which comprehensively learns the spatio-temporal characteristics inherent in videos, with a particular emphasis on the moving object. Specifically, a self-supervised Motion Extracting Block (MEB) is designed to extract the principal motion features by focusing on the large movements of the moving object, based on the observation that humans can infer complete motion trajectories from partial moving objects. To further account for the indispensable appearance attributes of videos, an unsupervised Appearance Learning Block (ALB) is developed to perceive static appearance and is combined with the MEB to recognize actions. Extensive validation experiments and ablation studies on multiple datasets demonstrate that the proposed IMPA approach surpasses other classical and state-of-the-art unsupervised action recognition methods.
Keywords
Action recognition, unsupervised, imagine