Learning Asynchronous and Sparse Human-Object Interaction in Videos

2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021)

Cited 31 | Views 7666

Abstract
Human activities can be learned from video. With effective modeling, it is possible to discover not only the action labels but also the temporal structure of the activities, such as the progression of the sub-activities. Automatically recognizing such structure from the raw video signal is a new capability that promises authentic modeling and successful recognition of human-object interactions. Toward this goal, we introduce Asynchronous-Sparse Interaction Graph Networks (ASSIGN), a recurrent graph network that is able to automatically detect the structure of interaction events associated with entities in a video scene. ASSIGN pioneers the learning of autonomous behavior of video entities, including their dynamic structure and their interaction with coexisting neighbors. Entities in our model live asynchronously from one another and are therefore more flexible in adapting to complex scenarios. Their interactions are sparse in time, hence more faithful to their true underlying nature and more robust in inference and learning. ASSIGN is tested on human-object interaction recognition and shows superior performance in segmenting and labeling human sub-activities and object affordances from raw videos. The native ability of ASSIGN to discover temporal structure also eliminates the dependence on external segmentation that was previously mandatory for this task.
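The abstract's two core ideas, per-entity recurrent states that update asynchronously and inter-entity messages that are sparse in time, can be illustrated with a short sketch. The code below is a minimal, hypothetical PyTorch rendering, not the authors' implementation: the class name ASSIGNSketch, the soft update gate standing in for asynchronous event detection, and the per-frame adjacency mask standing in for sparse interactions are all assumptions made for illustration.

```python
# Minimal sketch (NOT the authors' code) of asynchronous, sparse recurrent
# updates over video entities: each entity keeps its own hidden state, only
# masked neighbors exchange messages, and a gate decides whether an entity's
# state updates this frame or is carried over unchanged.
import torch
import torch.nn as nn

class ASSIGNSketch(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, feat_dim: int, hidden_dim: int):
        super().__init__()
        # Per-entity recurrent cell: consumes the entity's own observation
        # plus messages aggregated from its neighbors.
        self.cell = nn.GRUCell(feat_dim + hidden_dim, hidden_dim)
        self.message = nn.Linear(hidden_dim, hidden_dim)
        # Soft stand-in for asynchronous event detection: a learned gate
        # deciding whether this entity updates at this time step.
        self.update_gate = nn.Linear(feat_dim + hidden_dim, 1)

    def forward(self, feats, hidden, adjacency):
        # feats:     (N, feat_dim)   per-frame entity features
        # hidden:    (N, hidden_dim) per-entity recurrent states
        # adjacency: (N, N) 0/1 interaction mask for this frame; most
        #            entries are zero most of the time (sparse in time)
        msgs = adjacency @ self.message(hidden)      # neighbor messages
        inp = torch.cat([feats, msgs], dim=-1)
        gate = torch.sigmoid(self.update_gate(inp))  # update vs. carry over
        new_hidden = self.cell(inp, hidden)
        return gate * new_hidden + (1 - gate) * hidden

# Usage: three entities (e.g. one human, two objects) over five frames,
# with a hand-written sparse interaction mask per frame.
model = ASSIGNSketch(feat_dim=16, hidden_dim=32)
hidden = torch.zeros(3, 32)
for _ in range(5):
    feats = torch.randn(3, 16)
    adjacency = torch.tensor([[0., 1., 0.],
                              [1., 0., 0.],
                              [0., 0., 0.]])  # only entities 0 and 1 interact
    hidden = model(feats, hidden, adjacency)
print(hidden.shape)  # torch.Size([3, 32])
```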
Keywords
Asynchronous-Sparse Interaction Graph Networks, recurrent graph network, interaction events, video scene, ASSIGN pioneers, video entities, dynamic structure, inference, learning, human-object interaction recognition, segmenting, human sub-activities, object affordances, raw videos, temporal structure, sparse human-object interaction, human activities, effective modeling, action labels, raw video signal, authentic modeling, successful recognition, human-object interactions