Learning Space-Time Semantic Correspondences

CoRR (2023)

Abstract
We propose a new task of space-time semantic correspondence prediction in videos. Given a source video, a target video, and a set of space-time keypoints in the source video, the task requires predicting a set of keypoints in the target video that are the semantic correspondences of the provided source keypoints. We believe that this task is important for fine-grained video understanding, potentially enabling applications such as activity coaching, sports analysis, robot imitation learning, and more. Our contributions in this paper are: (i) proposing a new task and providing annotations for space-time semantic correspondences on two existing benchmarks: Penn Action and Pouring; and (ii) presenting a comprehensive set of baselines and experiments to gain insights about the new problem. Our main finding is that the space-time semantic correspondence prediction problem is best approached jointly in space and time rather than through its decomposed sub-problems: time alignment and spatial correspondences.
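To make the task's input/output structure concrete, here is a minimal sketch of a hypothetical prediction interface; the names, types, and shapes below are assumptions for illustration and are not taken from the paper. Each query is a (frame, x, y) keypoint in the source video, and the model must return one semantically corresponding space-time keypoint in the target video per query.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class SpaceTimeKeypoint:
    """A point localized in both time and space within a video."""
    t: int    # frame index
    x: float  # horizontal image coordinate (pixels)
    y: float  # vertical image coordinate (pixels)

def predict_correspondences(
    source_video: np.ndarray,                    # shape (T_src, H, W, 3)
    target_video: np.ndarray,                    # shape (T_tgt, H, W, 3)
    source_keypoints: list[SpaceTimeKeypoint],
) -> list[SpaceTimeKeypoint]:
    """Return, for each source keypoint, its semantic correspondence
    (a space-time keypoint) in the target video.

    Placeholder body: an actual model would reason jointly over space
    and time rather than solving temporal alignment and spatial
    correspondence separately, per the paper's main finding.
    """
    raise NotImplementedError
```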
Keywords
learning, space-time