Learning Space-Time Semantic Correspondences

CoRR (2023)

Abstract
We propose a new task of space-time semantic correspondence prediction in videos. Given a source video, a target video, and a set of space-time keypoints in the source video, the task requires predicting a set of keypoints in the target video that are the semantic correspondences of the provided source keypoints. We believe that this task is important for fine-grained video understanding, potentially enabling applications such as activity coaching, sports analysis, robot imitation learning, and more. Our contributions in this paper are: (i) proposing a new task and providing annotations for space-time semantic correspondences on two existing benchmarks: Penn Action and Pouring; and (ii) presenting a comprehensive set of baselines and experiments to gain insights about the new problem. Our main finding is that the space-time semantic correspondence prediction problem is best approached jointly in space and time rather than through its decomposed sub-problems: time alignment and spatial correspondences.
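To make the task's input/output structure concrete, here is a minimal sketch of a hypothetical prediction interface; the names, types, and shapes below are assumptions for illustration and are not taken from the paper. Each query is a (frame, x, y) keypoint in the source video, and the model must return one semantically corresponding space-time keypoint in the target video per query.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class SpaceTimeKeypoint:
    """A point localized in both time and space within a video."""
    t: int    # frame index
    x: float  # horizontal image coordinate (pixels)
    y: float  # vertical image coordinate (pixels)

def predict_correspondences(
    source_video: np.ndarray,                    # shape (T_src, H, W, 3)
    target_video: np.ndarray,                    # shape (T_tgt, H, W, 3)
    source_keypoints: list[SpaceTimeKeypoint],
) -> list[SpaceTimeKeypoint]:
    """Return, for each source keypoint, its semantic correspondence
    (a space-time keypoint) in the target video.

    Placeholder body: an actual model would reason jointly over space
    and time rather than solving temporal alignment and spatial
    correspondence separately, per the paper's main finding.
    """
    raise NotImplementedError
```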
Keywords
learning, space-time