VideoLSTM convolves, attends and flows for action recognition

Computer Vision and Image Understanding (2018)

Abstract
• To exploit both the spatial and temporal correlations in a video, we hardwire convolutions into the soft-attention LSTM architecture.
• We introduce motion-based attention, which better guides the attention towards the relevant spatio-temporal locations of the actions.
• We demonstrate how the attention generated by our VideoLSTM can be used for action localization, relying on the action class label only.
• We show the theoretical as well as practical merits of our VideoLSTM against other LSTM architectures for action classification and localization.
Keywords
Action recognition, Video representation, Attention, LSTM