谷歌浏览器插件
订阅小程序
在清言上使用

Dynamic Context Removal: A General Training Strategy for Robust Models on Video Action Predictive Tasks

International journal of computer vision(2023)

引用 1|浏览26
暂无评分
摘要
Predicting future actions is an essential feature of intelligent systems and embodied AI. However, compared to the traditional recognition tasks, the uncertainty of the future and the reasoning ability requirement make prediction tasks very challenging and far beyond solved. In this field, previous methods usually care more about the model architecture design but little attention has been put on how to train models with a proper learning policy. To this end, in this work, we propose a simple but effective training strategy, Dynamic Context Removal (DCR), which dynamically schedules the visibility of context in different training stages. It follows the human-like curriculum learning process, i.e., gradually removing the event context to increase the prediction difficulty till satisfying the final prediction target. Besides, we explore how to train robust models that give consistent predictions at different levels of observable context. Our learning scheme is plug-and-play and easy to integrate widely-used reasoning models including Transformer and LSTM, with advantages in both effectiveness and efficiency. We study two action prediction problems, i.e., Video Action Anticipation and Early Action Recognition. In extensive experiments, our method achieves state-of-the-art results on several widely-used benchmarks.
更多
查看译文
关键词
Dynamic Context Removal,Video Action Anticipation,Early Action Recognition,Robustness
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要