Exploring Pooling Strategies based on Idiosyncrasies of Spatio-Temporal Interest Points

ICMR (2015)

Abstract
Recent studies have shown that local space-time interest points are effective and robust for human action recognition, one of the challenging problems in multimedia analysis. While most research focuses on detecting feature points or capturing the spatial and temporal information around them, very little work has examined pooling strategies, which are also important components of action recognition algorithms. In this paper, we propose a novel pooling framework that categorizes interest points according to their idiosyncrasies. Specifically, we discuss three pooling strategies based on optical flow orientation, foreground weight, and spatio-temporal location, respectively, and further investigate the fusion of different pooling strategies. For encoding, we adopt the improved Fisher Vector (FV) approach instead of the popular bag-of-visual-words (BoV) method. We evaluate the proposed methods on a benchmark dataset with controlled settings (KTH) and on two more challenging datasets with realistic backgrounds (HMDB51 and UCF101). The experimental results demonstrate that pooling strategies based on the appropriate idiosyncrasies of individual interest points can improve action classification performance.
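The idea of pooling interest points by one of their idiosyncrasies can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `pool_by_flow_orientation`, the choice of four orientation bins, and the use of plain average pooling in place of improved Fisher Vector encoding are all assumptions made here for illustration. Each interest point is assigned to a bin by the orientation of its optical flow vector, descriptors are pooled within each bin, and the per-bin results are concatenated.

```python
import math


def pool_by_flow_orientation(descriptors, flows, n_bins=4):
    """Group descriptors by optical-flow orientation and pool each group.

    descriptors: list of equal-length feature lists, one per interest point.
    flows: list of (dx, dy) optical flow vectors, one per interest point.
    n_bins: number of orientation bins (hypothetical default, not from the paper).

    Average pooling is used as a simple stand-in for Fisher Vector encoding.
    Returns one flat list of length n_bins * descriptor_dim.
    """
    if not descriptors:
        raise ValueError("at least one descriptor is required")

    dim = len(descriptors[0])
    sums = [[0.0] * dim for _ in range(n_bins)]
    counts = [0] * n_bins

    for desc, (dx, dy) in zip(descriptors, flows):
        # Flow orientation mapped to [0, 2*pi).
        angle = math.atan2(dy, dx) % (2 * math.pi)
        # Clamp guards against rounding that lands exactly on the upper edge.
        b = min(int(angle / (2 * math.pi) * n_bins), n_bins - 1)
        counts[b] += 1
        for i, value in enumerate(desc):
            sums[b][i] += value

    # Concatenate the per-bin averages; empty bins contribute zeros.
    pooled = []
    for b in range(n_bins):
        for s in sums[b]:
            pooled.append(s / counts[b] if counts[b] else 0.0)
    return pooled
```

The same grouping pattern would apply to the other two criteria named in the abstract, with the bin index derived from a foreground weight or a spatio-temporal location instead of the flow angle.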
Keywords
Action recognition, local interest points, pooling strategy, spatio-temporal pyramid, optical flow, foreground weight