MIFNet: Multiple instances focused temporal action proposal generation

Neurocomputing (2023)

Abstract
Temporal action proposal generation (TAPG) serves as a promising solution for video analysis. However, the performance of existing methods is still far from satisfactory for real-world applications. We attribute this to a crucial issue: hard multiple instances. In this paper, we investigate why this is the case. We find that, when processing videos containing multiple instances, mainstream approaches tend to recognize multiple instances as a single instance, either because of boundary ambiguity or because they ignore insignificant backgrounds between the instances. To address this problem, we propose a Multiple Instances Focused Network (MIFNet) that improves the quality of action proposals by considering boundary correlations and fusing multi-scale proposals. In particular, we first propose a pure boundary embedding module, the Boundary Constraint Module (BCM), which suppresses the generation of hard negative proposals by evaluating boundary correlation. The BCM introduces a boundary contrastive learning strategy that pulls the representations of positive boundary pairs closer together and pushes the representations of negative pairs apart. We then propose a Proposal Blending Module (PBM), which augments the proposal-level representation by modeling information among multi-scale proposals, so that proposals are complemented with both local details and global information. Experimental results on the ActivityNet-v1.3 and THUMOS14 benchmarks demonstrate that MIFNet outperforms state-of-the-art methods. © 2023 Published by Elsevier B.V.
Keywords
Video understanding, Temporal action proposal, Temporal action detection, Contrastive learning, Multiple instances
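
The abstract does not give the exact form of the boundary contrastive objective. As a rough illustration of the general idea only (pulling positive boundary pairs together and pushing negative pairs apart), the following minimal sketch uses an InfoNCE-style loss. The choice of InfoNCE, the pairing of start and end embeddings by row index, the function name `boundary_contrastive_loss`, and the `temperature` parameter are all assumptions made for this example and are not taken from the paper.

```python
import math


def _normalize(vec):
    """L2-normalize a vector (assumed non-zero) so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def boundary_contrastive_loss(start_emb, end_emb, temperature=0.1):
    """InfoNCE-style loss over boundary embeddings (illustrative assumption).

    start_emb, end_emb: lists of N equal-length vectors. Row i of each list
    is treated as a positive (start, end) pair; every other combination is
    treated as a negative pair.
    """
    starts = [_normalize(v) for v in start_emb]
    ends = [_normalize(v) for v in end_emb]
    total = 0.0
    for i, s in enumerate(starts):
        # Scaled cosine similarity between start i and every end embedding.
        logits = [sum(a * b for a, b in zip(s, e)) / temperature for e in ends]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability of the positive pair (index i).
        total += log_denom - logits[i]
    return total / len(starts)


# Toy usage: three orthogonal embeddings, correctly paired vs. mispaired.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shifted = identity[1:] + identity[:1]
aligned_loss = boundary_contrastive_loss(identity, identity)
mispaired_loss = boundary_contrastive_loss(identity, shifted)
```

In this toy setup the loss is lower when each start embedding is matched with its own end embedding than when the pairs are shifted, which is the behavior a contrastive objective of this kind is meant to reward.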