Weakly-supervised Temporal Action Localization with Adaptive Clustering and Refining Network

ICME(2023)

引用 0|浏览20
暂无评分
摘要
Weakly-supervised temporal action localization task aims to localize temporal boundaries of action instances by using only video-level labels. Existing methods primarily adopt Multi-Instance-Learning (MIL) scheme to handle this task. The effectiveness of MIL scheme depends heavily on the selection of top-k action snippets, which is unstable and requires manual tuning. To address these deficiencies, we propose an Adaptive Clustering and Refining Network (ACRNet). Specifically, we present an action-aware clustering strategy that is adaptable and requires no manual tuning to separate action and background snippets of diverse videos based on intra-class activation distribution. And a cluster refining step is included to eliminate false action snippets by considering inter-class activation distribution, which greatly improves robustness and localization accuracy. Extensive experiments on THUMOS14, ActivityNet 1.2&1.3 benchmarks show that our method achieves state-of-the-art performance.
更多
查看译文
关键词
Temporal Action Localization,Weakly-supervised Learning,Adaptive Clustering
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要