Human-Scene Network: A novel baseline with self-rectifying loss for weakly supervised video anomaly detection

Computer Vision and Image Understanding (2024)

Abstract
Video anomaly detection in surveillance systems with only video-level labels (i.e., weakly supervised) is challenging. This is due to (i) the complex integration of a large variety of scenarios, including human- and scene-based anomalies characterized by subtle or sharp spatio-temporal cues in real-world videos, and (ii) non-optimal optimization between normal and anomaly instances under weak supervision. In this paper, we propose a Human-Scene Network to learn discriminative representations by capturing both subtle and strong cues in a dissociative manner. In addition, a self-rectifying loss is proposed that dynamically computes pseudo-temporal annotations from video-level labels to optimize the Human-Scene Network effectively. The proposed Human-Scene Network, optimized with the self-rectifying loss, is validated on three publicly available datasets, i.e., UCF-Crime, ShanghaiTech, and IITB-Corridor, outperforming recently reported state-of-the-art approaches on five out of the six scenarios considered.
Keywords
Video anomaly detection, Weakly-supervised learning