Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models
arXiv (2024)
Abstract
Video Anomaly Detection (VAD) is crucial for applications such as security
surveillance and autonomous driving. However, existing VAD methods provide
little rationale behind detection, hindering public trust in real-world
deployments. In this paper, we approach VAD with a reasoning framework.
Although Large Language Models (LLMs) have shown revolutionary reasoning
ability, we find that their direct use falls short for VAD. Specifically, the
implicit knowledge pre-trained in LLMs focuses on general context and thus may
not apply to every specific real-world VAD scenario, leading to inflexibility
and inaccuracy. To address this, we propose AnomalyRuler, a novel rule-based
reasoning framework for VAD with LLMs. AnomalyRuler comprises two main stages:
induction and deduction. In the induction stage, the LLM is fed with few-shot
normal reference samples and then summarizes these normal patterns to induce a
set of rules for detecting anomalies. The deduction stage follows the induced
rules to spot anomalous frames in test videos. Additionally, we design rule
aggregation, perception smoothing, and robust reasoning strategies to further
enhance AnomalyRuler's robustness. AnomalyRuler is the first reasoning approach
for the one-class VAD task, which requires only few-normal-shot prompting
without the need for full-shot training, thereby enabling fast adaptation to
various VAD scenarios. Comprehensive experiments across four VAD benchmarks
demonstrate AnomalyRuler's state-of-the-art detection performance and reasoning
ability.
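
The two stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each frame has already been converted to a short text description, and the names `induce_rules`, `detect_anomalies`, and `toy_llm`, along with the prompt wording, are hypothetical. `toy_llm` is a keyword-based stand-in so the example runs without a real model; in practice any function that maps a prompt string to a reply string could be passed instead.

```python
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt to a text reply.
LLM = Callable[[str], str]


def induce_rules(llm: LLM, normal_descriptions: List[str]) -> List[str]:
    """Induction: summarize few-shot normal samples into a list of rules."""
    prompt = (
        "Summarize the normal patterns below as a list of rules:\n"
        + "\n".join(normal_descriptions)
    )
    reply = llm(prompt)
    # One rule per non-empty line; drop leading bullet characters.
    return [line.strip("- ").strip() for line in reply.splitlines() if line.strip()]


def detect_anomalies(llm: LLM, rules: List[str], frames: List[str]) -> List[bool]:
    """Deduction: check each frame description against the induced rules."""
    flags = []
    for frame in frames:
        prompt = (
            "Rules for normal behavior:\n"
            + "\n".join(rules)
            + "\nAnswer 'normal' or 'anomaly' for the frame below.\nFrame: "
            + frame
        )
        answer = llm(prompt).strip().lower()
        flags.append(answer.startswith("anomal"))
    return flags


def toy_llm(prompt: str) -> str:
    """Stand-in for a real LLM (assumption): canned rules, keyword matching."""
    if prompt.startswith("Summarize"):
        return "- People walk on the sidewalk\n- No vehicles on the sidewalk"
    frame = prompt.rsplit("Frame:", 1)[1]
    return "anomaly" if "bicycle" in frame else "normal"


normal_samples = ["a person walking on the sidewalk", "two people walking together"]
rules = induce_rules(toy_llm, normal_samples)
test_frames = ["a person walking", "a person riding a bicycle on the sidewalk"]
flags = detect_anomalies(toy_llm, rules, test_frames)
```

Because the rules are induced from only a few normal samples at prompt time, moving to a new scene in this sketch means calling `induce_rules` again with new reference descriptions rather than retraining anything.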