Fast generalized subset scan for anomalous pattern detection

Journal of Machine Learning Research (2013)

Abstract
We propose Fast Generalized Subset Scan (FGSS), a new method for detecting anomalous patterns in general categorical data sets. We frame the pattern detection problem as a search over subsets of data records and attributes, maximizing a nonparametric scan statistic over all such subsets. We prove that the nonparametric scan statistics possess a novel property that allows for efficient optimization over the exponentially many subsets of the data without an exhaustive search, enabling FGSS to scale to massive and high-dimensional data sets. We evaluate the performance of FGSS in three real-world application domains (customs monitoring, disease surveillance, and network intrusion detection), and demonstrate that FGSS can successfully detect and characterize relevant patterns in each domain. As compared to three other recently proposed detection algorithms, FGSS substantially decreased run time and improved detection power for massive multivariate data sets.
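The core idea of optimizing over exponentially many subsets without an exhaustive search can be illustrated with a minimal Python sketch. This is not the authors' implementation and rests on several assumptions that the abstract does not state: each record already has one empirical p-value per attribute, the attribute subset is held fixed (all attributes), the scan statistic is a Higher Criticism statistic, and the significance threshold `alpha` is taken from a small hypothetical grid. The names `hc_score`, `fast_best_subset`, and `brute_force_best_score` are illustrative only. Under these assumptions every subset of k records contains the same number of p-values, so the best subset of size k is simply the k records with the most p-values below `alpha`, and only n prefixes of a sorted list need to be scored per threshold.

```python
import itertools
import math
import random


def hc_score(n_alpha, n, alpha):
    """Higher Criticism statistic (assumed here): excess of p-values <= alpha
    over the number expected under the null, in standard-deviation units."""
    return (n_alpha - n * alpha) / math.sqrt(n * alpha * (1 - alpha))


def fast_best_subset(pvals, alphas):
    """Sort-and-scan search over record subsets with the attribute set fixed.

    pvals[i][j] is the (assumed precomputed) p-value of record i, attribute j.
    For each alpha, records are ranked by how many of their p-values are
    <= alpha, and only the n prefixes of that ranking are scored.
    """
    m = len(pvals[0])
    best_score, best_rows, best_alpha = float("-inf"), [], None
    for alpha in alphas:
        counts = [sum(p <= alpha for p in row) for row in pvals]
        order = sorted(range(len(pvals)), key=lambda i: counts[i], reverse=True)
        n_alpha = 0
        for k, i in enumerate(order, start=1):
            n_alpha += counts[i]  # significant p-values in the top-k records
            score = hc_score(n_alpha, k * m, alpha)
            if score > best_score:
                best_score, best_rows, best_alpha = score, sorted(order[:k]), alpha
    return best_score, best_rows, best_alpha


def brute_force_best_score(pvals, alphas):
    """Exhaustive check over all 2^n - 1 non-empty record subsets (tiny n only)."""
    m = len(pvals[0])
    rows = range(len(pvals))
    best = float("-inf")
    for alpha in alphas:
        counts = [sum(p <= alpha for p in row) for row in pvals]
        for k in range(1, len(pvals) + 1):
            for subset in itertools.combinations(rows, k):
                n_alpha = sum(counts[i] for i in subset)
                best = max(best, hc_score(n_alpha, k * m, alpha))
    return best


# Synthetic data: 8 records x 4 attributes, with records 0-2 made anomalous
# by giving them small p-values. These numbers are made up for illustration.
rng = random.Random(0)
pvals = [[rng.random() for _ in range(4)] for _ in range(8)]
for i in range(3):
    pvals[i] = [0.05 * rng.random() for _ in range(4)]

ALPHAS = [0.01, 0.05, 0.1]  # hypothetical threshold grid
fast_score, fast_rows, fast_alpha = fast_best_subset(pvals, ALPHAS)
exhaustive_score = brute_force_best_score(pvals, ALPHAS)
```

On this toy input the prefix scan scores 8 subsets per threshold, while the exhaustive check scores 255, and both reach the same maximum. The method described in the abstract additionally searches over subsets of attributes, which this sketch leaves out.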
Keywords
exhaustive search, generalized subset, massive multivariate data set, general categorical data set, network intrusion detection, improved detection power, enabling FGSS, anomalous pattern detection, detection algorithm, pattern detection problem, high-dimensional data set, data record, anomaly detection, Bayesian networks, knowledge discovery