Bivariate probability-based anomaly detection

BESC(2014)

Abstract
Statistical techniques play a crucial role in anomaly detection. Although they are usually simple and can be trained without supervision, they face three challenges: parametric techniques usually rely on the assumption that the data follow a specific distribution; existing histogram-based techniques consider only individual attributes and therefore cannot capture interactions between attributes; and some statistical techniques still need labeled data for training or validation. To overcome these drawbacks, this paper proposes a different statistical method for assessing data instances. The proposed method, named the Bivariate Probability based Anomaly Score (BPAS) algorithm, builds an ensemble of Bivariate Probability (BP) models for a given data set, and each model calculates the probability distribution over combinations of intervals from two attributes. Anomalies are detected when they fall in low-probability combinations. The empirical evaluation shows that BPAS compares favorably with LOF, ORCA, and iForest on different types of real data sets in terms of AUC. Its performance is relatively stable when key parameters change. BPAS also performs well on categorical data sets and on data sets that contain only normal instances. Furthermore, it has a linear time complexity of O(n), which is much lower than that of distance-based and density-based methods. Thus BPAS has the potential to be an efficient anomaly detector for high-volume, high-dimensional databases.
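The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea it describes, not the authors' implementation. It assumes equal-width binning, one model per pair of attributes, and a score equal to the mean negative log-probability of an instance's interval combination; the function name `bpas_scores` and the parameter `n_bins` are hypothetical.

```python
import itertools

import numpy as np


def bpas_scores(X, n_bins=10):
    """Illustrative bivariate-probability anomaly score (assumed design).

    Higher scores mean the instance falls in rarer two-attribute cells.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape

    # Discretize each attribute into equal-width intervals (an assumption;
    # the abstract does not say how intervals are chosen).
    bins = np.empty((n, d), dtype=int)
    for j in range(d):
        lo, hi = X[:, j].min(), X[:, j].max()
        width = (hi - lo) / n_bins
        if width == 0:
            width = 1.0  # constant attribute: everything lands in bin 0
        idx = ((X[:, j] - lo) / width).astype(int)
        bins[:, j] = np.minimum(idx, n_bins - 1)

    # One bivariate model per attribute pair: the empirical probability of
    # each (interval_a, interval_b) combination.
    pairs = list(itertools.combinations(range(d), 2))
    scores = np.zeros(n)
    for a, b in pairs:
        cell = bins[:, a] * n_bins + bins[:, b]
        counts = np.bincount(cell, minlength=n_bins * n_bins)
        prob = counts[cell] / n  # never zero: each row occupies its own cell
        scores += -np.log(prob)

    # Average over the ensemble of pair models.
    return scores / len(pairs)
```

For a fixed number of attributes and bins, each pass over the data is linear in the number of instances, which is consistent with the O(n) claim. A point such as (0, 1) in a data set made up of many copies of (0, 0) and (1, 1) receives the highest score, even though each of its attribute values is common on its own; this is the kind of interaction a per-attribute histogram would miss.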
Keywords
computational complexity, database management systems, security of data, statistical analysis, statistical distributions, AUC, BPAS algorithm, LOF, ORCA, anomaly detector, bivariate probability based anomaly score algorithm, bivariate probability-based anomaly detection, categorical data sets, data instances, density-based methods, distance-based methods, high dimensional databases, histogram-based techniques, iForest, labeled data, linear time complexity, parametric techniques, probability distribution, statistic method, statistical techniques, Anomaly detection, BPAS, Bivariate Probability