SPiForest: An Anomaly Detecting Algorithm Using Space Partition Constructed by Probability Density-Based Inverse Sampling.

IEEE Transactions on Neural Networks and Learning Systems (2022)

Abstract
SPiForest, a new isolation-based approach to outlier detection, constructs iTrees on the space containing all attributes by probability density-based inverse sampling. Most existing iForest (iF)-based approaches can precisely and quickly detect outliers scattered around one or more normal clusters. However, the performance of these methods degrades severely on outliers whose "few and different" nature disappears in a subspace (e.g., anomalies surrounded by normal samples). SPiForest is proposed to solve this problem, and it differs from existing approaches in how it builds each tree. First, SPiForest uses principal component analysis (PCA) to find the principal components and estimates the probability density function (pdf) of each component. Second, SPiForest uses the inv-pdf, which is inversely proportional to the pdf estimated from the given dataset, to generate support points in the space containing all attributes. Third, the hyperplane determined by these support points is used to isolate the outliers in that space. These steps are then repeated to build an iTree. Finally, many iTrees form a forest for outlier detection. SPiForest provides two benefits: 1) it isolates outliers with fewer hyperplanes, which significantly improves accuracy, and 2) it effectively detects outliers whose "few and different" nature disappears in a subspace. Comparative analyses and experiments show that SPiForest achieves a significant improvement in area under the curve (AUC) over state-of-the-art methods. Specifically, our method improves AUC by up to 17.7% compared with iF-based algorithms.
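The three steps that build a single split can be illustrated with a minimal sketch. It is not the authors' implementation; the abstract does not give these details, so the following are assumptions made only for illustration: the pdf of each component is estimated with a histogram, a small smoothing constant keeps the inverse density finite, d support points are drawn in a d-dimensional space, and the hyperplane is taken to pass through all of them. The function names pca_project, inverse_pdf_sample, sample_hyperplane, and split are hypothetical.

```python
import numpy as np

def pca_project(X):
    """Center X and rotate it onto its principal components (assumes n >= d)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return (X - mean) @ vt.T

def inverse_pdf_sample(z, n_samples, rng, bins=16):
    """Draw values along one component with probability inversely
    proportional to a histogram estimate of its density (assumed estimator)."""
    counts, edges = np.histogram(z, bins=bins)
    density = counts / counts.sum()
    inv = 1.0 / (density + 1e-3)  # smoothing constant is an assumption
    inv /= inv.sum()
    idx = rng.choice(bins, size=n_samples, p=inv)
    # Sample uniformly inside each chosen bin.
    return rng.uniform(edges[idx], edges[idx + 1])

def sample_hyperplane(Z, rng):
    """Draw d support points (assumes d >= 2) and return them together with
    the unit normal of a hyperplane passing through all of them."""
    d = Z.shape[1]
    support = np.column_stack(
        [inverse_pdf_sample(Z[:, j], d, rng) for j in range(d)]
    )
    # The normal is orthogonal to every difference between support points,
    # i.e. it spans the null space of the (d-1) x d difference matrix.
    diffs = support[1:] - support[0]
    _, _, vt = np.linalg.svd(diffs, full_matrices=True)
    return support, vt[-1]

def split(Z, point, normal):
    """Partition the samples by which side of the hyperplane they fall on."""
    side = (Z - point) @ normal
    return Z[side < 0], Z[side >= 0]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Z = pca_project(X)
support, normal = sample_hyperplane(Z, rng)
left, right = split(Z, support[0], normal)
```

Because low-density regions receive more sampling weight, the support points tend to fall where few samples lie, which is the intuition behind isolating outliers with fewer hyperplanes. Repeating the split recursively on left and right would grow one tree under these assumptions.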
Keywords
Anomaly detection, data mining, isolation forest, outlier detector