SubtStream: Online subtractive stream clustering algorithm

Concurrency and Computation: Practice and Experience(2022)

Abstract
Real‐time stream data processing has gained importance with the rapid rise of big data in areas such as social media, finance, business, science, and bioinformatics. Stream data can be characterized as fast, unstable, and large. Because of these properties, it cannot be processed effectively with traditional algorithms. Like stream data processing itself, clustering is a difficult task. Nevertheless, researchers have attempted to cluster stream data by modifying traditional algorithms or designing new ones, and previous studies have relied on incremental methods. This paper highlights the need for an efficient real‐time clustering algorithm for data streams that handles high concept drift and adapts to data of different dimensions. The proposed clustering algorithm, SubtStream, combines decremental (subtractive property) and incremental (additivity property) strategies to overcome high drift. It also introduces a new dimension‐based approach to adapt to changes in dimension. To achieve adaptability, it uses three radius parameters: the Predefined User Parameter, the Proactive Adaptive Parameter, and the Reactive Adaptive Parameter. SubtStream showed better performance on synthetic and real data sets.
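The abstract does not give the algorithm's details, so the following is only an illustrative sketch of what "additivity" and "subtractive" properties can mean for a cluster summary. It assumes the classic cluster-feature representation (point count, per-dimension linear sum, per-dimension squared sum) known from algorithms such as BIRCH; it is not SubtStream itself. The class name `ClusterFeature` and its methods are hypothetical names chosen for this example.

```python
import math


class ClusterFeature:
    """Hypothetical cluster summary: count, linear sum, squared sum.

    Illustrative only; not the SubtStream data structure.
    """

    def __init__(self, dim):
        self.dim = dim
        self.n = 0                 # number of points summarized
        self.ls = [0.0] * dim      # per-dimension linear sum
        self.ss = [0.0] * dim      # per-dimension sum of squares

    def add(self, x):
        """Incremental (additive) update: absorb a new point."""
        self.n += 1
        for i, v in enumerate(x):
            self.ls[i] += v
            self.ss[i] += v * v

    def subtract(self, x):
        """Decremental (subtractive) update: forget a previously added point."""
        if self.n == 0:
            raise ValueError("cannot subtract from an empty summary")
        self.n -= 1
        for i, v in enumerate(x):
            self.ls[i] -= v
            self.ss[i] -= v * v

    def centroid(self):
        """Mean of the summarized points."""
        return [s / self.n for s in self.ls]

    def radius(self):
        """Root-mean-square distance of the summarized points from the centroid."""
        var = sum(
            self.ss[i] / self.n - (self.ls[i] / self.n) ** 2
            for i in range(self.dim)
        )
        return math.sqrt(max(var, 0.0))  # clamp tiny negative rounding errors


cf = ClusterFeature(dim=2)
for point in [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)]:
    cf.add(point)
cf.subtract((0.0, 0.0))  # drop the oldest point, e.g. after drift
```

Because both updates touch only the three summary statistics, a point can be added or removed in time proportional to the dimension without revisiting the raw stream. How SubtStream actually decides when to subtract, and how its three radius parameters are set, is not stated in the abstract.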
Keywords
online subtractive SubtStream, clustering, algorithm