Continuous Trend-Based Clustering in Data Streams

DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS (2008)

Abstract
Trend analysis of time series is an important problem, since trend identification enables prediction of the near future. In streaming time series, the problem is more challenging due to the dynamic nature of the data. In this paper, we propose a method to continuously cluster a number of streaming time series based on their trend characteristics. Each streaming time series is transformed into a vector by means of the Piecewise Linear Approximation (PLA) technique. The PLA vector comprises pairs of values (timestamp, trend), denoting the starting time of the trend and the type of the trend (either UP or DOWN), respectively. A distance metric for PLA vectors is introduced. We propose split and merge criteria to continuously update the clustering information. Moreover, the proposed method handles outliers. Performance evaluation results, based on real-life and synthetic data sets, show the efficiency and scalability of the proposed scheme.
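
To make the (timestamp, trend) representation concrete, the following Python sketch builds such a vector from a short series and compares two vectors. It is an illustration under stated assumptions, not the method of the paper: a trend change is detected here simply from the sign of consecutive differences, whereas the abstract does not say how the PLA segments are obtained, and the `disagreement` function is a hypothetical stand-in for the distance metric, which the abstract does not define. The names `trend_vector`, `trend_at` and `disagreement` are introduced only for this example.

```python
from typing import List, Optional, Tuple

UP, DOWN = "UP", "DOWN"
TrendVector = List[Tuple[int, str]]  # (start timestamp, trend type)


def trend_vector(values: List[float]) -> TrendVector:
    """Return one (start_index, trend) pair per maximal UP or DOWN run.

    Assumption: a trend is the sign of the difference between consecutive
    values. The paper's PLA procedure may segment the series differently.
    """
    vector: TrendVector = []
    for t in range(1, len(values)):
        delta = values[t] - values[t - 1]
        if delta == 0:
            continue  # flat step: the current trend is kept (an assumption)
        trend = UP if delta > 0 else DOWN
        if not vector or vector[-1][1] != trend:
            vector.append((t - 1, trend))  # a new trend starts at t - 1
    return vector


def trend_at(vector: TrendVector, t: int) -> Optional[str]:
    """Trend in effect at time t, or None before the first trend starts."""
    current = None
    for start, trend in vector:
        if start > t:
            break
        current = trend
    return current


def disagreement(a: TrendVector, b: TrendVector, horizon: int) -> float:
    """Hypothetical distance: fraction of timestamps in [0, horizon)
    at which the two series follow different trends."""
    if horizon <= 0:
        return 0.0
    differing = sum(1 for t in range(horizon) if trend_at(a, t) != trend_at(b, t))
    return differing / horizon


a = trend_vector([1, 2, 3, 2, 1, 2])
b = trend_vector([3, 2, 1, 2, 3, 4])
print(a)
print(b)
print(disagreement(a, b, horizon=5))
```

In a streaming setting, each vector would be extended incrementally as new values arrive rather than recomputed from scratch; the batch form above is kept only for brevity.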
Keywords
trend characteristic, time series, trend analysis, important problem, PLA vector, synthetic data set, proposed scheme, trend identification, clustering information, data streams, distance metric, synthetic data