SOHAC: efficient storage of tick data that supports search and analysis

ICDM'12: Proceedings of the 12th Industrial Conference on Advances in Data Mining: Applications and Theoretical Aspects (2012)

Abstract
Storage of tick data is a challenging problem because two criteria have to be fulfilled simultaneously: the storage structure should allow fast execution of queries, and the data should not occupy too much space on the hard disk or in main memory. In this paper, we present a clustering-based solution and introduce a new clustering algorithm, SOHAC, that is designed to support the storage of tick data. We evaluate our algorithm both on publicly available real-world datasets and on real-world tick data from the financial domain provided by one of the world's most renowned investment banks. In our experiments, we compare SOHAC against a large collection of conventional hierarchical clustering algorithms from the literature. The experiments show that our algorithm substantially outperforms the examined clustering algorithms on the tick data storage problem, in terms of both statistical significance and practical relevance.
Keywords
tick data storage problem, efficient storage, available real-world datasets, conventional hierarchical clustering algorithm, challenging problem, storage structure, tick data, clustering-based solution, real-world tick data, fast execution, new clustering algorithm