FastSO: A Fast Weighted Cardinality Estimation Algorithm

2023 3rd International Conference on Electronic Information Engineering and Computer (EIECT) (2023)

Abstract
Weighted cardinality estimation is widely used in network traffic monitoring and databases, for example to evaluate the number and size of duplicate packets in network traffic or to estimate the popularity and trend of keywords in a database. Most existing work relies on data sketches: a sketch of k buckets is built over a data stream of size n, and statistics of the buckets yield an unbiased estimate. As the data volume grows, the cost of evaluating hash functions in sketch-based estimators grows significantly, because each item requires k hash values, giving a time complexity of O(nk). This paper proposes a weighted cardinality estimation algorithm based on random optimization and parallel acceleration. In the hashing stage, the central limit theorem is used to randomly sample a bucket, so that each item needs only one sampling step and one hash computation, reducing the average time complexity to O(n). In the computation stage, multiple items are aggregated into a single unit, the loop is unrolled in a vectorized manner, and SIMD parallelization is used to process multiple items simultaneously. Experimental results show that the algorithm achieves a speedup of 5 to 100 times while maintaining accuracy relative to the baseline algorithm.
Keywords
Weighted cardinality estimation, Data sketch, Random optimization, Parallel acceleration
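
To make the O(nk) cost mentioned in the abstract concrete, the following is a minimal Python sketch of a conventional sketch-based weighted cardinality estimator, in which every item updates all k buckets. It is not the FastSO algorithm and is not taken from the paper: it assumes the well-known exponential-minimum construction, and the names `unit_hash`, `build_sketch`, and `estimate` are hypothetical helpers introduced only for illustration.

```python
import hashlib
import math


def unit_hash(item: str, j: int) -> float:
    """Deterministic pseudo-uniform value in (0, 1) for (item, bucket j)."""
    digest = hashlib.blake2b(f"{j}:{item}".encode(), digest_size=8).digest()
    top52 = int.from_bytes(digest, "big") >> 12  # keep 52 bits
    return (top52 + 0.5) / 2**52                 # strictly inside (0, 1)


def build_sketch(stream, k):
    """Baseline-style sketch (assumed construction, not the paper's).

    Every (item, weight) pair touches all k buckets, so the cost is
    O(n * k) hash evaluations for a stream of n items.
    """
    registers = [math.inf] * k
    for item, weight in stream:
        for j in range(k):
            # Exponential variable with rate `weight`; duplicates of the
            # same item reproduce the same value and change nothing.
            value = -math.log(unit_hash(item, j)) / weight
            if value < registers[j]:
                registers[j] = value
    return registers


def estimate(registers):
    """Unbiased estimate of the total weight of distinct items."""
    return (len(registers) - 1) / sum(registers)


if __name__ == "__main__":
    # 200 distinct items with weights 1..5; true weighted cardinality is 600.
    stream = [(f"item-{i}", 1 + i % 5) for i in range(200)]
    sketch = build_sketch(stream + stream, k=1024)  # duplicates are ignored
    print(round(estimate(sketch), 1))
```

Each bucket holds the minimum of independent exponential variables, so the sum of the k buckets concentrates around k divided by the true weighted cardinality, which is what `estimate` inverts. The inner loop over `range(k)` is the per-item cost that the abstract describes replacing with a single sampling step and a single hash computation.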