A Data Structure for Efficient File Deduplication in Cloud Storage

2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), 2020

Abstract
With the rapid development of the Internet, massive amounts of data need to be stored, which poses a significant challenge for cloud storage systems. Notably, this data contains many duplicate files or chunks that can be deduplicated to improve space efficiency. Many approximate set membership data structures, such as the Bloom Filter (BF) and the Cuckoo Filter (CF), have been used to accelerate the deduplication process. However, errors inevitably occur because these data structures store only summary information, and the error rate is directly related to the performance bottleneck of the deduplication system. To address these problems, we propose an advanced Cuckoo Filter named Split Position Aware Cuckoo Filter (SPACF), which noticeably decreases the error rate. We implement SPACF and compare it with other kinds of CFs and BFs; the experimental results show that the false positive rate of SPACF is around 50% of that of the Standard Cuckoo Filter and 10% of that of the Counting Bloom Filter.
Keywords
data deduplication, Cuckoo Filter, approximate set membership data structure, cloud storage
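
The abstract notes that filters such as the CF store only summary information, so membership queries can return false positives. As a rough illustration, the following is a minimal sketch of a standard cuckoo filter with partial-key cuckoo hashing. It is not the SPACF proposed in the paper, whose design is not described here. The class name, parameter values, the SHA-256-based hashing, and the helper `false_positive_rate` are all assumptions chosen for the example.

```python
import hashlib
import random


class CuckooFilter:
    """Illustrative standard cuckoo filter (NOT SPACF); stores fingerprints only."""

    def __init__(self, num_buckets=1024, bucket_size=4, fp_bits=8,
                 max_kicks=500, seed=0):
        # A power-of-two bucket count keeps the XOR-based alternate index in range
        # and makes it an involution: alt(alt(i)) == i.
        assert num_buckets & (num_buckets - 1) == 0
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        self.fp_bits = fp_bits
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self.rng = random.Random(seed)

    @staticmethod
    def _hash(data: bytes) -> int:
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint(self, item: bytes) -> int:
        # Only these few bits of the item are kept, which is why errors can occur.
        return self._hash(b"fp:" + item) & ((1 << self.fp_bits) - 1)

    def _index(self, item: bytes) -> int:
        return self._hash(item) % self.num_buckets

    def _alt_index(self, index: int, fp: int) -> int:
        return (index ^ self._hash(fp.to_bytes(4, "big"))) % self.num_buckets

    def insert(self, item: bytes) -> bool:
        fp = self._fingerprint(item)
        i1 = self._index(item)
        i2 = self._alt_index(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both candidate buckets are full: evict a resident fingerprint and
        # relocate it to its alternate bucket, repeating up to max_kicks times.
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self.rng.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # filter considered full; the last evicted fingerprint is lost

    def contains(self, item: bytes) -> bool:
        fp = self._fingerprint(item)
        i1 = self._index(item)
        i2 = self._alt_index(i1, fp)
        return fp in self.buckets[i1] or fp in self.buckets[i2]


def false_positive_rate(cf: CuckooFilter, absent_items) -> float:
    """Fraction of never-inserted items that the filter wrongly reports as present."""
    absent_items = list(absent_items)
    hits = sum(cf.contains(x) for x in absent_items)
    return hits / len(absent_items)


# Hypothetical usage: chunk identifiers stand in for chunk digests.
demo = CuckooFilter()
for n in range(1000):
    demo.insert(f"chunk-{n}".encode())
probes = (f"other-{n}".encode() for n in range(10000))
print("measured false positive rate:", false_positive_rate(demo, probes))
```

In a deduplication setting, a positive answer from such a filter would typically still need to be confirmed against the full chunk index, which is one reason the false positive rate affects overall throughput.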