MirrorKV: An Efficient Key-Value Store on Hybrid Cloud Storage with Balanced Performance of Compaction and Querying

Proceedings of the ACM on Management of Data (2023)

Abstract
LSM-based key-value stores serve as storage engines in many state-of-the-art data-intensive applications. As data volume scales up, a cost-efficient approach is to deploy these applications on hybrid cloud storage with hot/cold separation, which splits the LSM-tree into two parts and thus raises new challenges: how to split the tree and how to close the significant performance gap between the two parts. Existing LSM-tree key-value stores mainly focus on optimizations for local storage and thus deliver sub-optimal performance when directly applied to hybrid storage. In this paper, we present MirrorKV for efficient compaction and querying on hybrid cloud storage. First, based on the capacities of fast and slow cloud storage, MirrorKV vertically separates hot and cold data by placing different levels on different cloud storage tiers, each with its own compaction mechanism. To keep compaction in slow storage from becoming the bottleneck of the write path, MirrorKV proposes a novel virtual split that compacts only metadata, postponing the actual data compaction until it reaches sufficiently deep levels. Second, to reduce accesses to slow storage during queries, MirrorKV horizontally separates keys and values into two mirrored LSM-trees to differentiate their caching priorities; the maintained tree structures preserve data locality for efficient sequential reads without incurring the overhead of traditional key-value separation schemes. Finally, MirrorKV leverages cached data to guide compaction so that hot data is retained in fast storage while cold data is compacted to deeper levels in slow storage. Compared with RocksDB-cloud, MirrorKV achieves 2.4× higher random insertion throughput, 29% higher random read throughput, and 99% less compaction time.
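
For illustration only, the following minimal Python sketch captures the "virtual split" idea described above: at levels backed by slow storage, a compaction re-assigns an SSTable's key range to the next level as a metadata-only operation, and the physical merge is deferred until the data reaches a sufficiently deep level. All identifiers here (SSTable, VirtualLSM, DEFER_UNTIL_LEVEL, compact) are hypothetical and are not taken from the MirrorKV implementation.

from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumed threshold: levels shallower than this only receive metadata-only
# ("virtual") compaction; deeper levels pay the cost of a physical rewrite.
DEFER_UNTIL_LEVEL = 4


@dataclass
class SSTable:
    file_id: int
    min_key: str
    max_key: str
    # A virtually split table keeps pointing at its parent's physical file;
    # only the visible key range (metadata) changes.
    physical_file: Optional[int] = None


@dataclass
class VirtualLSM:
    levels: Dict[int, List[SSTable]] = field(
        default_factory=lambda: {i: [] for i in range(7)})

    def virtual_compact(self, level: int, table: SSTable, split_key: str) -> None:
        # Metadata-only compaction: split the table's key range at split_key and
        # move both halves to the next level without rewriting any data.
        self.levels[level].remove(table)
        shared = table.physical_file if table.physical_file is not None else table.file_id
        self.levels[level + 1].append(SSTable(table.file_id, table.min_key, split_key, shared))
        self.levels[level + 1].append(SSTable(table.file_id, split_key, table.max_key, shared))

    def physical_compact(self, level: int, table: SSTable) -> None:
        # Placeholder for the real merge-and-rewrite of overlapping files in slow storage.
        print(f"physically merging file {table.file_id} into level {level + 1}")

    def compact(self, level: int, table: SSTable, split_key: str) -> None:
        # Defer the expensive rewrite on slow storage until the data is deep enough.
        if level + 1 < DEFER_UNTIL_LEVEL:
            self.virtual_compact(level, table, split_key)
        else:
            self.physical_compact(level, table)

In the actual system, the split points and the depth at which physical compaction finally happens would be driven by the capacities of the fast and slow storage tiers and by the cache-guided hot/cold classification described in the abstract; the sketch only shows the deferral mechanism itself.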
Keywords
cloud storage, key-value stores