BloomStream: Data Temperature Identification for Flash Based Memory Storage Using Bloom Filters

2018 IEEE 11th International Conference on Cloud Computing (CLOUD) (2018)

Abstract
Data temperature identification is an important issue in many fields, such as data caching and storage tiering, in modern flash-based storage systems. With the technological advancement of memory and storage, data temperature identification is no longer just a classification of hot and cold, but has instead become a "multistreaming" data categorization problem: classifying data into multiple categories according to their temperature. Therefore, we propose a novel data temperature identification scheme that adopts Bloom filters to efficiently capture both the frequency and recency of data blocks and to accurately identify the exact data temperature of each data block. Moreover, in the Bloom filter data structure, we replace the original OR operation with an XOR masking operation so that our scheme can delete or reset bits in Bloom filters and thus avoid the high false positive rates caused by saturation. We further utilize twin Bloom filters to alternately keep unmasked clean copies of data and thus ensure a low false negative rate. Our extensive evaluation results show that our new scheme can accurately identify the exact data temperature with low false identification rates across different synthetic and real I/O workloads. More importantly, our scheme consumes less memory than other existing data temperature identification schemes.
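The abstract does not spell out how the XOR masking works, so the following is only a minimal Python sketch of the general idea, not the authors' algorithm. The class name `XorBloomFilter`, the method `xor_reset`, the SHA-256 based hashing, and the filter sizes are all assumptions made for illustration. It shows that XOR-ing a key's bit pattern back out of a filter clears bits (something plain OR insertion cannot do), and that clearing a bit shared with another key produces a false negative, which is the kind of problem a second, unmasked copy would be meant to guard against.

```python
import hashlib


class XorBloomFilter:
    """Toy Bloom filter stored as a Python int bitset (illustrative only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _mask(self, key):
        # Build an int with one bit set per hash position for this key.
        mask = 0
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            pos = int.from_bytes(digest[:8], "big") % self.num_bits
            mask |= 1 << pos
        return mask

    def add(self, key):
        # Standard Bloom filter insertion: OR the key's bits in.
        self.bits |= self._mask(key)

    def contains(self, key):
        mask = self._mask(key)
        return (self.bits & mask) == mask

    def xor_reset(self, key):
        # Assumed reading of "XOR masking": XOR only the bits that are
        # currently set, so the operation clears bits and never sets them.
        self.bits ^= self.bits & self._mask(key)


# Resetting a key removes it from the filter.
f = XorBloomFilter()
f.add("lba:42")
f.xor_reset("lba:42")

# With a 1-bit filter every key collides, so resetting one key
# also hides the other: a false negative.
tiny = XorBloomFilter(num_bits=1, num_hashes=1)
tiny.add("a")
tiny.add("b")
tiny.xor_reset("a")
```

In this sketch the false negative in `tiny` arises because both keys map to the same bit. How the paper's twin filters alternate and when they are masked is not described in the abstract, so that part is left out here.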
Keywords
Data Temperature, Bloom Filters, Stream Identification, Flash Memory, Multi-stream SSDs, Caching, Tiering