One, Two, Hash! Counting Hash Tables for Flash Devices

CODS '14: Proceedings of the 1st IKDD Conference on Data Sciences (2014)

Abstract
In recent years, advances in hardware technology have led to the increasingly widespread use of flash storage devices. Such devices have clear benefits over traditional hard drives in terms of access latency, bandwidth, and random access capabilities, particularly when reading data. However, there are some interesting tradeoffs. On a relative scale, writing to such devices can be expensive. This is because typical flash devices (NAND technology) are updated in blocks. A minor update to a given block requires the entire block to be erased (also referred to as cleaned) and then rewritten. Sequential writes, on the other hand, can be two orders of magnitude faster than random writes. In addition, random writes shorten the life of the flash drive because each block can support only a limited number of cleaning operations. Hash tables are a particularly challenging case for flash drives because this data structure inherently depends on the randomness of the hash function rather than on the spatial locality of the data, so it is difficult to avoid random writes. In this paper, we study the design landscape for hash tables on flash storage devices. We demonstrate the design tradeoffs using hash tables built on two related hash functions, one of which exhibits a data placement property with respect to the other. Specifically, we focus on three designs based on this general philosophy and evaluate the tradeoffs among them along the axes of query performance, insert and update times, and I/O time.
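The abstract does not spell out the two hash functions or the three designs, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes a fine-grained hash `h_slot` and a coarse hash `h_block` derived from it, so that every key with the same `h_block` value lands in the same flash block. Updates are buffered in RAM grouped by `h_block` and flushed one block at a time, which turns many random writes into one rewrite per dirty block. All names, sizes, and the simulated flash layout are hypothetical.

```python
import hashlib
from collections import defaultdict

NUM_BLOCKS = 8        # hypothetical: number of flash blocks
SLOTS_PER_BLOCK = 4   # hypothetical: hash slots stored in one block


def h_slot(key: str) -> int:
    """Fine-grained hash: the slot a key maps to on the device."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % (NUM_BLOCKS * SLOTS_PER_BLOCK)


def h_block(key: str) -> int:
    """Coarse hash derived from h_slot: the block that contains that slot."""
    return h_slot(key) // SLOTS_PER_BLOCK


class BufferedCountingTable:
    """Toy counting table: RAM buffer grouped by h_block, flash is simulated."""

    def __init__(self):
        # block id -> key -> pending count (held in RAM)
        self.buffer = defaultdict(lambda: defaultdict(int))
        # Each simulated block maps slot -> {key: count} (chaining per slot).
        self.flash = [defaultdict(dict) for _ in range(NUM_BLOCKS)]
        self.block_writes = 0

    def increment(self, key: str, amount: int = 1) -> None:
        # Updates touch only RAM until the next flush.
        self.buffer[h_block(key)][key] += amount

    def flush(self) -> None:
        # Apply all pending updates for a block together.
        for block_id, pending in self.buffer.items():
            block = self.flash[block_id]
            for key, delta in pending.items():
                chain = block[h_slot(key)]
                chain[key] = chain.get(key, 0) + delta
            self.block_writes += 1  # one erase-and-rewrite per dirty block
        self.buffer.clear()

    def count(self, key: str) -> int:
        # A query combines the flushed count with any pending count in RAM.
        on_flash = self.flash[h_block(key)].get(h_slot(key), {}).get(key, 0)
        pending = self.buffer.get(h_block(key), {}).get(key, 0)
        return on_flash + pending
```

In this sketch the cost of a flush is bounded by the number of distinct blocks touched rather than by the number of updates, at the price of queries having to consult both the RAM buffer and the simulated flash.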