Quick Access to Compressed Data in Storage Systems

2016 Data Compression Conference (DCC), 2016

Abstract
Summary form only given. Primary storage systems that compress data in real time use some form of on-disk metadata to perform the virtualization needed to store compressed data. Usually this metadata takes the form of B-trees (eventually compressed) stored on disk. For random accesses to compressed data where the metadata is not in cache, this additional layer significantly slows down random reads and writes. Our solution is to use much less metadata, which provides only an approximation of the location of compressed data on disk and can easily be kept in the memory of the storage system. Read operations are extended to compensate for the imprecise position information in the metadata, and index marks embedded in the data are used to locate the required data within the expanded read. The placement of written data is constrained so that the reduced metadata can describe it. The placement uses a piecewise linear scheme based on the locality in compressibility of data, and we support this assumption with experiments.
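The read path described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the block size, segment length, index-mark layout, and all function names below are assumptions chosen for illustration, and zlib stands in for whatever compressor a real system would use. The sketch also measures the error window of each linear segment after writing, whereas the abstract describes constraining placement so that the data fits the reduced metadata.

```python
import struct
import zlib

BLOCK = 4096                      # logical block size in bytes (assumed)
SEGMENT = 8                       # logical blocks per linear segment (assumed)
MAGIC = b"\xfeIDX"                # hypothetical index-mark prefix
HEADER = struct.Struct(">4sII")   # magic, logical block number, payload length
MAX_RECORD = HEADER.size + BLOCK + 64   # generous bound on one stored record


def write_blocks(blocks):
    """Store compressed blocks back to back, each preceded by an index mark.

    Returns the simulated disk image and the exact start offset of every
    record. The exact offsets are used only to derive the small metadata.
    """
    disk = bytearray()
    offsets = []
    for number, data in enumerate(blocks):
        payload = zlib.compress(data)
        offsets.append(len(disk))
        disk += HEADER.pack(MAGIC, number, len(payload)) + payload
    return bytes(disk), offsets


def build_metadata(offsets, disk_size):
    """Keep one (start, slope, error) triple per segment of SEGMENT blocks."""
    meta = []
    for first in range(0, len(offsets), SEGMENT):
        last = min(first + SEGMENT, len(offsets))
        start = offsets[first]
        end = offsets[last] if last < len(offsets) else disk_size
        slope = (end - start) / (last - first)   # average bytes per block
        error = max(abs(offsets[i] - (start + slope * (i - first)))
                    for i in range(first, last))
        meta.append((start, slope, int(error) + 1))
    return meta


def read_block(disk, meta, number):
    """Read one logical block using only the approximate metadata."""
    start, slope, error = meta[number // SEGMENT]
    guess = int(start + slope * (number % SEGMENT))
    lo = max(0, guess - error)
    hi = min(len(disk), guess + error + MAX_RECORD)
    window = disk[lo:hi]                          # the single extended read
    mark = MAGIC + struct.pack(">I", number)
    pos = window.find(mark)
    while pos != -1:
        body = pos + HEADER.size
        if body <= len(window):
            length = HEADER.unpack_from(window, pos)[2]
            try:
                return zlib.decompress(window[body:body + length])
            except zlib.error:
                pass  # chance match inside compressed data; keep scanning
        pos = window.find(mark, pos + 1)
    raise LookupError(f"index mark for block {number} not found")
```

In this sketch the in-memory metadata grows with the number of segments rather than the number of blocks, and each read pays for it with a window that is wider than the record by about twice the segment's error. The more uniform the compressibility within a segment, the smaller that window becomes.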
Keywords
data compression, storage systems, MIPS, disk metadata, physical disk space, processor speeds, random reads, random writes, data placement, piecewise linear scheme, error window, compressed data writing