Analysis of HDFS under HBase: A Facebook Messages Case Study.

FAST'14: Proceedings of the 12th USENIX Conference on File and Storage Technologies (2014)

Cited by 227 | Views 463
Abstract
We present a multilayer study of the Facebook Messages stack, which is based on HBase and HDFS. We collect and analyze HDFS traces to identify potential improvements, which we then evaluate via simulation. Messages represents a new HDFS workload: whereas HDFS was built to store very large files and receive mostly-sequential I/O, 90% of files are smaller than 15MB and I/O is highly random. We find hot data is too large to easily fit in RAM and cold data is too large to easily fit in flash; however, cost simulations show that adding a small flash tier improves performance more than equivalent spending on RAM or disks. HBase's layered design offers simplicity, but at the cost of performance; our simulations show that network I/O can be halved if compaction bypasses the replication layer. Finally, although Messages is read-dominated, several features of the stack (i.e., logging, compaction, replication, and caching) amplify write I/O, causing writes to dominate disk I/O.
Keywords
new HDFS workload, large file, Facebook Messages, cold data, compaction bypass, cost simulation, hot data, replication layer, small flash tier, equivalent spending, facebook messages case study