PABIRS: A data access middleware for distributed file systems

ICDE (2015)

Abstract
Various big data management systems have emerged to handle different types of applications, which place very different demands on the storage, indexing, and retrieval of large amounts of data on distributed file systems. This diversity of demands poses major challenges for the design of a new generation of data access services for big data. In this paper, we present PABIRS, a unified data access middleware that supports mixed workloads. PABIRS encapsulates the underlying distributed file system (DFS) and provides a unified access interface to systems such as MapReduce and key-value stores. PABIRS improves efficiency substantially by employing a novel hybrid indexing scheme: based on the data distribution, the scheme adaptively builds a bitmap index and a Log-Structured Merge Tree (LSM) index. Moreover, PABIRS distributes the computation across multiple index nodes and uses a Pregel-based algorithm to facilitate parallel data search and retrieval. We empirically evaluate PABIRS against existing distributed data processing systems on big data comprising real-life phone logs and the TPC-H benchmark, and the results confirm its advantages: shorter response time, higher throughput, and better scalability.
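
To make the idea of a hybrid index concrete, the following is a minimal sketch, not the PABIRS implementation. It assumes a hypothetical selection rule (keys that occur in at least `dense_threshold` blocks go to a bitmap, all other keys go to a key-sorted structure), and a plain dictionary stands in for the LSM index. The function names, the threshold parameter, and the direction of the rule are illustrative assumptions; the abstract states only that the choice depends on the data distribution.

```python
from collections import Counter, defaultdict


def build_hybrid_index(blocks, dense_threshold=2):
    """Build a toy hybrid index over `blocks`, a list of lists of keys.

    Assumption (not from the paper): a key found in at least
    `dense_threshold` blocks is stored as a bitmap over block ids;
    any other key is stored as explicit (block_id, offset) entries.
    """
    # Number of distinct blocks in which each key appears.
    block_freq = Counter(key for block in blocks for key in set(block))

    bitmap_index = {}                 # key -> int used as a bitset of block ids
    entry_index = defaultdict(list)   # key -> [(block_id, offset)]; LSM stand-in

    for block_id, block in enumerate(blocks):
        for offset, key in enumerate(block):
            if block_freq[key] >= dense_threshold:
                bitmap_index[key] = bitmap_index.get(key, 0) | (1 << block_id)
            else:
                entry_index[key].append((block_id, offset))
    return bitmap_index, dict(entry_index)


def lookup_blocks(key, bitmap_index, entry_index, num_blocks):
    """Return the sorted ids of the blocks that contain `key`."""
    if key in bitmap_index:
        bits = bitmap_index[key]
        return [b for b in range(num_blocks) if (bits >> b) & 1]
    return sorted({block_id for block_id, _ in entry_index.get(key, [])})
```

In this sketch a lookup first consults the bitmap and falls back to the entry structure, so a caller needs to read only the blocks whose ids are returned. A real LSM index would additionally buffer writes in memory and merge sorted runs on disk, which is omitted here.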
Keywords
cloud computing, distributed databases, middleware, LSM index, PABIRS, Pregel-based algorithm, big data management system, bitmap index, data access middleware, distributed file systems, log structured merge tree index