Multi-Dimensional Index Over A Key-Value Store For Semi-Structured Data

BIG SCIENTIFIC DATA MANAGEMENT (2019)

Abstract
Loosely defined schemas and data volumes in the trillions of records make it challenging for databases to store and retrieve semi-structured data. Most prior work addresses these issues with R-trees, KD-trees, and space-filling curves, but these structures are poorly suited to the default and discrete values found in semi-structured data, and some even require sampling before storage. We present MD-Index, a scalable multi-dimensional indexing system that supports high-throughput, real-time range queries. MD-Index builds a bitmap index over sliced data on top of a range-partitioned key-value store. The underlying key-value store provides high throughput, large storage capacity, high availability, and fault tolerance, while the bitmaps provide the multi-dimensional index over the data. In addition, MD-Index encodes the discrete values as the hash code of a slice, and stores the data and the bitmap of a slice in the same region (a storage unit of the range-partitioned key-value store) to exploit distributed computing and data locality. Our prototype of MD-Index is built on HBase, a widely used key-value database. Experimental results show that MD-Index can store and retrieve trillions of semi-structured records and achieves a throughput of two million records per second.
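The abstract's core idea (hash the discrete values into a slice key, then keep each slice's rows next to a bitmap index over the remaining dimensions) can be illustrated with a small in-memory sketch. This is not the authors' implementation: the names `slice_key`, `Slice`, and `ToyMDIndex` are hypothetical, CRC32 stands in for whatever hash the paper uses, Python integers stand in for compressed bitmaps, and a dictionary stands in for HBase regions.

```python
import zlib
from collections import defaultdict


def slice_key(discrete_values):
    """Hash the discrete attribute values into a stable slice identifier.

    Assumption: CRC32 is used here only for illustration.
    """
    joined = "\x00".join(f"{k}={v}" for k, v in sorted(discrete_values.items()))
    return zlib.crc32(joined.encode("utf-8"))


class Slice:
    """Rows plus their bitmap index, kept together to mimic data locality."""

    def __init__(self):
        self.rows = []
        # (dimension, value) -> int used as a bitset; bit i marks rows[i].
        self.bitmaps = defaultdict(int)

    def add(self, numeric_values):
        position = len(self.rows)
        self.rows.append(numeric_values)
        for dim, value in numeric_values.items():
            self.bitmaps[(dim, value)] |= 1 << position

    def range_query(self, ranges):
        """Return rows whose values fall inside every (low, high) range."""
        result = (1 << len(self.rows)) - 1  # start with all rows selected
        for dim, (low, high) in ranges.items():
            dim_bits = 0
            for (d, value), bits in self.bitmaps.items():
                if d == dim and low <= value <= high:
                    dim_bits |= bits  # OR the bitmaps within one dimension
            result &= dim_bits  # AND the results across dimensions
        return [row for i, row in enumerate(self.rows) if (result >> i) & 1]


class ToyMDIndex:
    """Hypothetical stand-in for a slice-per-region layout."""

    def __init__(self):
        self.slices = defaultdict(Slice)

    def put(self, discrete_values, numeric_values):
        self.slices[slice_key(discrete_values)].add(numeric_values)

    def query(self, discrete_values, ranges):
        key = slice_key(discrete_values)
        if key not in self.slices:
            return []
        return self.slices[key].range_query(ranges)
```

In this sketch a row that lacks a dimension simply sets no bit for it, so it is excluded from any range query on that dimension without needing a placeholder coordinate, which is one plausible reading of why bitmaps suit default values better than spatial trees.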
Keywords
Multi-dimensional index, Key-value store, Bitmap index, Semi-structured data