Robust and Efficient Sorting with Offset-Value Coding

arxiv(2022)

引用 4|浏览20
暂无评分
摘要
Sorting and searching are large parts of database query processing, e.g., in the forms of index creation, index maintenance, and index lookup; and comparing pairs of keys is a substantial part of the effort in sorting and searching. We have worked on simple, efficient implementations of decades-old, neglected, effective techniques for fast comparisons and fast sorting, in particular offset-value coding. In the process, we happened upon its mutually beneficial relationship with prefix truncation in run files as well as the duality of compression techniques in row- and column-format storage structures, namely prefix truncation and run-length encoding of leading key columns. We also found a beneficial relationship with consumers of sorted streams, e.g., merging parallel streams, in-stream aggregation, and merge join. We report on our implementation in the context of Google's Napa and F1 Query systems as well as an experimental evaluation of performance and scalability.
更多
查看译文
关键词
Sorting,merging,grouping,duplicate removal,offset-value coding,priority queue,tree-of-losers
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要