P-QALSH+: Exploiting Multiple Cores to Parallelize Query-Aware Locality-Sensitive Hashing on Big Data.

Yikai Huang, Zezhao Hu, Jianlin Feng

Asia-Pacific Web Conference (2022)

Abstract
Approximate nearest neighbor (ANN) search in high-dimensional Euclidean space is a fundamental problem in big data processing. Locality-Sensitive Hashing (LSH) is a popular scheme for solving the ANN search problem. In the index phase, an LSH scheme builds multiple hash tables; in the query phase, it exploits these prebuilt hash tables to speed up the ANN search. Query-Aware LSH (QALSH), a state-of-the-art LSH scheme, offers a rigorous theoretical guarantee on query accuracy but suffers from high time overhead in both the index and query phases. To improve query efficiency, a multi-core parallel QALSH scheme called P-QALSH was proposed, which is optimized mainly for the query phase. In this paper, we extend P-QALSH to P-QALSH+, which parallelizes QALSH in both the index and query phases on multiple cores. Specifically, we first propose a Parallel Table Design to accelerate index construction. Then, following P-QALSH, we exploit a novel K-Counter Parallel Counting Technology and a novel Search Radius Estimation Strategy to improve query performance. We have performed extensive experiments on a 16-core machine using six real-world datasets and eight synthetic datasets. The experimental results demonstrate the superiority of P-QALSH+ in parallel computing efficiency: compared to QALSH, P-QALSH+ is 10-12X faster in index construction, achieves a 6-8X speedup in query search, and noticeably improves query accuracy.
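To make the index and query phases concrete, the following is a minimal sketch of a query-aware, collision-counting LSH in plain Python. It is not the authors' implementation and does not reproduce the Parallel Table Design, the K-Counter technique, or the Search Radius Estimation Strategy named above. The names `build_table`, `build_index`, `query`, `width`, and `min_collisions` are hypothetical, and the parameter values are arbitrary. The sketch only illustrates that each hash table can be built independently (which is what makes the index phase a natural target for parallelization) and that a query-aware scheme centers its bucket on the query's own projection. Threads are used here only to show the structure; in CPython they would not give a real multi-core speedup for this workload.

```python
import bisect
import math
import random
from concurrent.futures import ThreadPoolExecutor


def build_table(points, direction):
    """One hash table: (projection, point id) pairs sorted by projection."""
    return sorted(
        (sum(a * x for a, x in zip(direction, p)), i)
        for i, p in enumerate(points)
    )


def build_index(points, num_tables, seed=0, workers=4):
    """Index phase: tables are independent, so they are built concurrently."""
    rng = random.Random(seed)
    dim = len(points[0])
    # One random Gaussian projection direction per table (assumed hash family).
    directions = [
        [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_tables)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        tables = list(pool.map(lambda d: build_table(points, d), directions))
    return directions, tables


def query(points, index, q, width, min_collisions):
    """Query phase: count collisions in a query-centered window, then verify."""
    directions, tables = index
    counts = {}
    for direction, table in zip(directions, tables):
        center = sum(a * x for a, x in zip(direction, q))
        # The bucket is the interval [center - width/2, center + width/2].
        lo = bisect.bisect_left(table, (center - width / 2, -1))
        hi = bisect.bisect_right(table, (center + width / 2, len(points)))
        for _, i in table[lo:hi]:
            counts[i] = counts.get(i, 0) + 1
    # Only points that collide with the query often enough are verified.
    candidates = [i for i, c in counts.items() if c >= min_collisions]
    if not candidates:
        return None
    return min(candidates, key=lambda i: math.dist(points[i], q))
```

A possible usage, again with arbitrary values: build the index once with `index = build_index(points, num_tables=10)`, then call `query(points, index, q, width=1.0, min_collisions=5)` for each query vector `q`. The function returns the id of the closest verified candidate, or `None` if no point reaches the collision threshold.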