Distance Comparison Operators for Approximate Nearest Neighbor Search: Exploration and Benchmark
arxiv(2024)
摘要
Approximate nearest neighbor search (ANNS) on high-dimensional vectors has
become a fundamental and essential component in various machine learning tasks.
Prior research has shown that the distance comparison operation is the
bottleneck of ANNS, which determines the query and indexing performance. To
overcome this challenge, some novel methods have been proposed recently. The
basic idea is to estimate the actual distance with fewer calculations, at the
cost of accuracy loss. Inspired by this, we also propose that some classical
techniques and deep learning models can also be adapted to this purpose. In
this paper, we systematically categorize the techniques that have been or can
be used to accelerate distance approximation. And to help the users understand
the pros and cons of different techniques, we design a fair and comprehensive
benchmark, Fudist implements these techniques with the same base index and
evaluates them on 16 real datasets with several evaluation metrics. Designed as
an independent and portable library, Fudist is orthogonal to the specific index
structure and thus can be easily utilized in the current ANNS library to
achieve significant improvements.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要