Query Rewriting via Large Language Models
arXiv (2024)
Abstract
Query rewriting is one of the most effective techniques for coping with
poorly written queries before passing them down to the query optimizer. Manual
rewriting is not scalable, as it is error-prone and requires deep expertise.
Similarly, traditional query rewriting algorithms can only handle a small
subset of queries: rule-based techniques do not generalize to new query
patterns and synthesis-based techniques cannot handle complex queries.
Fortunately, the rise of Large Language Models (LLMs), equipped with broad
general knowledge and advanced reasoning capabilities, has created hopes for
solving some of these previously open problems.
In this paper, we present GenRewrite, the first holistic system that
leverages LLMs for query rewriting. We introduce the notion of Natural Language
Rewrite Rules (NLR2s) and use them both as hints to the LLM and as a means of
transferring knowledge from rewriting one query to another, so that the system
becomes smarter and more effective over time. We present a novel counterexample-guided
technique that iteratively corrects the syntactic and semantic errors in the
rewritten query, significantly reducing the LLM costs and the manual effort
required for verification. GenRewrite speeds up 22 out of 99 TPC queries (the
most complex public benchmark) by more than 2x, which is 2.5x–3.2x higher
coverage than state-of-the-art traditional query rewriting and 2.1x higher than
the out-of-the-box LLM baseline.
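
The counterexample-guided correction described above can be illustrated with a minimal sketch. This is not GenRewrite's implementation, whose details the abstract does not give. It assumes that a candidate rewrite is run against a small sample database, and that any syntax error or result mismatch is fed back to the rewriter as a hint for the next attempt. The names `repair_loop`, `propose`, `make_stub_proposer`, and `make_demo_db` are hypothetical, SQLite stands in for the target database, and a canned list of candidates stands in for the LLM.

```python
import sqlite3
from typing import Callable, List, Optional, Tuple

# Hypothetical rewriter signature: (original_sql, feedback_or_None) -> candidate_sql
Proposer = Callable[[str, Optional[str]], str]


def run(conn: sqlite3.Connection, sql: str) -> list:
    """Execute sql and return its rows in a canonical (sorted) order."""
    return sorted(conn.execute(sql).fetchall(), key=repr)


def repair_loop(conn: sqlite3.Connection, original: str,
                propose: Proposer, max_rounds: int = 3) -> Optional[str]:
    """Ask for rewrites until one matches the original on the sample data."""
    expected = run(conn, original)
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        candidate = propose(original, feedback)
        try:
            actual = run(conn, candidate)
        except sqlite3.Error as exc:
            # Syntactic (or binding) error: report it and try again.
            feedback = f"syntax error: {exc}"
            continue
        if actual == expected:
            return candidate
        # Semantic error: the differing results act as a counterexample.
        feedback = f"counterexample: expected {expected}, got {actual}"
    return None  # no acceptable rewrite within the round budget


def make_stub_proposer(candidates: List[str]) -> Tuple[Proposer, list]:
    """Stand-in for an LLM call: replays canned candidates, records feedback."""
    remaining = iter(candidates)
    seen_feedback: list = []

    def propose(original: str, feedback: Optional[str]) -> str:
        seen_feedback.append(feedback)
        return next(remaining)

    return propose, seen_feedback


def make_demo_db() -> sqlite3.Connection:
    """Tiny in-memory database used only for this illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, amount INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)",
                     [(1, 5), (2, 10), (3, 20)])
    return conn
```

Matching results on one sample database only show that the rewrite agrees with the original on that data; they do not prove equivalence, so a rewrite accepted by such a loop would still need further verification.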