SQL-Encoder: Improving NL2SQL In-Context Learning Through a Context-Aware Encoder

Mohammadreza Pourreza, Davood Rafiei, Yuxi Feng, Raymond Li, Zhenan Fan, Weiwei Zhang

arXiv (2024)

Abstract
Detecting structural similarity between queries is essential for selecting examples in in-context learning models. However, assessing structural similarity based solely on the natural language expressions of queries, without access to the corresponding SQL queries, presents a significant challenge. This paper explores the significance of this similarity metric and proposes a model for accurately estimating it. To achieve this, we leverage a dataset comprising 170k question pairs, meticulously curated to train a similarity prediction model. Our comprehensive evaluation demonstrates that the proposed model adeptly captures the structural similarity between questions, as evidenced by improvements in Kendall-Tau distance and precision@k metrics. Notably, our model outperforms strong embedding models from OpenAI and Cohere. Furthermore, compared to these models, our proposed encoder enhances the downstream performance of NL2SQL models in 1-shot in-context learning scenarios by 1-2% for GPT-3.5-turbo, 4-8% for CodeLlama-7B, and 2-3% for CodeLlama-13B.
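As a rough illustration of the example-selection setup described in the abstract, the sketch below ranks candidate (question, SQL) pairs by cosine similarity between question embeddings and picks the top-1 as the 1-shot demonstration. The encoder itself is not shown; the random vectors and the `select_top_k` helper are placeholders standing in for the paper's SQL-Encoder outputs, not its actual implementation.

```python
# Minimal sketch, assuming a question encoder that returns fixed-size vectors
# (here simulated with random embeddings). The paper's SQL-Encoder would
# replace the placeholder embeddings; everything else is generic retrieval.
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def select_top_k(test_emb, candidate_embs, candidates, k=1):
    """Return the k candidate examples whose embeddings are closest
    to the test question's embedding."""
    scores = [cosine_similarity(test_emb, e) for e in candidate_embs]
    order = np.argsort(scores)[::-1][:k]
    return [candidates[i] for i in order]


# Toy usage: random vectors stand in for encoder outputs.
rng = np.random.default_rng(0)
pool = [{"question": f"q{i}", "sql": f"SELECT ...  -- example {i}"} for i in range(5)]
pool_embs = [rng.normal(size=128) for _ in pool]
test_emb = rng.normal(size=128)

one_shot = select_top_k(test_emb, pool_embs, pool, k=1)
prompt = (
    f"Example question: {one_shot[0]['question']}\n"
    f"Example SQL: {one_shot[0]['sql']}\n\n"
    "Question: <test question>\nSQL:"
)
```

In this setting, the quality of the selected demonstration depends entirely on whether the embeddings reflect the structural (SQL-level) similarity of questions, which is the property the proposed encoder is trained to capture.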