Semantic SQL – Combining and optimizing semantic predicates in SQL
arXiv (2024)
Abstract
In recent years, the surge in unstructured data analysis, facilitated by
advancements in Machine Learning (ML), has prompted diverse approaches for
handling images, text documents, and videos. Analysts, leveraging ML models,
can extract meaningful information from unstructured data and store it in
relational databases, allowing the execution of SQL queries for further
analysis. Simultaneously, vector databases have emerged, embedding unstructured
data for efficient top-k queries based on textual queries. This paper
introduces a novel framework, SSQL (Semantic SQL), that combines these two
approaches, enabling the incorporation of semantic queries within SQL
statements. Our approach extends SQL queries with dedicated keywords for
specifying semantic queries alongside predicates related to ML model results
and metadata. Our experimental results show that using just semantic queries
fails catastrophically to answer count and spatial queries in more than 60% of
the cases. Our proposed method jointly optimizes the queries containing both
semantic predicates and predicates on structured tables, such as those
generated by ML models or other metadata. Further, to improve the query
results, we incorporated human-in-the-loop feedback to determine the optimal
similarity score threshold for returning results.
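
The abstract does not show SSQL's keywords or its optimizer, so the following is only a minimal sketch of the general idea, written in plain Python rather than in SSQL syntax. It assumes a hypothetical table of images with an ML-derived `car_count` column and a toy two-dimensional embedding per row. The structured predicate is applied first, and the surviving rows are then scored against a query embedding and cut off at a similarity threshold. All names (`IMAGES`, `hybrid_query`, `car_count`) and the threshold value are illustrative assumptions, not part of the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical rows: metadata produced by an ML model plus a toy embedding.
IMAGES = [
    {"id": 1, "car_count": 2, "embedding": [1.0, 0.0]},
    {"id": 2, "car_count": 0, "embedding": [1.0, 0.0]},
    {"id": 3, "car_count": 3, "embedding": [0.0, 1.0]},
    {"id": 4, "car_count": 2, "embedding": [0.8, 0.6]},
]


def hybrid_query(rows, structured_pred, query_embedding, threshold):
    """Apply the structured predicate, then keep rows whose similarity
    to the query embedding is at least `threshold`, best match first."""
    candidates = [r for r in rows if structured_pred(r)]
    scored = [(r["id"], cosine(r["embedding"], query_embedding)) for r in candidates]
    kept = [(row_id, score) for row_id, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: -pair[1])


# Rows with at least two cars that also resemble the query embedding.
result = hybrid_query(IMAGES, lambda r: r["car_count"] >= 2, [1.0, 0.0], 0.7)
print([row_id for row_id, _ in result])
```

Filtering on the structured column before scoring is just one possible evaluation order. In this sketch the threshold is a fixed argument, whereas the paper describes tuning it from human-in-the-loop feedback.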