Hydro: Adaptive Query Processing of ML Queries
arXiv (2024)
Abstract
Query optimization in relational database management systems (DBMSs) is
critical for fast query processing. The query optimizer relies on precise
selectivity and cost estimates to effectively optimize queries prior to
execution. While this strategy is effective for relational DBMSs, it is not
sufficient for DBMSs tailored for processing machine learning (ML) queries. In
ML-centric DBMSs, query optimization is challenging for two reasons. First, the
performance bottleneck of the queries shifts to user-defined functions (UDFs)
that often wrap around deep learning models, making it difficult to accurately
estimate UDF statistics without profiling the query. This leads to inaccurate
statistics and sub-optimal query plans. Second, the optimal query plan for ML
queries is data-dependent, so the DBMS must adapt the plan on the fly during
execution; a static query plan is not sufficient for such queries.
In this paper, we present Hydro, an ML-centric DBMS that utilizes adaptive
query processing (AQP) for efficiently processing ML queries. Hydro is designed
to quickly evaluate UDF-based query predicates by ensuring optimal predicate
evaluation order and improving the scalability of UDF execution. By integrating
AQP, Hydro continuously monitors UDF statistics, routes data to predicates in
an optimal order, and dynamically allocates resources for evaluating
predicates. We demonstrate Hydro's efficacy through four illustrative use
cases, delivering up to 11.52x speedup over a baseline system.
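
The idea of monitoring predicate statistics and routing data in an adaptive order can be illustrated with a small sketch. This is not Hydro's implementation: the names `AdaptivePredicate` and `adaptive_filter` are hypothetical, the per-tuple `cost` is supplied by hand rather than profiled, and the ordering uses the classic rank heuristic cost / (1 - selectivity), which is an assumption made here for illustration and is optimal only for independent predicates.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class AdaptivePredicate:
    """A predicate plus running statistics (hypothetical helper, not Hydro's API)."""
    name: str
    fn: Callable[[object], bool]
    cost: float        # assumed per-tuple cost; a real system would measure this
    seen: int = 0      # tuples evaluated so far
    passed: int = 0    # tuples that satisfied the predicate

    def selectivity(self) -> float:
        # Observed pass fraction; assume 0.5 until the first observation.
        return self.passed / self.seen if self.seen else 0.5

    def rank(self) -> float:
        # Lower rank means "evaluate earlier": cheap and highly filtering.
        drop_rate = 1.0 - self.selectivity()
        return self.cost / drop_rate if drop_rate > 0 else float("inf")


def adaptive_filter(rows: Iterable[object],
                    predicates: List[AdaptivePredicate]) -> List[object]:
    """Return rows passing every predicate, re-ordering predicates per tuple."""
    out = []
    for row in rows:
        # Re-sort on every tuple using the statistics gathered so far.
        for pred in sorted(predicates, key=AdaptivePredicate.rank):
            pred.seen += 1
            if pred.fn(row):
                pred.passed += 1
            else:
                break  # short-circuit: remaining predicates are skipped
        else:
            out.append(row)  # no predicate rejected the row
    return out
```

For example, calling `adaptive_filter(range(100), [cheap, expensive])` with a cheap even-number check and a costlier multiple-of-ten check returns the multiples of ten, while the running statistics decide which check runs first for each tuple. Re-sorting per tuple keeps the sketch short; a practical system would more likely re-rank per batch to amortize the overhead.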