Shasta: Interactive Reporting At Scale

SIGMOD/PODS '16: International Conference on Management of Data, San Francisco, California, USA, June 2016

Abstract
We describe Shasta, a middleware system built at Google to support interactive reporting in complex user-facing applications related to Google's Internet advertising business. Shasta targets applications with challenging requirements. First, user query latencies must be low. Second, underlying transactional data stores have complex "read-unfriendly" schemas, placing significant transformation logic between stored data and the read-only views that Shasta exposes to its clients. This transformation logic must be expressed in a way that scales to large and agile engineering teams. Finally, Shasta targets applications with strong data freshness requirements, making it challenging to precompute query results using common techniques such as ETL pipelines or materialized views. Instead, online queries must go all the way from primary storage to user-facing views, resulting in complex queries joining 50 or more tables.

Designed as a layer on top of Google's F1 RDBMS and Mesa data warehouse, Shasta combines language and system techniques to meet these requirements. To help with expressing complex view specifications, we developed a query language called RVL, with support for modularized view templates that can be dynamically compiled into SQL. To execute these SQL queries with low latency at scale, we leveraged and extended F1's distributed query engine with facilities such as safe execution of C++ and Java UDFs. To reduce latency and increase read parallelism, we extended F1 storage with a distributed read-only in-memory cache. The system we describe is in production at Google, powering critical applications used by advertisers and internal sales teams. Shasta has significantly improved system scalability and software engineering efficiency compared to the middleware solutions it replaced.