Scaling Spark in the Real World: Performance and Usability
Proceedings of the VLDB Endowment (2015)
Abstract
Apache Spark is one of the most widely used open source processing engines for big data, with rich language-integrated APIs and a wide range of libraries. Over the past two years, our group has worked to deploy Spark to a wide range of organizations through consulting relationships as well as our hosted service, Databricks. We describe the main challenges and requirements that appeared in taking Spark to a wide set of users, and usability and performance improvements we have made to the engine in response.