Optimal Resource Allocation of Cloud-Based Spark Applications

IEEE Transactions on Cloud Computing(2022)

引用 7|浏览14
暂无评分
摘要
Nowadays, the big data paradigm is consolidating its central position in the industry, as well as in society at large. Lots of applications, across disparate domains, operate on huge amounts of data and offer great advantages both for business and research. According to analysts, cloud computing adoption is steadily increasing to support big data analyses and Spark is expected to take a prominent market position for the next decade. As big data applications gain more and more importance over time and given the dynamic nature of cloud resources, it is fundamental to develop an intelligent resource management system to provide Quality of Service guarantees to end-users. This article presents a set of run-time optimization-based resource management policies for advanced big data analytics. Users submit Spark applications characterized by a priority and by a hard or soft deadline. Optimization policies address two scenarios: i) identification of the minimum capacity to run a Spark application within the deadline; ii) re-balance of the cloud resources in case of heavy load, minimising the weighted soft deadline application tardiness. The solution relies on an initial non-linear programming model formulation and a search space exploration based on simulation-optimization procedures. Spark application execution times are estimated by relying on a gamut of techniques, including machine learning, approximated analyses, and simulation. The benefits of the approach are evaluated on Microsoft Azure HDInsight and on a private cloud cluster based on POWER8 by considering the TPC-DS industry benchmark and SparkBench. The results obtained in the first scenario demonstrate that the percentage error of the prediction of the optimal resource usage with respect to system measurement and exhaustive search is in the range 4–29 percent while literature-based techniques present an average error in the range 6–63 percent. Moreover, in the second scenario, the proposed algorithms can address complex problems like computing the optimal redistribution of resources among tens of applications in less than a minute with an error of 8 percent on average. On the same considered tests, literature-based approaches obtain an average error of about 57 percent.
更多
查看译文
关键词
Big data,quality of service,elastic resource provisioning,cluster management
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要