TurBO: A cost-efficient configuration-based auto-tuning approach for cluster-based big data frameworks.

Social Science Research Network (2023)

Abstract
Big data processing frameworks such as Spark usually expose a large number of performance-related configuration parameters, and how to auto-tune these parameters for better performance has been a hot issue in academia as well as industry for years. By delicately trading off exploration against exploitation, Bayesian Optimization (BO) is currently the most appealing algorithm for configuration auto-tuning. However, given the tuning cost constraints encountered in practice, three critical limitations prevent conventional BO-based approaches from being applied directly to auto-tuning cluster-based big data frameworks. In this paper, we propose a cost-efficient configuration auto-tuning approach named TurBO for big data frameworks, based on two enhancements of vanilla BO: 1) to reduce the number of iterations required, TurBO integrates a well-designed adaptive pseudo point mechanism with BO; 2) to avoid, as far as possible, the time-consuming practical evaluation of sub-optimal configurations, TurBO leverages the proposed CASampling method to intelligently handle these sub-optimal configurations based on ensemble learning with historical tuning experiences. To evaluate the performance of TurBO, we conducted a series of experiments on a local Spark cluster with 9 different HiBench benchmark applications. Overall, compared with 3 representative BO-based baseline approaches, OpenTuner, Bliss and ResTune, TurBO speeds up the tuning procedure by 2.24x, 2.29x and 1.97x on average, respectively. In addition, TurBO always achieves a positive cumulative performance gain under the simulated dynamic workload scenario, which means TurBO is indeed appropriate for workload changes of big data applications. (c) 2023 Elsevier Inc. All rights reserved.
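To make the exploration/exploitation trade-off in BO concrete, here is a minimal sketch of the expected improvement (EI) acquisition function, a common choice in vanilla BO. The abstract does not say which acquisition function TurBO uses, so EI is an assumption here, and the candidate names, predicted runtimes and uncertainties are made-up illustrative values, not results from the paper.

```python
import math


def expected_improvement(mean, std, best, xi=0.0):
    """Expected improvement for a minimization problem (e.g. job runtime).

    mean, std: surrogate model's predicted runtime and its uncertainty
    best:      lowest runtime observed so far
    xi:        optional margin; larger values favor exploration
    """
    if std <= 0.0:
        # No uncertainty: improvement is deterministic.
        return max(best - mean - xi, 0.0)
    z = (best - mean - xi) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    # First term rewards a good predicted mean (exploitation),
    # second term rewards high uncertainty (exploration).
    return (best - mean - xi) * cdf + std * pdf


# Hypothetical surrogate predictions (seconds) for three candidate configurations.
candidates = {
    "exploit": (95.0, 1.0),    # slightly better mean, low uncertainty
    "explore": (105.0, 20.0),  # worse mean, high uncertainty
    "poor": (130.0, 2.0),      # clearly worse, low uncertainty
}
best_runtime = 100.0

scores = {
    name: expected_improvement(mean, std, best_runtime)
    for name, (mean, std) in candidates.items()
}
# The configuration with the highest EI would be evaluated next.
next_config = max(scores, key=scores.get)
```

With these illustrative numbers the high-uncertainty candidate scores highest, which shows how BO may spend an expensive cluster run on exploration rather than on the configuration with the best predicted mean. That evaluation cost is what the abstract's tuning-cost concern refers to.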
Keywords
Big data framework, Configuration parameter, Tuning cost, Bayesian optimization, Pseudo point