Adaptive AI-based auto-scaling for Kubernetes

2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID) (2020)

Abstract
Kubernetes, the prevalent container orchestrator for cloud-deployed web applications, offers an automatic scaling feature that lets the application provider meet the ever-changing demand from its clients. This auto-scaling service, however, requires the application provider to customize a seemingly difficult parameter set. Those management parameters are static while the dynamics of incoming web requests often change, and the scaling decisions are inherently reactive rather than proactive. We therefore set the ultimate goal of making the management of cloud-based web applications easier and more effective.

We propose a Kubernetes scaling engine that makes auto-scaling decisions suited to the actual variability of incoming requests. In this engine, various AI-based forecast methods compete with each other via a short-term evaluation loop, so that the lead always goes, as soon as possible, to the method that best suits the actual request dynamics. We also introduce a compact management parameter with which the cloud-tenant application provider can easily set their sweet spot in the trade-off between resource over-provisioning and SLA violation.

The multi-forecast scaling engine and the proposed management parameter are evaluated both in simulations and with measurements on our collected web traces to show the improved quality of fitting provisioned resources to service demand. We find that with just a few competing forecast methods, our auto-scaling engine, implemented in Kubernetes, results in significantly fewer lost requests with slightly more provisioned resources compared to the default baseline.
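The competing-forecast idea can be illustrated with a minimal Python sketch. The abstract does not name the forecast methods, the length of the evaluation loop, or the definition of the management parameter, so everything below is an assumption made for illustration: three simple statistical forecasters stand in for the AI-based methods, `eval_window` is a hypothetical length of the short-term evaluation loop, `excess` is a hypothetical over-provisioning knob standing in for the compact management parameter, and `pod_capacity` is an assumed number of requests per second that one replica can serve. It is not the authors' implementation.

```python
import math
from collections import deque


# Stand-in forecasters (assumptions; the paper's AI-based methods are not listed here).
def last_value(history):
    return history[-1]


def moving_average(history, window=3):
    recent = history[-window:]
    return sum(recent) / len(recent)


def linear_trend(history):
    if len(history) < 2:
        return history[-1]
    # Extrapolate the last observed step, never below zero.
    return max(0.0, history[-1] + (history[-1] - history[-2]))


class CompetingScaler:
    """Hypothetical engine: forecasters compete on recent absolute error."""

    def __init__(self, forecasters, eval_window=5, excess=0.1, pod_capacity=100.0):
        self.forecasters = forecasters
        # Short-term evaluation loop: only the last `eval_window` errors count.
        self.errors = {name: deque(maxlen=eval_window) for name in forecasters}
        self.pending = {}   # each forecaster's prediction for the next interval
        self.history = []
        self.excess = excess
        self.pod_capacity = pod_capacity

    def observe(self, request_rate):
        # Score the previous predictions against what actually arrived.
        for name, predicted in self.pending.items():
            self.errors[name].append(abs(predicted - request_rate))
        self.history.append(request_rate)
        # Every forecaster predicts the next interval.
        self.pending = {name: f(self.history) for name, f in self.forecasters.items()}

    def leader(self):
        def mean_error(name):
            errs = self.errors[name]
            return sum(errs) / len(errs) if errs else float("inf")
        # Ties go to the forecaster listed first.
        return min(self.forecasters, key=mean_error)

    def replicas(self):
        if not self.pending:
            raise ValueError("call observe() at least once before replicas()")
        forecast = self.pending[self.leader()]
        # A larger `excess` provisions more headroom and lowers the SLA violation risk.
        return max(1, math.ceil(forecast * (1 + self.excess) / self.pod_capacity))


FORECASTERS = {
    "last_value": last_value,
    "moving_average": moving_average,
    "linear_trend": linear_trend,
}

if __name__ == "__main__":
    scaler = CompetingScaler(FORECASTERS)
    for rate in [100, 200, 300, 400, 500]:
        scaler.observe(rate)
    print(scaler.leader(), scaler.replicas())
```

On a steadily rising trace the trend-following forecaster accumulates the smallest recent error and takes the lead, while on a flat trace the simpler forecasters are equally good. Raising `excess` shifts the result toward over-provisioning, and lowering it toward a higher risk of lost requests.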
Keywords
cloud computing,artificial intelligence,auto-scaling,Kubernetes,forecast,resource management