Autoscaling of Containerized HPC Clusters in the Cloud

Nicolas Greneche, Christophe Cerin

2022 IEEE/ACM International Workshop on Interoperability of Supercomputing and Cloud Technologies (SuperCompCloud) (2022)

Abstract

This paper introduces a Cloud orchestrator controller that enables the autoscaling of containerized HPC clusters in the Cloud. The controller triggers the creation or removal of containerized HPC compute nodes according to metrics collected from the job queue of the containerized HPC scheduler. We use Kubernetes as the Cloud orchestrator and OAR as the HPC scheduler, and our approach modifies neither of them. The scheme followed in this paper is generic and can be applied to other HPC schedulers. We assume that containerization principles facilitate the reproducibility of experiments by adding the HPC scheduler to the environment replayed by the end user. The paper exemplifies Cloud and HPC convergence, which allows a high degree of flexibility for users and community platform developers. It also explores the continuous integration/deployment approaches of Cloud computing to orchestrate multiple and potentially different HPC job schedulers that scale under the supervision of the Cloud orchestrator. The experimental part of the work is highly reproducible.
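The scaling decision described in the abstract can be illustrated with a minimal sketch. This is not the authors' controller: the names `QueueMetrics`, `desired_nodes`, and `scaling_action`, the core-based sizing rule, and the node bounds are all assumptions made for illustration, and no Kubernetes or OAR API is called. The sketch only shows how queue-level metrics might be turned into a "create" or "remove" decision for compute-node containers.

```python
import math
from dataclasses import dataclass


@dataclass
class QueueMetrics:
    """Hypothetical snapshot of the HPC scheduler's job queue."""
    waiting_cores: int  # cores requested by jobs still waiting in the queue
    running_cores: int  # cores used by jobs that are currently running


def desired_nodes(metrics, cores_per_node, min_nodes=0, max_nodes=10):
    """Return how many compute-node containers should exist (assumed rule)."""
    if cores_per_node <= 0:
        raise ValueError("cores_per_node must be positive")
    # Total demand is what is running plus what is waiting.
    demand = metrics.waiting_cores + metrics.running_cores
    wanted = math.ceil(demand / cores_per_node)
    # Clamp the result to the allowed cluster size.
    return max(min_nodes, min(max_nodes, wanted))


def scaling_action(current_nodes, target_nodes):
    """Translate a target size into a (verb, count) decision."""
    delta = target_nodes - current_nodes
    if delta > 0:
        return ("create", delta)
    if delta < 0:
        return ("remove", -delta)
    return ("none", 0)
```

In a real deployment, the metrics would come from the HPC scheduler and the resulting action would be applied through the Cloud orchestrator; both steps are deliberately left out here because the abstract does not specify them.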

Keywords

Integrated HPC and Cloud environments, Software-defined infrastructure, HPC infrastructure deployment use cases, Interoperability of HPC and Cloud resource management and scheduling systems