An Incremental Reinforcement Learning Scheduling Strategy For Data-Intensive Scientific Workflows In The Cloud

Concurrency and Computation: Practice and Experience (2021)

Abstract
Most scientific experiments can be modeled as workflows. These workflows are usually computing- and data-intensive, demanding the use of high-performance computing environments such as clusters, grids, and clouds. The latter offers the advantage of elasticity, which allows the number of virtual machines (VMs) to be changed on demand. Workflows are typically managed using scientific workflow management systems (SWfMSs), and many existing SWfMSs offer support for cloud-based execution. Each SWfMS has its own scheduler that follows a well-defined cost function. However, such cost functions should consider the characteristics of a dynamic environment, such as live migrations or performance fluctuations, which are far from trivial to model. This article proposes a novel scheduling strategy, named ReASSIgN, based on reinforcement learning (RL). By relying on an RL technique, one may assume that there is an optimal (or suboptimal) solution to the scheduling problem, and the strategy aims at learning the best scheduling based on previous executions, in the absence of a mathematical model of the environment. To this end, an extension of the well-known workflow simulator WorkflowSim is proposed to implement an RL strategy for scheduling workflows. Once the scheduling plan is generated via simulation, the workflow is executed in the cloud using the SciCumulus SWfMS. We conducted a thorough evaluation of the proposed scheduling strategy using a real astronomy workflow named Montage.
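The abstract does not detail the RL formulation used by ReASSIgN. As a rough illustration of the general idea only (model-free learning of task-to-VM assignments from repeated simulated executions, with no mathematical model of the environment), the sketch below uses tabular Q-learning over a toy workflow. The state/action encoding, the cost table, and names such as `choose_vm` are assumptions for illustration, not the paper's actual design; in the paper, rewards would come from WorkflowSim simulations rather than a fixed table.

```python
import random
from collections import defaultdict

# Hypothetical illustration: tabular Q-learning that assigns workflow tasks
# to VMs by learning from repeated (simulated) executions. The states,
# actions, and reward below are assumptions, not ReASSIgN's formulation.

N_TASKS = 5          # tasks in a toy workflow (assumed)
N_VMS = 3            # available virtual machines (assumed)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2

# Assumed per-task runtime on each VM (seconds); in the paper this feedback
# would be produced by the WorkflowSim simulation, not a static table.
COST = [[random.uniform(1, 10) for _ in range(N_VMS)] for _ in range(N_TASKS)]

Q = defaultdict(float)  # Q[(task_index, vm)] -> estimated value

def choose_vm(task):
    """Epsilon-greedy choice of a VM for the given task."""
    if random.random() < EPS:
        return random.randrange(N_VMS)
    return max(range(N_VMS), key=lambda vm: Q[(task, vm)])

for episode in range(2000):
    for task in range(N_TASKS):
        vm = choose_vm(task)
        runtime = COST[task][vm]
        # Negative runtime as immediate reward: faster assignments score higher.
        reward = -runtime
        next_best = (max(Q[(task + 1, v)] for v in range(N_VMS))
                     if task + 1 < N_TASKS else 0.0)
        Q[(task, vm)] += ALPHA * (reward + GAMMA * next_best - Q[(task, vm)])

# Extract a greedy scheduling plan after learning.
plan = {task: max(range(N_VMS), key=lambda vm: Q[(task, vm)]) for task in range(N_TASKS)}
print("Learned task-to-VM plan:", plan)
```

In this sketch the learned plan converges to the cheapest VM per task; the paper's strategy instead learns from end-to-end simulated executions so that dynamic effects (e.g., performance fluctuations) are captured implicitly rather than modeled analytically.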
Keywords
compute cloud, parallelism, reinforcement learning, workflow scheduling