A Generic Efficient Scientific Workflow Engine for the Optimizations of Run-Time Execution

Changxin Bai, Junwen Liu, Anik Tahabilder, M M Imran, Shiyong Lu, Dunren Che

2023 IEEE International Conference on Software Services Engineering (SSE) (2023)

Abstract
The workflow has proven to be a highly effective computing model for a variety of scientific applications, offering flexible data types and unstructured parallelism that go beyond simple parallel execution models such as MapReduce. However, current workflow management systems in cloud computing environments experience unnecessary delays in task execution due to the separation of the task execution and data transfer processes, which causes a child task to wait until all of its predecessor tasks complete, rather than waiting only for its necessary input data to become ready. The goal of this paper is to eliminate this unnecessary delay of child tasks in a workflow. This is achieved through a new workflow engine architecture that separates the workflow planner from the workflow executor within the general framework of the DATAVIEW scientific workflow management system. The new engine architecture can be generalized and applied to other workflow systems. Our design integrates a new task release mechanism, based on a data dependency model, with the workflow executor of DATAVIEW. This approach enables a task to be launched as soon as its input data becomes available, instead of waiting for all predecessor tasks to finish. The architecture employs distributed algorithms to implement the workflow executor and the task executors, performing various optimizations on data movement, task movement, and communication among different subsystems. The experiments show that the new architecture, based on the new task release model, can significantly reduce the overall execution time of a workflow in DATAVIEW.
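To illustrate the difference between the two release policies described in the abstract, the following is a minimal Python sketch, not DATAVIEW's actual implementation. It rests on a simplifying assumption that is ours, not the paper's: each task may make an individual output available at some offset before the task as a whole finishes. The names `Task`, `schedule`, `makespan`, and the three-task example workflow are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    duration: float
    # Names of the data items this task consumes.
    inputs: list = field(default_factory=list)
    # Data item name -> offset after task start at which it becomes ready
    # (assumption: an output can be ready before the task finishes).
    outputs: dict = field(default_factory=dict)


def schedule(tasks, release_on_data):
    """Return {task name: (start, finish)}; tasks must be in topological order."""
    producer = {item: t for t in tasks for item in t.outputs}
    times = {}
    for t in tasks:
        start = 0.0
        for item in t.inputs:
            p = producer[item]
            p_start, p_finish = times[p.name]
            if release_on_data:
                # Data-dependency release: wait only until this input is ready.
                ready = p_start + p.outputs[item]
            else:
                # Predecessor-completion release: wait for the whole producer.
                ready = p_finish
            start = max(start, ready)
        times[t.name] = (start, start + t.duration)
    return times


def makespan(times):
    """Finish time of the last task to complete."""
    return max(finish for _, finish in times.values())


# Hypothetical workflow: A emits a1 early and a2 at the end; B needs only a1.
tasks = [
    Task("A", 10.0, outputs={"a1": 2.0, "a2": 10.0}),
    Task("B", 5.0, inputs=["a1"], outputs={"b1": 5.0}),
    Task("C", 3.0, inputs=["a2", "b1"]),
]

by_completion = makespan(schedule(tasks, release_on_data=False))
by_data = makespan(schedule(tasks, release_on_data=True))
print(by_completion, by_data)  # 18.0 13.0
```

In this toy workflow, task B starts at time 2.0 instead of 10.0 under data-dependency release, which shortens the overall execution time from 18.0 to 13.0. The sketch ignores data transfer costs, resource limits, and the distributed executors that the paper's architecture addresses.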
Keywords
Workflow Engine, Workflow Execution, Parallel Processing, Task Releasing, Optimization