BDAP: A Big Data Placement Strategy for Cloud-Based Scientific Workflows

BIGDATASERVICE '15: Proceedings of the 2015 IEEE First International Conference on Big Data Computing Service and Applications (2015)

Abstract
In this new era of Big Data, there is a growing need to enable scientific workflows to perform computations at a scale far exceeding a single workstation's capabilities. When such data-intensive workflows run in the Cloud, distributed across several physical locations, the execution time and the resource utilization efficiency depend heavily on the initial placement and distribution of the input datasets across the multiple virtual machines in the Cloud. In this paper, we propose BDAP (Big DAta Placement strategy), a strategy that improves workflow performance by minimizing data movement across multiple virtual machines. In this work, we 1) formalize the data placement problem in scientific workflows, 2) propose a data placement algorithm that considers both the initial input datasets and the intermediate datasets obtained during the workflow run, and 3) perform extensive experiments in a distributed environment to verify that our proposed strategy provides an effective data placement solution to distribute and place big datasets at the appropriate virtual machines in the Cloud within reasonable time.
Keywords
Big Data, Data Placement, Cloud Computing, Scientific Workflow, Evolutionary Algorithm
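
To make the idea of "minimizing data movement across multiple virtual machines" concrete, the following is a minimal sketch, not the BDAP algorithm itself, whose details the abstract does not give. It assumes that each task is already pinned to a VM, that every VM has the same storage capacity, and that a dataset read by a task on a different VM is copied in full each time. The function names (`data_movement`, `is_feasible`, `evolve_placement`) and the toy workflow are hypothetical; the search is a simple mutation-only evolutionary loop, chosen only because the keywords mention evolutionary algorithms.

```python
import random

def data_movement(placement, tasks, sizes):
    """Total size of datasets that tasks must fetch from another VM.

    placement: dict dataset -> VM index where the dataset is stored
    tasks:     list of (vm_index, [datasets the task reads])
    sizes:     dict dataset -> size in arbitrary units
    """
    return sum(sizes[d] for vm, inputs in tasks for d in inputs
               if placement[d] != vm)

def is_feasible(placement, sizes, n_vms, capacity):
    """True if no VM stores more than `capacity` units (assumed uniform)."""
    load = [0] * n_vms
    for d, vm in placement.items():
        load[vm] += sizes[d]
    return max(load) <= capacity

def evolve_placement(tasks, sizes, n_vms, capacity, generations=2000, seed=0):
    """Mutation-only evolutionary search (illustrative, not BDAP)."""
    rng = random.Random(seed)
    datasets = sorted(sizes)

    def cost(p):
        # Infeasible placements get infinite cost so they never beat a feasible one.
        if not is_feasible(p, sizes, n_vms, capacity):
            return float("inf")
        return data_movement(p, tasks, sizes)

    # Start from a round-robin placement.
    best = {d: i % n_vms for i, d in enumerate(datasets)}
    best_cost = cost(best)
    for _ in range(generations):
        child = dict(best)
        # Mutation: move one randomly chosen dataset to a random VM.
        child[rng.choice(datasets)] = rng.randrange(n_vms)
        child_cost = cost(child)
        if child_cost <= best_cost:
            best, best_cost = child, child_cost
    return best, best_cost

# Hypothetical toy workflow: four datasets, three tasks, two VMs.
sizes = {"a": 4, "b": 3, "c": 2, "d": 1}
tasks = [(0, ["a", "b"]), (1, ["b", "c"]), (1, ["c", "d"])]
placement, moved = evolve_placement(tasks, sizes, n_vms=2, capacity=6)
print(placement, moved)
```

In this toy instance, dataset `b` is read by tasks on both VMs, so some movement is unavoidable under these assumptions; the search can only reduce the cost by co-locating the remaining datasets with their readers while respecting the capacity limit.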