GERBIL: MPI+YARN

CCGRID(2015)

Abstract
Emerging big data applications comprise rich multi-faceted workflows with both compute-intensive and data-intensive tasks, and intricate communication patterns. While MapReduce is an effective model for data-intensive tasks, the MPI programming model may be better suited for extracting high performance from compute-intensive tasks. Researchers have recognized this need to employ specialized models for different phases of a workflow, e.g., performing computations using MPI followed by visualizations using MapReduce. However, extant multi-cluster approaches are inefficient as they entail data movement across clusters and porting across data formats. Consequently, there is a crucial need for disparate programming models to co-exist on the same set of resources. In this paper, we address the above issue by designing GERBIL, a framework for transparently co-hosting unmodified MPI applications alongside MapReduce applications on the same cluster. GERBIL exploits YARN as the model-agnostic resource negotiator, and provides an easy-to-use interface to users. GERBIL bridges the fundamental mismatch between YARN and MPI by designing an MPI-aware resource allocation mechanism. We also support five different optimizations: minimizing job wait time, achieving inter-process locality, achieving desired cluster utilization, minimizing network traffic, and minimizing job execution time, all in a multi-tenant environment. Our evaluation shows that GERBIL enables MPI executions with performance comparable to a native MPI setup, and improves the performance of compute-intensive applications by up to 133% compared to the corresponding MapReduce-based versions.
Key words
Big Data, application program interfaces, data handling, message passing, parallel processing, resource allocation, GERBIL bridges, MPI applications, MPI programming model, MPI+YARN, MPI-aware resource allocation, MapReduce applications, big data applications, compute-intensive applications, compute-intensive tasks, data-intensive tasks, disparate programming models, high-performance tasks, model agnostic resource negotiator, multicluster approaches, multifaceted workflows