Scheduling with multi-level data locality: Throughput and heavy-traffic optimality

IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications (2016)

Abstract
A fundamental problem to all data-parallel applications is data locality. An example is map task scheduling in the MapReduce framework. Existing theoretical work analyzes systems with only two levels of locality, despite the existence of multiple locality levels within and across data centers. We found that going from two to three levels of locality changes the problem drastically, as a tradeoff between performance and throughput emerges. The recently proposed priority algorithm, which is throughput and heavy-traffic optimal for two locality levels, is not even throughput-optimal with three locality levels. The JSQ-MaxWeight algorithm proposed by Wang et al. is heavy-traffic optimal only for a special traffic scenario with two locality levels. We show that an extension of the JSQ-MaxWeight algorithm to three locality levels preserves its throughput-optimality, but suffers from the same lack of heavy-traffic optimality for most traffic scenarios. We propose a novel algorithm that uses Weighted-Workload (WW) routing and priority service. We establish its throughput and heavy-traffic optimality for all traffic scenarios. The main challenge is the construction of an appropriate ideal load decomposition that allows the separate treatment of different subsystems.
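The abstract names Weighted-Workload (WW) routing and priority service but does not spell out the rules, so the following is only a minimal sketch of one plausible reading. The service rates in `RATES`, the workload definition, the scoring rule in `route`, and all class and function names are assumptions made for illustration, not the authors' specification.

```python
from dataclasses import dataclass, field

# Hypothetical service rates per locality level (assumed values, not from the paper).
RATES = {"local": 1.0, "rack": 0.8, "remote": 0.5}
# Priority service order: most local queue first.
PRIORITY = ("local", "rack", "remote")


@dataclass
class Server:
    name: str
    rack: str
    # One queue length per locality level.
    queues: dict = field(default_factory=lambda: {lvl: 0 for lvl in PRIORITY})

    def workload(self) -> float:
        # Assumed workload: expected time to clear every queued task.
        return sum(self.queues[lvl] / RATES[lvl] for lvl in PRIORITY)

    def next_level(self):
        # Priority service: pick the most local non-empty queue, or None if idle.
        for lvl in PRIORITY:
            if self.queues[lvl] > 0:
                return lvl
        return None


def locality(server, replicas):
    """Locality level of `server` for a task whose input lives on `replicas`."""
    if any(server.name == r.name for r in replicas):
        return "local"
    if any(server.rack == r.rack for r in replicas):
        return "rack"
    return "remote"


def route(servers, replicas):
    """Send one task to the server with the smallest weighted workload."""
    def score(s):
        # Assumed weighting: divide the workload by the rate the task would get there.
        return s.workload() / RATES[locality(s, replicas)]

    best = min(servers, key=score)
    level = locality(best, replicas)
    best.queues[level] += 1
    return best, level


a, b, c = Server("a", "r1"), Server("b", "r1"), Server("c", "r2")
a.queues["local"] = 4
b.queues["rack"] = 1
c.queues["remote"] = 1
chosen, level = route([a, b, c], replicas=[a])
print(chosen.name, level)
```

In this sketch a heavily loaded local server can lose to a lightly loaded rack-local one, which is the kind of balance between locality and load that the abstract alludes to.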
Keywords
multilevel data locality, heavy-traffic optimality, data-parallel applications, map task scheduling, MapReduce framework, data centers, JSQ-MaxWeight algorithm, throughput-optimality, weighted-workload routing, WW routing, priority service, load decomposition