Automatic cluster parallelization and minimizing communication via selective data replication

2015 IEEE High Performance Extreme Computing Conference (HPEC), 2015

Abstract
Technology scaling has initiated two distinct trends that are likely to continue into the future: first, increased parallelism in hardware and, second, the increasing performance and energy cost of communication relative to computation. Both trends call for the development of compiler and runtime systems that automatically parallelize programs and reduce communication in parallel computations, so as to achieve the desired high performance in an energy-efficient fashion. In this paper, we propose the design of an integrated compiler and runtime system that auto-parallelizes loop nests to clusters, and a novel communication avoidance method that reduces data movement between processors. Communication minimization is achieved via data replication: data is replicated so that a larger share of the whole data set may be mapped to a processor and, hence, non-local memory accesses are reduced. Experiments on a number of benchmarks show the effectiveness of the approach.
Keywords
automatic cluster parallelization, communication minimization, selective data replication, technology scaling, hardware parallelism, compiler, runtime system, automatic program parallelization, parallel computation, energy efficiency, loop-nest autoparallelization, communication avoidance method, processor data movement, nonlocal memory access
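
The replication idea stated in the abstract can be illustrated with a toy model. The sketch below is not the paper's algorithm or system. It assumes a hypothetical 3-point stencil loop over an array of n elements, block-partitioned across p processors, and counts how many reads fall outside the data a processor holds. The function name count_nonlocal_reads and the halo parameter (the number of neighboring elements replicated on each side of a block) are illustrative assumptions.

```python
def count_nonlocal_reads(n, p, halo):
    """Count non-local reads for a 3-point stencil (reads i-1, i, i+1).

    Assumptions (illustrative only): the array has n elements, it is
    block-partitioned across p processors with p dividing n, and each
    processor also holds `halo` replicated elements on each side of
    the block it owns.
    """
    block = n // p
    nonlocal_reads = 0
    for proc in range(p):
        lo, hi = proc * block, (proc + 1) * block      # owned range [lo, hi)
        held_lo = max(0, lo - halo)                    # held range includes
        held_hi = min(n, hi + halo)                    # the replicated halo
        for i in range(lo, hi):
            for j in (i - 1, i, i + 1):
                in_bounds = 0 <= j < n
                if in_bounds and not (held_lo <= j < held_hi):
                    nonlocal_reads += 1
    return nonlocal_reads


if __name__ == "__main__":
    print(count_nonlocal_reads(16, 4, halo=0))  # 6 reads cross block boundaries
    print(count_nonlocal_reads(16, 4, halo=1))  # 0 once neighbors are replicated
```

In this toy setting, replicating a single boundary element per side removes every non-local read, at the cost of extra memory on each processor. A real system would also have to keep the replicas consistent when the data is written, which this sketch ignores.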