Automating wavefront parallelization for sparse matrix computations.

SC 2016

Abstract
This paper presents a compiler and runtime framework for parallelizing sparse matrix computations that have loop-carried dependences. Our approach automatically generates a runtime inspector to collect data dependence information and achieves wavefront parallelization of the computation, where iterations within a wavefront execute in parallel, and synchronization is required across wavefronts. A key contribution of this paper involves dependence simplification, which reduces the time and space overhead of the inspector. This is implemented within a polyhedral compiler framework, extended for sparse matrix codes. Results demonstrate the feasibility of using automatically-generated inspectors and executors to optimize ILU factorization and symmetric Gauss-Seidel relaxations, which are part of the Preconditioned Conjugate Gradient (PCG) computation. Our implementation achieves a median speedup of 2.97X on 12 cores over the reference sequential PCG implementation, significantly outperforms PCG parallelized using Intel's Math Kernel Library (MKL), and is within 6% of the median performance of manually-parallelized PCG.
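The inspector/executor pattern described in the abstract can be illustrated on a simple kernel with the same dependence structure: a sparse lower-triangular solve in CSR format. The inspector assigns each row to a wavefront (level) once every row it depends on has been assigned, and the executor runs the rows of one wavefront in parallel with a barrier between wavefronts. The C/OpenMP sketch below is illustrative only, not the paper's automatically generated code; all names (csr_lower, inspect_wavefronts, execute_trsv) are hypothetical.

    /* Minimal sketch of a level-set ("wavefront") inspector and executor
     * for a sparse lower-triangular solve L x = b in CSR format.
     * Row i depends on every earlier row j with a nonzero L[i][j]. */
    #include <stdlib.h>

    typedef struct {
        int n;          /* number of rows                              */
        int *rowptr;    /* CSR row pointers, length n+1                */
        int *colind;    /* CSR column indices                          */
        double *val;    /* nonzero values; diagonal stored last in row */
    } csr_lower;

    /* Inspector: level[i] = 1 + max(level[j]) over dependences j -> i.
     * Returns the number of wavefronts. */
    static int inspect_wavefronts(const csr_lower *L, int *level)
    {
        int nlevels = 0;
        for (int i = 0; i < L->n; i++) {
            int lev = 0;
            for (int k = L->rowptr[i]; k < L->rowptr[i + 1]; k++) {
                int j = L->colind[k];
                if (j < i && level[j] + 1 > lev)
                    lev = level[j] + 1;
            }
            level[i] = lev;
            if (lev + 1 > nlevels)
                nlevels = lev + 1;
        }
        return nlevels;
    }

    /* Executor: rows within one wavefront are independent and run in
     * parallel; the implicit barrier at the end of each OpenMP loop
     * provides the synchronization required across wavefronts. */
    static void execute_trsv(const csr_lower *L, const int *level, int nlevels,
                             const double *b, double *x)
    {
        for (int lev = 0; lev < nlevels; lev++) {
            #pragma omp parallel for schedule(dynamic)
            for (int i = 0; i < L->n; i++) {
                if (level[i] != lev)
                    continue;   /* a real executor would bucket rows by level */
                double s = b[i];
                int k;
                for (k = L->rowptr[i]; k < L->rowptr[i + 1] - 1; k++)
                    s -= L->val[k] * x[L->colind[k]];
                x[i] = s / L->val[k];   /* diagonal assumed stored last */
            }
        }
    }

In the paper's framework, the inspector corresponding to this hand-written one is generated automatically from the original loop nest, and the dependence-simplification contribution is aimed at reducing exactly this inspector's time and space overhead.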
Keywords
wavefront parallelization automation, runtime framework, sparse matrix computation parallelization, loop-carried dependence, runtime inspector generation, data dependence information collection, parallel iteration, wavefront synchronization, dependence simplification, time overhead reduction, space overhead reduction, polyhedral compiler framework, sparse matrix codes, ILU factorization optimization, symmetric Gauss-Seidel relaxation, preconditioned conjugate gradient computation, PCG computation, sequential PCG implementation, Intel Math Kernel Library, Intel MKL