High Performance Multilevel Graph Partitioning on GPU.

HPCS (2019)

Abstract
Graph partitioning is a common computational phase in many application domains, including social network analysis, data mining, scheduling, and VLSI design. The significant SIMT compute power of a GPU makes it an appropriate platform for exploiting data parallelism in graph partitioning and accelerating the computation. However, the irregular, non-uniform, and data-dependent sub-tasks of graph partitioning pose multiple challenges for efficient GPU utilization, including load imbalance, non-coalesced memory accesses, and warp execution inefficiency. In this paper, we describe an effective and methodical approach to enable multilevel graph partitioning on GPUs. Our solution avoids thread divergence and balances the load across GPU threads by dynamically assigning an appropriate number of threads to process each graph vertex and its irregularly sized neighbor list. Our design is autonomous: all steps are carried out by the GPU with minimal CPU involvement, a property needed by the range of GPU applications that use graph partitioning as a pre-processing step. We show that our approach outperforms the state-of-the-art CPU-based parallel graph partitioner (mtmetis) while delivering comparable partitioning quality. Moreover, to the best of our knowledge, it is the first autonomous approach on GPU.
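The abstract does not say how the number of threads per vertex is chosen. As a rough illustration only, the host-side C++ sketch below shows one common degree-based policy: each vertex gets the smallest power-of-two group of threads that covers its neighbor list, capped at one warp of 32. The CSR offset array, the function names `threads_for_degree` and `assign_threads`, and the cap of 32 are all assumptions made for this example, not details taken from the paper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical policy (not from the paper): give a vertex the smallest
// power-of-two thread group that covers its degree, capped at 32 threads.
inline std::uint32_t threads_for_degree(std::uint32_t degree) {
    std::uint32_t threads = 1;
    while (threads < degree && threads < 32) {
        threads <<= 1;
    }
    return threads;
}

// row_offsets is an assumed CSR offset array with (num_vertices + 1) entries;
// the degree of vertex v is row_offsets[v + 1] - row_offsets[v].
std::vector<std::uint32_t> assign_threads(
        const std::vector<std::uint32_t>& row_offsets) {
    std::vector<std::uint32_t> group_size;
    for (std::size_t v = 0; v + 1 < row_offsets.size(); ++v) {
        std::uint32_t degree = row_offsets[v + 1] - row_offsets[v];
        group_size.push_back(threads_for_degree(degree));
    }
    return group_size;
}
```

Under this policy, low-degree vertices do not leave most of a warp idle, while high-degree vertices have their neighbor lists split across a full warp. A GPU implementation would compute these group sizes on the device rather than on the host.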
Keywords
graph partitioning, matching, coarsening, uncoarsening, refinement, parallel computing on GPU