Reduce, Reuse, and Adapt: Accelerating Graph Processing on GPUs.

International Conference on High Performance Computing, Data, and Analytics(2023)

引用 0|浏览8
暂无评分
摘要
Designing parallel graph algorithms on GPUs has been challenging. We observe three limitations with the existing work. First, algorithms often rely on only one of the strategies to propagate information: push or pull. We observe that neither is an optimal choice in many cases. Second, the cost of updating the underlying data structures per iteration is high. This results in a significant performance overhead. Third, considering the in-herent irregularity of graph processing, one-size-fits-all approach is too rigid for different types of graphs. In this work, we address these shortcomings by improving the processing of an existing graph framework, Subway. In particular, we propose a novel technique in terms of amalgamating the two propagation strategies (push and pull) into a hybrid traversal strategy. In this, the vertices of the graph propagate their information by pulling the information from the neighbours, performing a local computation, and subsequently pushing the result to all the neighbours, all within an iteration. We propose to reuse the SubCSR structure in Subway across a few iterations to significantly reduce the computational overhead, but without compromising the correctness or efficiency of the algorithm. Furthermore, we explore heuristics on when to use push, pull, or hybrid traversal strategies. We illustrate the effectiveness of our three-pronged approach by applying it to four popular graph algorithms: Connected Components (CC), Single-Source Shortest Path (SSSP), Breadth First Search (BFS) and Page Rank (PR) on an NVIDIA GeForce RTX 3060 GPU. Our extensive experimental evaluation on GeForce RTX 3060 GPU reveals that the proposed hybrid approach with adaptive heuristics and approximate subCSR computation is effective in reducing the execution time of CC, SSSP, and PR by 31 %, 7.56%, and 6.43% respectively, compared to the minimum of push or pull algorithm that uses subCSR structure.
更多
查看译文
关键词
Graphs,Connected Components,GPU,Push,Pull,Hybrid,SSSP,BFS,PR,CC
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要