SDMA: An Efficient and Flexible Sparse-Dense Matrix-Multiplication Architecture for GNNs

2022 32nd International Conference on Field-Programmable Logic and Applications (FPL)

Abstract
In recent years, graph neural networks (GNNs) have emerged as an important class of deep learning models. Sparse-Dense Matrix Multiplication (SpMM) is their critical computational kernel. However, SpMM involves many irregular computations and random memory accesses, which makes it inefficient on both general-purpose processors and dedicated accelerators. The high sparsity and uneven nonzero distribution of graphs further exacerbate these problems. In this work, we propose SDMA, an efficient architecture that accelerates SpMM for GNNs by jointly addressing load imbalance and irregular memory access. We first present three hardware-oriented optimizations: 1) an equal-value partition method that divides the sparse matrix to balance the load across tiles; 2) a vertex-clustering method that exposes more data locality; 3) an adaptive on-chip dataflow scheduling method that makes full use of the computing resources. We then integrate these optimizations into SDMA to obtain a high-performance architecture. Finally, we prototype SDMA on a Xilinx Alveo U50 FPGA. The results show that SDMA achieves 2.19x-3.35x higher energy efficiency than a GPU implementation and 2.03x higher DSP efficiency than an FPGA implementation.
Keywords
GNNs, SpMM, Hardware Acceleration
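
Since no code accompanies this page, the following Python sketch is a minimal software analogue of two ideas named in the abstract: row-wise SpMM over a CSR matrix (the irregular-access kernel the paper targets) and a row partition balanced by nonzero count, in the spirit of the equal-value partition method. The function names, the NumPy formulation, and the tiling granularity are illustrative assumptions, not the SDMA hardware design.

```python
import numpy as np

def spmm_csr(indptr, indices, data, B):
    """Row-wise SpMM: C = A @ B, with sparse A in CSR form.

    Rows of B are gathered via the column ids of A's nonzeros;
    this indirect indexing is the random-access pattern that
    makes SpMM inefficient on general-purpose hardware.
    """
    n_rows = len(indptr) - 1
    C = np.zeros((n_rows, B.shape[1]))
    for i in range(n_rows):
        for k in range(indptr[i], indptr[i + 1]):
            C[i] += data[k] * B[indices[k]]  # gather one row of B
    return C

def equal_nnz_partition(indptr, n_tiles):
    """Split rows into tiles holding roughly equal nonzero counts
    (a software analogue of the equal-value partition), instead of
    equal row counts, so each tile gets a similar amount of work.
    """
    total = indptr[-1]
    bounds = [0]
    for t in range(1, n_tiles):
        # first row whose cumulative nnz reaches t/n_tiles of the total
        bounds.append(int(np.searchsorted(indptr, t * total / n_tiles)))
    bounds.append(len(indptr) - 1)
    return [(bounds[t], bounds[t + 1]) for t in range(n_tiles)]
```

Partitioning by cumulative nonzero count rather than by row count is what keeps per-tile work roughly even when the graph's degree distribution is skewed, which is the load-imbalance problem the abstract describes.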