SparGD: A Sparse GEMM Accelerator with Dynamic Dataflow.

ACM Trans. Design Autom. Electr. Syst. (2024)

Abstract
Deep learning has become a highly popular research field, and deep learning algorithms previously ran primarily on CPUs and GPUs. However, with the rapid development of deep learning, existing processors have proven unable to meet its large-scale computing requirements, and custom deep learning accelerators have become popular. The primary workloads in deep learning are mostly general matrix-matrix multiplications (GEMMs), and emerging GEMMs are highly sparse and irregular. The TPU and SIGMA are typical recent GEMM accelerators, but the TPU does not support sparsity, and both the TPU and SIGMA suffer from insufficient utilization of their Processing Elements (PEs). We design and implement SparGD, a sparse GEMM accelerator with dynamic dataflow. SparGD has specialized PE structures, flexible distribution and reduction networks, and a simple dataflow-switching module. When running sparse, irregular GEMMs, SparGD maintains high PE utilization while exploiting sparsity, and it can switch to the optimal dataflow according to the computing environment. For sparse, irregular GEMMs, our experimental results show that SparGD outperforms systolic arrays by 30 times and SIGMA by 3.6 times.
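To make the workload concrete, the sketch below shows what "exploiting sparsity in GEMM" means in software terms: a sparse matrix stored in CSR form is multiplied by a dense matrix, and only nonzero entries are ever touched, which is the zero-skipping that a sparse GEMM accelerator performs in hardware. This is an illustrative example only, not code from the paper; the function name csr_gemm and the toy matrices are hypothetical.

```python
import numpy as np

def csr_gemm(values, col_idx, row_ptr, B):
    """Multiply a CSR-encoded sparse matrix A by a dense matrix B.

    Only the nonzero entries of A are stored and multiplied, so all
    multiply-accumulates involving zeros are skipped entirely.
    """
    m = len(row_ptr) - 1          # number of rows of A
    n = B.shape[1]                # number of columns of B
    C = np.zeros((m, n))
    for i in range(m):
        # Nonzeros of row i live in values[row_ptr[i]:row_ptr[i+1]]
        for k in range(row_ptr[i], row_ptr[i + 1]):
            C[i, :] += values[k] * B[col_idx[k], :]
    return C

# Toy example: A = [[1, 0, 2],
#                   [0, 0, 3]]  stored in CSR form
values  = np.array([1.0, 2.0, 3.0])
col_idx = np.array([0, 2, 2])
row_ptr = np.array([0, 2, 3])
B = np.arange(6, dtype=float).reshape(3, 2)
print(csr_gemm(values, col_idx, row_ptr, B))
# [[ 8. 11.]
#  [12. 15.]]  -- matches the dense product A @ B
```

In this toy case, 3 of 6 entries of A are zero, so half of the multiply-accumulates are skipped; for the highly sparse, irregular GEMMs targeted by the paper, the fraction of skipped work is far larger, which is why sparsity support and high PE utilization matter.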