A simple and efficient storage format for SIMD-accelerated SpMV

CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS(2021)

引用 1|浏览3
暂无评分
摘要
SpMV (Sparse matrix-vector multiplication) is an essential component in scientific computing and has attracted the attention of researchers in related fields at home and abroad. With the continuous expansion of matrix data, the efficient parallel SpMV algorithm has become a research hotspot for research experts in related fields. The sparse matrix compression format as a critical point to improve computing performance can effectively save storage space and efficiently cooperate with the advantages of the processor system structure to give full play to performance. This paper proposes a new sparse matrix storage format CSR2 (Compressed Sparse Row 2). It is a new single format and suitable for processor platforms with SIMD (Single Instruction Multiple Data) vectorizations. The format operation of CSR2 is easy to implement with a low overhead of conversion. We compared the SpMV algorithm based on CSR2 with the most advanced single format CSR5 (Compressed Sparse Row 5) and Intel MKL (Intel Math Kernel Library) on the mainstream high-performance processor Intel Xeon E5-2670 v3 CPU. We choose 48 sets of matrices to be used as a benchmark suite. Experimental results show that CSR2 has a remarkable performance improvement compared with CSR5 and MKL. Compared to CSR5, CSR2 can achieve an average acceleration of 1.401 × (up to 1.861 ×). Compared to MKL, CSR2 can achieve an average acceleration of 1.261 × (up to 5.921 ×). In reality, for applications with multiple iterations, using our CSR2 can bring low-overhead format conversion and high-throughput computing performance.
更多
查看译文
关键词
SpMV, SIMD vectorization, CSR2, CSR5, MKL, Storage format
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要