Vectorizing Sparse Matrix Computations with Partially-Strided Codelets
SC22: International Conference for High Performance Computing, Networking, Storage and Analysis(2022)
摘要
The compact data structures and irregular computation patterns in sparse matrix computations introduce challenges to vectorizing these codes. Available approaches primarily vec-torize strided computation regions of a sparse code. In this work, we propose a locality-based codelet mining (LCM) algorithm that efficiently searches for strided and partially strided regions in sparse matrix computations for vectorization. We also present a classification of partially strided codelets and a differentiation-based approach to generate codelets from memory accesses in the sparse computation. LCM is implemented as an inspector-executor framework called LCM I/E that generates vectorized code for the sparse matrix-vector multiplication (SpMV), sparse matrix times dense matrix (SpMM), and sparse triangular solver (SpTRSV). LCM I/E outperforms the MKL library with an average speedup of 1.67x, 4.1x, and 1.75x for SpMV, SpTRSV, and SpMM, respectively. It is also faster than the state-of-the-art inspector-executor framework Sympiler [1] for the SpTRSV kernel with an average speedup of 1.9 x.
更多查看译文
关键词
Vectorization,Parallel programming,Polyhedral analysis,Sparse matrix computation
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要