The Sparsity Roofline: Understanding the Hardware Limits of Sparse Neural Networks

Cameron Shinn, Collin McCarthy,Saurav Muralidharan,Muhammad Osama,John D. Owens

CoRR(2023)

引用 0|浏览17
暂无评分
摘要
We introduce the Sparsity Roofline, a visual performance model for evaluating sparsity in neural networks. The Sparsity Roofline jointly models network accuracy, sparsity, and predicted inference speedup. Our approach does not require implementing and benchmarking optimized kernels, and the predicted speedup is equal to what would be measured when the corresponding dense and sparse kernels are equally well-optimized. We achieve this through a novel analytical model for predicting sparse network performance, and validate the predicted speedup using several real-world computer vision architectures pruned across a range of sparsity patterns and degrees. We demonstrate the utility and ease-of-use of our model through two case studies: (1) we show how machine learning researchers can predict the performance of unimplemented or unoptimized block-structured sparsity patterns, and (2) we show how hardware designers can predict the performance implications of new sparsity patterns and sparse data formats in hardware. In both scenarios, the Sparsity Roofline helps performance experts identify sparsity regimes with the highest performance potential.
更多
查看译文
关键词
sparse,sparsity roofline,neural networks,hardware limits
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要