DNNOPT: A Framework for Efficiently Selecting On-chip Memory Loop Optimizations of DNN Accelerators.

ACM International Conference on Computing Frontiers (2024)

Abstract
Deep neural network (DNN) accelerators suffer from poor utilization of on-chip memory, which can reduce performance and energy efficiency. Loop reordering and blocking are used to improve on-chip memory utilization in DNN accelerators. However, existing optimization frameworks are inefficient, either because of the prohibitive time complexity of searching the entire design space or because of a sub-optimal choice of optimizations. This paper proposes DNNOPT, a hardware/software framework for optimally selecting loop orders and blocking factors, applying loop reordering and blocking in isolation or in combination. DNNOPT uses the proposed Early-exit and Strided-search strategies to prune the search space, together with simple analytical models of data reuse to evaluate each optimization point efficiently and accurately. Overall, DNNOPT reduces the search space by more than two orders of magnitude and improves performance, energy efficiency, and time to solution of convolutional neural network (CNN) and Transformer applications, on average, by 1.8×, 50%, and 226×, respectively, compared to state-of-the-art frameworks.
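The abstract describes the approach only at a high level; the sketch below is a rough illustration, not the paper's algorithm, of what a blocking-factor search with a strided sampling step and an early exit on buffer overflow could look like for a single convolution layer. The layer shape, buffer capacity, and the analytical traffic model used here are placeholder assumptions.

```python
# Minimal sketch (assumptions, not the authors' code): enumerate tile sizes
# for one convolution layer with a stride (Strided search) and stop the
# innermost enumeration as soon as a tile overflows the on-chip buffer
# (Early exit). A cheap analytical model estimates off-chip traffic per tiling.

from math import ceil

# Hypothetical convolution layer and accelerator parameters (assumptions).
OUT_H, OUT_W = 56, 56          # output feature-map height and width
IN_C, OUT_C = 64, 64           # input and output channels
KERNEL = 3                     # kernel height/width
BYTES_PER_ELEM = 2             # 16-bit operands
ON_CHIP_BYTES = 128 * 1024     # on-chip buffer capacity
SEARCH_STRIDE = 4              # Strided search: sample every 4th factor


def footprint_bytes(th, tw, tc, tk):
    """On-chip bytes needed to hold one input, weight, and output tile."""
    inputs = (th + KERNEL - 1) * (tw + KERNEL - 1) * tc
    weights = KERNEL * KERNEL * tc * tk
    outputs = th * tw * tk
    return BYTES_PER_ELEM * (inputs + weights + outputs)


def dram_traffic(th, tw, tc, tk):
    """Placeholder reuse model: total off-chip bytes moved for this tiling.
    Fewer, larger tiles reuse more data on chip and move fewer bytes."""
    n_tiles = (ceil(OUT_H / th) * ceil(OUT_W / tw) *
               ceil(IN_C / tc) * ceil(OUT_C / tk))
    return n_tiles * footprint_bytes(th, tw, tc, tk)


best_cost, best_tiling = float("inf"), None
for th in range(1, OUT_H + 1, SEARCH_STRIDE):
    for tw in range(1, OUT_W + 1, SEARCH_STRIDE):
        for tc in range(1, IN_C + 1, SEARCH_STRIDE):
            for tk in range(1, OUT_C + 1, SEARCH_STRIDE):
                # Early exit: the footprint grows monotonically with tk, so
                # once it overflows the buffer, no larger tk can fit either.
                if footprint_bytes(th, tw, tc, tk) > ON_CHIP_BYTES:
                    break
                cost = dram_traffic(th, tw, tc, tk)
                if cost < best_cost:
                    best_cost, best_tiling = cost, (th, tw, tc, tk)

print("best tiling (Th, Tw, Tc, Tk):", best_tiling)
print("estimated DRAM traffic (bytes):", best_cost)
```

Sampling only every fourth blocking factor and breaking out of the inner loop on buffer overflow are simple ways to cut the number of evaluated points by orders of magnitude relative to exhaustive enumeration, in the same spirit as the search-space reduction the abstract attributes to DNNOPT.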