NeuroSpector: Systematic Optimization of Dataflow Scheduling in DNN Accelerators

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS (2023)

Abstract
This paper presents NeuroSpector, an optimization framework that systematically analyzes the dataflow of deep neural network (DNN) accelerators and rapidly identifies optimal execution schemes. The proposed methodology is shown to work effectively across a variety of accelerator architectures and DNN workloads. Devising scheduling schemes that maximize the energy efficiency and performance of neural accelerators has been a baffling challenge: hardware specifications combined with multi-dimensional DNN data create an enormous number of possible scheduling options that can be applied to accelerators. Related work has proposed various techniques to address this challenge, including brute-force search over solution spaces pruned by user constraints, solving the objective functions of system models, and learning-based optimization. However, each technique was devised only for a specific accelerator model, and we find that they do not adapt well to different accelerators and DNN workloads: they produce hit-or-miss results with 100.1% greater energy and cycles on average than the optimal scheduling schemes obtained from fully comprehensive brute-force searches. In contrast, NeuroSpector identifies efficient execution schemes for various accelerators and workloads within only 1.5% of the optimal scheduling solutions on average. The optimization strategy of NeuroSpector is based on the observation that optimal executions correlate strongly with minimizing data movement to the lower levels of the accelerator's memory hierarchy, rather than with maximizing the utilization of processing elements. NeuroSpector therefore prioritizes optimizing the lower-level components of the accelerator hierarchy, which proves highly effective across various accelerators and DNN workloads.
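To make the scheduling-space problem concrete, the following is a minimal sketch, not NeuroSpector's actual algorithm: a brute-force search over tiling factors for a tiled matrix multiplication, scoring each candidate by the data volume moved into a lower-level on-chip buffer. The matmul workload, the reuse-based traffic model, and the buffer capacity are all illustrative assumptions; they merely show why the search space explodes and why minimizing lower-level data movement is a natural objective.

```python
from itertools import product


def divisors(n):
    """All tile sizes that evenly divide a loop bound n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def dram_traffic(M, N, K, Tm, Tn, Tk):
    """Words moved from off-chip memory into the on-chip buffer for a
    tiled matmul C[M,N] += A[M,K] @ B[K,N], assuming output tiles are
    accumulated on chip (a simplistic, illustrative reuse model)."""
    a = M * K * (N // Tn)  # each A tile is re-fetched once per N-tile
    b = K * N * (M // Tm)  # each B tile is re-fetched once per M-tile
    c = M * N              # each C tile is written back once
    return a + b + c


def best_tiling(M, N, K, buffer_words):
    """Exhaustively search tiling factors, keeping the one that
    minimizes traffic to the lower-level (on-chip) buffer."""
    best = None
    for Tm, Tn, Tk in product(divisors(M), divisors(N), divisors(K)):
        # Tiles of A, B, and C must fit in the buffer together.
        if Tm * Tk + Tk * Tn + Tm * Tn > buffer_words:
            continue
        cost = dram_traffic(M, N, K, Tm, Tn, Tk)
        if best is None or cost < best[0]:
            best = (cost, (Tm, Tn, Tk))
    return best


print(best_tiling(64, 64, 64, buffer_words=2048))
```

Even this toy version enumerates every legal tiling triple; a real accelerator adds loop ordering, spatial unrolling, and multiple memory levels per layer, which is why exhaustive search quickly becomes intractable and heuristics that target lower-level data movement are attractive.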
Keywords
Optimal scheduling, System-on-chip, Neural networks, Scheduling, Random access memory, Processor scheduling, Correlation, Deep neural network, accelerator, dataflow, mapping, optimization