Automatic Memory-Efficient Scheduling of CNNs

Embedded Computer Systems: Architectures, Modeling, and Simulation, SAMOS 2019 (2019)

Abstract
Accessing large external DRAM is costly and makes it challenging to evaluate data-intensive convolutional neural networks (CNNs) efficiently on embedded devices. These external memory accesses can be minimized by exploiting data reuse in on-chip memory. Selecting the combination of code transformations that minimizes external DRAM accesses is, however, an extremely complex task. In this work, a mathematical model is presented that quickly and very precisely evaluates combinations of code transformations on CNNs. An accompanying open-source tool leverages this model to perform automated design space exploration and code generation for CNNs. The correctness of the model is demonstrated by measurements on seven neural networks. Results show that the transformations selected by the tool can reduce external memory accesses by over an order of magnitude.
Keywords
Memory efficient, Reuse, Scheduling, CNN