Deep neural networks compiler for a trace-based accelerator (short WIP paper).

LCTES 2018

Abstract
Deep Neural Networks (DNNs) are the algorithm of choice for image processing applications. DNNs present highly parallel workloads, which has led to the emergence of custom hardware accelerators. Because Deep Learning (DL) models specialized for different tasks vary widely, they require programmable custom hardware together with a compiler/mapper that translates each DNN into an efficient dataflow on the accelerator. This paper presents a compiler for running DNNs on Snowflake, a programmable hardware accelerator targeting DNNs. The compiler correctly generates instructions for several DL models: AlexNet, VGG, ResNet, and LightCNN9. Snowflake was implemented on an FPGA with a varying number of processing units to measure how the compiler and the accelerator perform as the design scales up. The system achieves 70 frames/s and 4.5 GB/s of off-chip memory bandwidth for AlexNet without linear layers on Xilinx's Zynq-SoC XC7Z045 FPGA.
Keywords
Compiler, DNN, accelerator
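
The abstract does not describe Snowflake's instruction set or the compiler's internals. As a rough illustration of what "translating a DNN into a dataflow on the accelerator" can involve, the sketch below lowers a toy convolution-layer description into a tiled load/compute/store schedule. All names here (ConvLayer, Instr, lower_conv, the tile sizes) are hypothetical and are not taken from the paper or from Snowflake's actual ISA.

```python
# Hypothetical sketch: lowering a conv-layer description into a tiled
# instruction schedule for an abstract accelerator. None of these names
# come from the Snowflake compiler; they only illustrate the idea of a
# DNN-to-dataflow mapping pass.
from dataclasses import dataclass


@dataclass
class ConvLayer:
    in_ch: int
    out_ch: int
    height: int
    width: int
    kernel: int


@dataclass
class Instr:
    op: str      # "LOAD_W", "LOAD_ACT", "MAC", or "STORE"
    args: tuple


def lower_conv(layer: ConvLayer, pe_rows: int = 4, tile: int = 16) -> list[Instr]:
    """Emit a naive load/compute/store schedule, tiling output channels
    across processing elements (pe_rows) and rows of the feature map."""
    prog: list[Instr] = []
    for oc in range(0, layer.out_ch, pe_rows):
        # Load one block of filters, then stream activation tiles through it.
        prog.append(Instr("LOAD_W", (oc, min(pe_rows, layer.out_ch - oc))))
        for row in range(0, layer.height, tile):
            rows = min(tile, layer.height - row)
            prog.append(Instr("LOAD_ACT", (row, rows)))
            prog.append(Instr("MAC", (oc, row, rows, layer.kernel)))
            prog.append(Instr("STORE", (oc, row, rows)))
    return prog


if __name__ == "__main__":
    # First conv layer of an AlexNet-like model (sizes approximate).
    prog = lower_conv(ConvLayer(in_ch=3, out_ch=64, height=55, width=55, kernel=11))
    print(len(prog), "instructions;", prog[:4])
```

A real compiler of this kind would additionally choose tile sizes from on-chip buffer capacity and reorder loads to hide off-chip memory latency, which is presumably where the reported 4.5 GB/s bandwidth figure becomes the limiting factor.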