Redundancy-Reduced MobileNet Acceleration on Reconfigurable Logic for ImageNet Classification.

Jiang Su, Julian Faraone, Junyi Liu, Yiren Zhao, David B. Thomas, Philip Heng Wai Leong, Peter Y. K. Cheung

ARC (2018)

Abstract

Modern Convolutional Neural Networks (CNNs) excel in image classification and recognition applications on large-scale datasets such as ImageNet, compared to many conventional feature-based computer vision algorithms. However, the high computational complexity of CNN models can lead to low system performance in power-efficient applications. In this work, we first highlight two levels of model redundancy that widely exist in modern CNNs. We then use MobileNet as a design example and propose an efficient system design for a Redundancy-Reduced MobileNet (RR-MobileNet) in which off-chip memory traffic is used only for input/output transfer, while parameters and intermediate values are stored in on-chip BRAM blocks. Compared to AlexNet, our RR-MobileNet has 25× fewer parameters and 3.2× fewer operations per image inference, yet achieves 9%/5.2% higher Top-1/Top-5 classification accuracy on the ImageNet classification task. The latency of a single image inference is only 7.85 ms.

Keywords

Pruning, Quantization, CNN, FPGA, Algorithm acceleration
