CNN hardware acceleration on a low-power and low-cost APSoC

2019 Conference on Design and Architectures for Signal and Image Processing (DASIP) (2019)

Abstract
Deep learning, and Convolutional Neural Networks (CNNs) in particular, are currently among the most promising and widely used classes of algorithms in the field of artificial intelligence, and are employed in a wide range of tasks. However, their high computational complexity and storage demands limit their efficient deployment on resource-limited embedded systems and IoT devices. To address this problem, a wide landscape of customized FPGA-based hardware acceleration solutions has been presented in the literature in recent years, focused on combining high performance and power efficiency. Most of them are implemented on mid- to high-range devices including different computing cores, and target compute-intensive models such as AlexNet and VGG16. In this work, we implement a CNN inference accelerator on a compact and cost-optimized device, the MiniZed development board from Avnet, which integrates a single-core Zynq 7Z007S. We measure the execution time and energy consumption of the developed accelerator, and we compare it with a CPU-based software implementation. The results show that the accelerator achieves a frame rate of 13 fps on the end-to-end execution of the ALL-CNN-C model, and 4 fps on DarkNet. Compared with the software implementation, it is 5 times faster, providing up to 10.62 giga operations per second (GOPS) at 80 MHz while consuming 1.08 W of on-chip power.
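The headline figures in the abstract can be combined into a few derived metrics. The following is a minimal sketch, not a calculation from the paper: the function names are hypothetical, the 10.62 GOPS value is a peak ("up to") figure rather than a sustained one, and the energy-per-frame estimate assumes the 1.08 W on-chip power is drawn constantly during inference and ignores off-chip power.

```python
# Figures quoted in the abstract.
PEAK_GOPS = 10.62          # peak throughput, giga operations per second
CLOCK_MHZ = 80.0           # accelerator clock frequency
ON_CHIP_POWER_W = 1.08     # reported on-chip power
FPS = {"ALL-CNN-C": 13.0, "DarkNet": 4.0}  # end-to-end frame rates


def gops_per_watt(gops: float, power_w: float) -> float:
    """Power efficiency: throughput divided by power."""
    return gops / power_w


def ops_per_cycle(gops: float, clock_mhz: float) -> float:
    """Average operations completed per clock cycle at peak throughput."""
    return (gops * 1e9) / (clock_mhz * 1e6)


def joules_per_frame(power_w: float, fps: float) -> float:
    """Energy per frame, ASSUMING constant power draw during inference."""
    return power_w / fps


if __name__ == "__main__":
    print(f"Efficiency: {gops_per_watt(PEAK_GOPS, ON_CHIP_POWER_W):.2f} GOPS/W")
    print(f"Ops per cycle: {ops_per_cycle(PEAK_GOPS, CLOCK_MHZ):.2f}")
    for model, fps in FPS.items():
        energy = joules_per_frame(ON_CHIP_POWER_W, fps)
        print(f"{model}: about {energy:.3f} J per frame")
```

Under these assumptions the peak figures correspond to roughly 9.8 GOPS/W and about 133 operations per clock cycle; the per-frame energy values are upper-level estimates only, since the abstract does not state how energy was measured.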
Key words
convolutional neural networks, FPGAs, hardware accelerators