Edge-Side Fine-Grained Sparse CNN Accelerator With Efficient Dynamic Pruning Scheme

IEEE Transactions on Circuits and Systems I: Regular Papers (2024)

Abstract
With the rapid development of the Internet of Things (IoT), providing real-time, high-performance services for edge-side applications and bringing intelligence to massive numbers of edge devices have become common concerns of academia and industry. Due to the limited storage, area, and power budgets of edge devices, existing convolutional neural networks (CNNs), with their large parameter counts and heavy computation, are difficult to deploy on such platforms. Network pruning can effectively alleviate the excessive parameter and computation costs of CNNs. However, fine-grained pruning is not hardware friendly, while structured pruning schemes incur a much higher accuracy loss at the same compression ratio. This paper presents a model compression strategy comprising an efficient fine-grained pruning scheme, a dynamic pruning-and-training method, and a weight-importance judgment method. With this strategy, sparse VGG16 and ResNet50 models can be trained from scratch, achieving a 16x compression ratio with only 1/32 indexing overhead. Furthermore, a lightweight, high-performance sparse CNN accelerator based on a modified systolic array is proposed. Implementing VGG16 and ResNet50 on the proposed accelerator, experimental results show that, compared with state-of-the-art designs, it achieves 8.13 frames per second (FPS) with 2.17x better power efficiency and up to 4.14x better computation density.
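The paper's specific importance criterion and dynamic pruning-and-training loop are not detailed in the abstract. As a rough illustration of the general idea behind fine-grained (unstructured) pruning, the sketch below zeroes out the smallest-magnitude weights of a layer to hit a target sparsity; the function name, the magnitude criterion, and the 15/16 sparsity target (matching a 16x reduction in nonzero weights) are assumptions for illustration, not the paper's actual scheme.

```python
# Minimal sketch of fine-grained magnitude pruning (assumed criterion,
# not the paper's actual weight-importance judgment method).
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with smallest magnitude.

    Returns the pruned weight tensor and the boolean keep-mask.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to prune
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    # k-th smallest magnitude becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask, mask

# Example: prune a 64x64 layer to 15/16 sparsity, i.e. keep 1/16 of the
# weights, corresponding to a 16x reduction in nonzero parameters.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
pruned, mask = magnitude_prune(w, sparsity=15 / 16)
print(f"fraction of weights kept: {mask.mean():.4f}")
```

In a dynamic pruning-and-training setup, a step like this would typically be interleaved with further training so the surviving weights can recover the accuracy lost to pruning.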
Key words
Edge-side hardware acceleration, convolutional neural networks, fine-grained sparsity, field-programmable gate array (FPGA)