Layer-Wise Mixed-Modes CNN Processing Architecture With Double-Stationary Dataflow and Dimension-Reshape Strategy

Bo Liu, Xinxiang Huang, Yang Zhang, Guang Yang, Han Yan, Chen Zhang, Zejv Li, Yuanhao Wang, Hao Cai

IEEE Transactions on Circuits and Systems I: Regular Papers (2024)

Abstract
As convolutional neural networks (CNNs) spread across application domains, growing network complexity and computational load have become a central concern in neural network deployment. The key challenge for neural network accelerators is balancing computational accuracy against energy efficiency. This paper proposes a software-hardware co-design that strikes this balance for CNN edge applications. On the hardware side, a 3-dimensional tensor engine (3D-TE), built from reconfigurable Tensor Processing Units (TPUs), is introduced for efficient convolution computation. The CNN dataflow on the 3D-TE is optimized with a dimension-reshaping method that rearranges feature maps and a double-stationary dataflow schedule that reduces memory access. The TPU architecture adopts a configurable approximate multiplier designed with Boolean Matrix Factorization (BMF)-based logic synthesis. With its configurable precision, the 3D-TE lets the TPUs dynamically adapt the bitwidth of features and weights to varying precision requirements. On the software side, a Hessian-guided layer precision mapping reduces unnecessary computational overhead, and a progressive re-training approach enables a better approximation configuration and greater power reduction. Fabricated in 28-nm CMOS, this work achieves an optimized energy efficiency of 14.9 TOPS/W for ResNet56 and 12.1 TOPS/W for MobileNetV2 at a 0.6 V supply voltage and 150 MHz clock frequency, an improvement of $1.33\times$ to $8.28\times$ over state-of-the-art works.
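The abstract does not spell out how the Hessian-guided layer precision mapping works, so the following is only a minimal sketch of the general idea: layers that are more sensitive to perturbation keep a wider bitwidth, and less sensitive layers get a narrower one. The function name `map_layer_precision`, the candidate bitwidths (4, 6, 8), the equal-sized rank bands, and the per-layer sensitivity scores are all assumptions made for illustration, not details taken from the paper.

```python
from typing import Dict, Sequence


def map_layer_precision(sensitivity: Dict[str, float],
                        bitwidths: Sequence[int] = (4, 6, 8)) -> Dict[str, int]:
    """Assign a bitwidth to each layer according to its sensitivity rank.

    `sensitivity` holds one score per layer (for example, an estimate of the
    Hessian trace; how the score is obtained is outside this sketch).
    `bitwidths` is a hypothetical set of precisions the hardware supports.
    """
    levels = sorted(bitwidths)
    # Rank layers from least to most sensitive.
    ranked = sorted(sensitivity, key=sensitivity.get)
    n = len(ranked)
    mapping: Dict[str, int] = {}
    for rank, layer in enumerate(ranked):
        # Split the ranking into len(levels) roughly equal bands;
        # a higher band means a more sensitive layer and a wider bitwidth.
        band = min(rank * len(levels) // n, len(levels) - 1)
        mapping[layer] = levels[band]
    return mapping


# Hypothetical sensitivity scores for three layers.
scores = {"conv1": 9.0, "conv2": 0.5, "conv3": 3.0}
print(map_layer_precision(scores))
```

A rank-based split is used here only because it is simple to read; a real mapping would more likely trade sensitivity against a hardware cost or accuracy budget.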
Key words
Approximate computing, convolutional neural network (CNN), double stationary dataflow, dimension reshape strategy, mixed-modes accelerator