Timing Error Tolerant CNN Accelerator With Layer-Wise Approximate Multiplication

Bo Liu, Na Xie, Qingwen Wei, Guang Yang, Chonghang Xie, Weiqiang Liu, Hao Cai

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2024)

Abstract
By exploiting the error tolerance of computation, approximate circuits have become an emerging computing paradigm for increasing the energy efficiency of digital systems, which is crucial in high-performance and low-power systems for edge Internet-of-Things (EIoT) devices. Inspired by state-of-the-art high-efficiency neural network (NN) accelerators, three techniques are proposed for effectively integrating an approximate computing unit into a CNN accelerator to achieve a dynamic energy-accuracy trade-off: (1) An approximate multiplier that can be configured to three precision modes is proposed. A weight pre-encoding method is used to reduce hardware overhead. (2) For hybrid-accuracy layer-wise mapping, Hessian-aware layer-wise accuracy scaling is proposed, which considers inference accuracy and hardware overhead simultaneously. A progressive re-training approach is proposed to enable a more aggressive approximation configuration and a higher power reduction. (3) A tensor multiplication unit (TMU) with a timing error detection and correction (TEDC) approach is proposed, enabling aggressive voltage scaling and a 41.5% power reduction. The resulting energy-efficient CNN accelerator shows how deep learning can be brought to EIoT devices by running each layer at its appropriate computational accuracy. Implemented in 28-nm CMOS technology, the CNN accelerator achieves an energy efficiency of 14.4 TOPS/W. The proposed accelerator and method are evaluated on keyword spotting (GSCD) and on CIFAR10 and CIFAR100: 44.5% to 46.7% of the multiplication energy is saved while accuracy is reduced by less than 0.6%.
Keywords
Convolutional neural network, configurable approximate multiplier, layer-wise accuracy scaling and recovery, timing error detection and correction, hardware accelerator
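
The idea of running each layer at its own multiplication precision can be illustrated with a small software model. The sketch below is not the paper's multiplier design, which the abstract does not specify. It assumes, purely for illustration, that the three precision modes differ in how many low-order weight bits are zeroed before multiplying, and that each layer already has a scalar sensitivity score (for example, one derived from Hessian information). The names `MODE_TRUNC_BITS`, `approx_mul`, `approx_dot`, `assign_modes`, and the threshold values are all hypothetical.

```python
# Illustrative model only: the truncation scheme, mode names, and thresholds
# are assumptions, not the multiplier or mapping method from the paper.

# Hypothetical mode table: number of low-order weight bits zeroed per mode.
MODE_TRUNC_BITS = {"high": 0, "medium": 2, "low": 4}


def approx_mul(a: int, w: int, mode: str) -> int:
    """Multiply a by w after zeroing the low-order bits of |w| chosen by mode."""
    k = MODE_TRUNC_BITS[mode]
    sign = -1 if w < 0 else 1
    # Truncating the weight magnitude could be done offline, once per weight.
    w_trunc = (abs(w) >> k) << k
    return a * sign * w_trunc


def approx_dot(acts, weights, mode):
    """Dot product in which every multiplication uses the given precision mode."""
    return sum(approx_mul(a, w, mode) for a, w in zip(acts, weights))


def assign_modes(sensitivities, thresholds=(0.1, 1.0)):
    """Map per-layer sensitivity scores to modes.

    Layers with a score at or above the upper threshold keep full precision;
    layers below the lower threshold get the coarsest mode.
    """
    low_t, high_t = thresholds
    modes = []
    for s in sensitivities:
        if s >= high_t:
            modes.append("high")
        elif s >= low_t:
            modes.append("medium")
        else:
            modes.append("low")
    return modes


if __name__ == "__main__":
    layer_modes = assign_modes([2.0, 0.5, 0.01])
    print(layer_modes)
    print(approx_dot([1, 2], [5, 7], layer_modes[1]))
```

In this toy model the accuracy cost of a mode shows up as the difference between `approx_dot` and the exact dot product. A real design would also weigh the hardware cost of each mode when choosing the per-layer mapping, as the abstract describes.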