Timing Error Tolerant CNN Accelerator With Layer-Wise Approximate Multiplication
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2024)
Abstract
By exploiting the error tolerance of computation, approximate circuits have emerged as a computing paradigm for increasing the energy efficiency of digital systems, which is crucial in high-performance, low-power systems for edge Internet-of-Things (EIoT) devices. Inspired by state-of-the-art high-efficiency NN accelerators, three techniques are proposed for effectively integrating an approximate computing unit into a CNN accelerator to achieve a dynamic energy-accuracy trade-off: (1) An approximate multiplier that can be configured to three precision modes is proposed, with a weight pre-encoding method used to reduce hardware overhead. (2) For hybrid-accuracy layer-wise mapping, a Hessian-aware layer-wise accuracy scaling method is proposed that considers inference accuracy and hardware overhead simultaneously. A progressive re-training approach is proposed to enable a more aggressive approximation configuration and a higher power reduction. (3) A tensor multiplication unit (TMU) with a timing error detection and correction (TEDC) approach is proposed, enabling aggressive voltage scaling and a 41.5% power reduction. The resulting energy-efficient CNN accelerator shows how deep learning can be brought to EIoT devices by running each layer at its appropriate computational accuracy. Implemented in 28-nm CMOS technology, the CNN accelerator achieves an energy efficiency of 14.4 TOPS/W. The proposed accelerator and method are evaluated on keyword spotting with GSCD and on CIFAR10 and CIFAR100; 44.5%~46.7% of the multiplication energy is saved while accuracy drops by less than 0.6%.
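The idea of running each layer at its own multiplier precision can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the three modes are modeled as simple truncation of low-order weight bits, the per-layer sensitivity scores stand in for a Hessian-based metric, and the names `approx_mul`, `assign_modes`, and the two thresholds are hypothetical.

```python
from typing import Dict

# Hypothetical sketch: NOT the paper's multiplier design or sensitivity metric.
# Assumed modes: number of low-order weight bits zeroed before multiplying.
MODE_TRUNC_BITS = {"high": 0, "mid": 2, "low": 4}


def approx_mul(activation: int, weight: int, mode: str) -> int:
    """Multiply after zeroing the low-order bits of the weight magnitude."""
    bits = MODE_TRUNC_BITS[mode]
    sign = -1 if weight < 0 else 1
    truncated = (abs(weight) >> bits) << bits  # drop `bits` low-order bits
    return activation * sign * truncated


def assign_modes(sensitivity: Dict[str, float],
                 mid_thresh: float,
                 low_thresh: float) -> Dict[str, str]:
    """Map each layer to a mode; less sensitive layers get coarser precision.

    `sensitivity` is an assumed per-layer score (higher = more sensitive).
    """
    modes = {}
    for layer, score in sensitivity.items():
        if score < low_thresh:
            modes[layer] = "low"
        elif score < mid_thresh:
            modes[layer] = "mid"
        else:
            modes[layer] = "high"
    return modes


# Usage with made-up layer names and scores.
layer_modes = assign_modes({"conv1": 0.9, "conv2": 0.3, "fc": 0.05},
                           mid_thresh=0.5, low_thresh=0.1)
```

In this sketch, a threshold rule is the simplest possible mapping; the abstract indicates that the actual method also weighs hardware overhead and uses re-training to recover accuracy, neither of which is modeled here.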
Key words
Convolutional neural network, configurable approximate multiplier, layer-wise accuracy scaling and recovery, timing error detection and correction, hardware accelerator