An RRAM-Based 40.6 TOPS/W Energy-Efficient AI Inference Accelerator with Quad Neuromorphic-Processor-Unit for Highly Contrast Recognition

Y. L. Lin, Y. R. Liu, T. C. Kao, M. Y. Lee, J. C. Guo, T.-H. Hou, Steve S. Chung

2024 International VLSI Symposium on Technology, Systems and Applications (VLSI TSA) (2024)

Abstract
We present a non-volatile edge deep neural network accelerator that combines a resistive-gate FinFET (RG-FinFET) memory with a parallel processor for edge AI inference. The RG-FinFET has the potential for 8-level operation. In the system, data storage and multiplication are carried out in the RG-FinFET array, and all other operations are performed in a 4-core neuromorphic processing unit (NPU). Quantization error is introduced into the training stage through an ex-situ quantized training method, so the accuracy still reaches 97.24% and 80.18% on the MNIST and CIFAR-10 datasets, respectively, while the parameter capacity is nearly 8x smaller. The system achieves a computation efficiency of 40.6 TOPS/W, which makes it well suited to end-to-end integer-only AI inference hardware based on computing-in-memory (CIM).
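As a rough illustration of what 8-level storage implies for a weight, the sketch below maps floating-point values onto eight evenly spaced levels and measures the resulting quantization error. It is a minimal example under stated assumptions: the uniform spacing, the function name `quantize_to_levels`, and the sample weights are all hypothetical and are not taken from the paper, whose quantized training method is not described in the abstract.

```python
def quantize_to_levels(weights, num_levels=8):
    """Map each weight to the nearest of num_levels evenly spaced values.

    Assumption: uniform levels spanning [min, max] of the input list.
    Returns the integer level codes and the dequantized approximations.
    """
    w_min, w_max = min(weights), max(weights)
    if w_max == w_min:
        # Degenerate case: every weight is identical, so one level suffices.
        return [0] * len(weights), [w_min] * len(weights)
    step = (w_max - w_min) / (num_levels - 1)
    codes = []
    for w in weights:
        code = round((w - w_min) / step)
        # Clamp to guard against floating-point overshoot at the edges.
        codes.append(min(max(code, 0), num_levels - 1))
    dequantized = [w_min + c * step for c in codes]
    return codes, dequantized


# Hypothetical sample weights, chosen only for illustration.
weights = [-1.0, -0.5, 0.1, 0.3, 1.0]
codes, approx = quantize_to_levels(weights)

# Worst-case quantization error; bounded by half of one level step.
max_error = max(abs(w - a) for w, a in zip(weights, approx))
print(codes, max_error)
```

Each code fits in 3 bits, since eight levels need log2(8) = 3 bits. Injecting this kind of rounding error during training, as the abstract describes, lets the network adapt to the limited number of levels before deployment.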
Key words
Inference Acceleration, AI Inference, Neural Network, Quantization Error, MNIST Dataset, Image Recognition, Floating-point Operations, Edge Devices, Inference Stage, Read Operation, Memory Array, Sign Bit