A Precision-Optimized Fixed-Point Near-Memory Digital Processing Unit for Analog In-Memory Computing
CoRR(2024)
摘要
Analog In-Memory Computing (AIMC) is an emerging technology for fast and
energy-efficient Deep Learning (DL) inference. However, a certain amount of
digital post-processing is required to deal with circuit mismatches and
non-idealities associated with the memory devices. Efficient near-memory
digital logic is critical to retain the high area/energy efficiency and low
latency of AIMC. Existing systems adopt Floating Point 16 (FP16) arithmetic
with limited parallelization capability and high latency. To overcome these
limitations, we propose a Near-Memory digital Processing Unit (NMPU) based on
fixed-point arithmetic. It achieves competitive accuracy and higher computing
throughput than previous approaches while minimizing the area overhead.
Moreover, the NMPU supports standard DL activation steps, such as ReLU and
Batch Normalization. We perform a physical implementation of the NMPU design in
a 14 nm CMOS technology and provide detailed performance, power, and area
assessments. We validate the efficacy of the NMPU by using data from an AIMC
chip and demonstrate that a simulated AIMC system with the proposed NMPU
outperforms existing FP16-based implementations, providing 139×
speed-up, 7.8× smaller area, and a competitive power consumption.
Additionally, our approach achieves an inference accuracy of 86.65
with an accuracy drop of just 0.12
benchmarked with ResNet9/ResNet32 networks trained on the CIFAR10/CIFAR100
datasets, respectively.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要