FLEX-CIM: A Flexible Kernel Size 1-GHz 181.6-TOPS/W 25.63-TOPS/mm$^{2}$ Analog Compute-in-Memory Macro

IEEE Journal of Solid-State Circuits (2024)

Abstract
Compute-in-memory (CIM) is a promising approach for realizing energy-efficient convolutional neural network (CNN) accelerators. Previous CIM works demonstrated high peak energy efficiencies of over 100 TOPS/W using large fabrics of 1000+ channels. However, they typically suffer from low utilization on small CNN layers (e.g., $\sim$9% for ResNet-32), which scales down their average energy efficiency, throughput density, and effective memory size by the utilization rate. In addition, the analog-to-digital converter (ADC) occupies most of their computing time ($\sim$90%), further limiting throughput. This work presents FLEX-CIM, a macro fabricated in 28-nm CMOS featuring: 1) an analog partial sum (APS) circuit to enable a flexible CIM kernel size; 2) an overclocked fast multiply–accumulate (FMA) array to boost throughput; and 3) an adaptive-resolution ADC to enhance both throughput and energy efficiency. The achieved utilization is 99.2% on ResNet-32. Under 4-bit MAC precision, the peak energy efficiency is 181.6 TOPS/W, and the peak throughput density is 25.63 TOPS/mm$^{2}$.
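To make the abstract's utilization argument concrete, the sketch below works through the arithmetic of mapping a small CNN layer onto a fixed-size CIM fabric, and the Amdahl-style throughput gain from shrinking ADC time. The fabric size (1536 channels) and layer shape are hypothetical examples chosen to mirror the abstract's $\sim$9% figure, not values from the paper.

```python
# Illustrative sketch (not from the paper): why fixed-size CIM fabrics lose
# efficiency on small CNN layers. Fabric size and layer shape are hypothetical.

def utilization(active_rows: int, fabric_rows: int) -> float:
    """Fraction of the CIM fabric's input channels used by one layer."""
    return min(active_rows, fabric_rows) / fabric_rows

# An early ResNet-32 layer: 3x3 kernel, 16 input channels -> 144 MAC inputs.
active = 3 * 3 * 16
fabric = 1536  # hypothetical 1000+ channel fabric

u = utilization(active, fabric)
peak_tops_per_w = 100.0  # "over 100 TOPS/W" peak cited for prior CIM works

print(f"utilization: {u:.1%}")                         # ~9.4%, close to ~9%
print(f"effective TOPS/W: {peak_tops_per_w * u:.1f}")  # peak scaled by utilization

# If the ADC occupies ~90% of each MAC cycle, cutting ADC time directly
# raises throughput: halving it gives roughly 1/(0.1 + 0.45) ~ 1.8x speedup.
adc_frac = 0.90
speedup = 1 / ((1 - adc_frac) + adc_frac / 2)
print(f"speedup from halving ADC time: {speedup:.2f}x")  # ~1.82x
```

This is why the abstract reports average metrics alongside peaks: a flexible kernel size keeps utilization near 100% so the effective numbers track the peak ones.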
Keywords
Analog partial sum (APS), compute-in-memory (CIM), convolutional neural network (CNN), flexible kernel size, utilization