FLEX-CIM: A Flexible Kernel Size 1-GHz 181.6-TOPS/W 25.63-TOPS/mm$^{2}$ Analog Compute-in-Memory Macro

IEEE Journal of Solid-State Circuits (2024)

Abstract
Compute-in-memory (CIM) is a promising approach for realizing energy-efficient convolutional neural network (CNN) accelerators. Previous CIM works demonstrated high peak energy efficiencies of over 100 TOPS/W using large fabrics of 1000+ channels. However, they typically suffer from low utilization on small CNN layers (e.g., $\sim$9% for ResNet-32), which scales down their average energy efficiency, throughput density, and effective memory size by the utilization rate. In addition, the analog-to-digital converter (ADC) occupies most of their computing time ($\sim$90%), further limiting throughput. This work presents FLEX-CIM, a macro fabricated in 28-nm CMOS featuring: 1) an analog partial sum (APS) circuit to enable a flexible CIM kernel size; 2) an overclocked fast multiply–accumulate (FMA) array to boost throughput; and 3) an adaptive-resolution ADC to enhance both throughput and energy efficiency. The achieved utilization is 99.2% on ResNet-32. Under 4-bit MAC precision, the peak energy efficiency is 181.6 TOPS/W, and the peak throughput density is 25.63 TOPS/mm$^{2}$.
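To make the abstract's utilization argument concrete, the sketch below works through the arithmetic of mapping a small CNN layer onto a fixed-size CIM fabric, and the Amdahl-style throughput gain from shrinking ADC time. The fabric size (1536 channels) and layer shape are hypothetical examples chosen to mirror the abstract's $\sim$9% figure, not values from the paper.

```python
# Illustrative sketch (not from the paper): why fixed-size CIM fabrics lose
# efficiency on small CNN layers. Fabric size and layer shape are hypothetical.

def utilization(active_rows: int, fabric_rows: int) -> float:
    """Fraction of the CIM fabric's input channels used by one layer."""
    return min(active_rows, fabric_rows) / fabric_rows

# An early ResNet-32 layer: 3x3 kernel, 16 input channels -> 144 MAC inputs.
active = 3 * 3 * 16
fabric = 1536  # hypothetical 1000+ channel fabric

u = utilization(active, fabric)
peak_tops_per_w = 100.0  # "over 100 TOPS/W" peak cited for prior CIM works

print(f"utilization: {u:.1%}")                         # ~9.4%, close to ~9%
print(f"effective TOPS/W: {peak_tops_per_w * u:.1f}")  # peak scaled by utilization

# If the ADC occupies ~90% of each MAC cycle, cutting ADC time directly
# raises throughput: halving it gives roughly 1/(0.1 + 0.45) ~ 1.8x speedup.
adc_frac = 0.90
speedup = 1 / ((1 - adc_frac) + adc_frac / 2)
print(f"speedup from halving ADC time: {speedup:.2f}x")  # ~1.82x
```

This is why the abstract reports average metrics alongside peaks: a flexible kernel size keeps utilization near 100% so the effective numbers track the peak ones.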
Keywords
Analog partial sum (APS), compute-in-memory (CIM), convolutional neural network (CNN), flexible kernel size, utilization