3A-ReRAM: Adaptive Activation Accumulation in ReRAM-Based CNN Accelerator

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2024)

Abstract
ReRAM-based computing excels at accelerating convolutional neural network (CNN) inference thanks to its high computing parallelism, but its rigid crossbar structure can become inefficient in the face of the random data sparsity abundant in CNNs. In this study, we propose 3A-ReRAM, a novel crossbar architecture that dynamically predicts accumulated results to enable adaptive activation accumulation, so that both zero and small values in the feature map can be exploited in each matrix-vector multiplication (MVM) operation for speedup. To predict results dynamically, we propose an efficient parallel predictor that finds larger adapted boxes for increased computing parallelism without hurting accuracy. For better scheduling between the dynamic predictions, we propose an efficient input window management scheme with lightweight hardware support. With dynamic prediction and calculation, the 3A-ReRAM architecture fits the ReRAM crossbar structure naturally while enabling an entirely different way to dynamically exploit the sparsity and small values in feature maps. It greatly improves performance by increasing computing parallelism and saves energy through far fewer analog-to-digital conversions. Evaluation results show that the 3A-ReRAM architecture improves performance by up to 13.03×, 16.31×, 2.46×, and 2.58× compared to the ReRAM-based CNN accelerators ISAAC and PUMA (sparsity-unaware) and SRE and FORMS (sparsity-aware), and reduces total energy by 8.93×, 10.07×, 2.97×, and 4.58×, respectively.
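The payoff described above comes from not driving crossbar rows whose activations are zero or negligibly small, so fewer analog accumulations and analog-to-digital conversions are needed per MVM. The following minimal sketch illustrates that idea in software; the function name, `threshold` parameter, and skipping policy are illustrative assumptions, not the paper's actual predictor or hardware mechanism.

```python
import numpy as np

def sparsity_skipping_mvm(weights, activations, threshold=0.0):
    """Illustrative MVM that skips zero/near-zero input activations.

    On a ReRAM crossbar, each input row drives an analog accumulation;
    skipping rows whose |activation| <= threshold models the work that a
    sparsity-aware design avoids. (Sketch only; thresholding small values
    to exactly zero is an assumption and can change the numerical result.)
    """
    active = np.flatnonzero(np.abs(activations) > threshold)
    # Only the remaining "active" rows contribute to the accumulated result.
    result = activations[active] @ weights[active, :]
    skipped = activations.size - active.size  # crossbar rows not driven
    return result, skipped
```

With `threshold=0.0` only exact zeros are skipped and the result matches a dense MVM; raising the threshold trades a small accuracy loss for more skipped rows, which is the kind of trade-off an adaptive scheme must manage.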
Keywords
Adaptively activated accumulation, convolutional neural network (CNN), dynamic prediction, resistive RAM (ReRAM), sparsity