A 16.38 TOPS and 4.55 POPS/W SRAM Computing-in-Memory Macro for Signed Operands Computation and Batch Normalization Implementation

IEEE Transactions on Circuits and Systems I: Regular Papers (2024)

Abstract
Edge artificial intelligence applications impose rigorous demands on local hardware for throughput and energy efficiency. Computing-in-memory (CIM) architectures provide highly parallel and energy-efficient solutions for accelerating the multiply-and-accumulate (MAC) operations in neural networks (NNs). While SRAM-based charge-domain CIM achieves energy efficiencies of thousands of TOPS/W, it encounters limitations in full NN model deployments where both activations and weights are signed. This paper proposes an SRAM-based signed batch normalization (BN) CIM macro that supports efficient bitwise sparse MAC computation with signed operands, along with BN operations, in deep neural networks. The key features of this macro include: 1) a multibit weight unit that optimizes bitstream sparsity and handles sign-bit computation, 2) a 2b-serial input configuration that increases throughput and amortizes ADC energy, and 3) a quantization-hardware co-design for the BN implementation. Measurement results show that the proposed 28 nm 64 Kb CIM macro achieves 16.38 TOPS throughput and 4.55 POPS/W energy efficiency, both normalized to 1b operands. With the co-designed BN implementation at signed 8b activation and weight precision, the macro attains 92% test accuracy on CIFAR-10 using the ResNet18 model.
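The bit-serial signed MAC scheme the abstract outlines can be illustrated in software. The sketch below is a behavioral emulation, not the paper's circuit: it feeds signed two's-complement activations to the array 2 bits per cycle (LSB first), accumulates shifted partial sums as a shift-and-add pipeline would, and corrects for the negative weight of the sign bit at the end. The function name, slicing order, and bit widths are illustrative assumptions.

```python
def mac_2b_serial(acts, weights, act_bits=8):
    """Signed dot product computed with 2b-serial activation slices.

    Emulates a charge-domain CIM accumulation: each cycle multiplies a 2b
    input slice by the full signed weights, then shift-and-adds the result.
    """
    mask = (1 << act_bits) - 1
    # Two's-complement representation of each signed activation.
    ua = [a & mask for a in acts]

    total = 0
    for cycle in range(act_bits // 2):
        # One 2b input slice per activation, multiplied against all weights.
        partial = sum(((a >> (2 * cycle)) & 0b11) * w
                      for a, w in zip(ua, weights))
        total += partial << (2 * cycle)  # shift-and-add accumulation

    # The MSB of a two's-complement number carries weight -2^(act_bits-1),
    # not +2^(act_bits-1); subtract twice its positive contribution.
    sign_corr = sum(((a >> (act_bits - 1)) & 1) * w
                    for a, w in zip(ua, weights))
    return total - (sign_corr << act_bits)


# Matches a direct signed dot product:
print(mac_2b_serial([-3, 5], [7, -2]))  # → -31
```

With 8b activations this takes four serial cycles instead of eight, which is the throughput and ADC-amortization benefit the abstract attributes to the 2b-serial input configuration.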
Keywords
Computing-in-memory, SRAM, neural network, signed operation, batch normalization