An Emerging NVM CIM Accelerator With Shared-Path Transpose Read and Bit-Interleaving Weight Storage for Efficient On-Chip Training in Edge Devices

IEEE Transactions on Circuits and Systems II: Express Briefs (2023)

Abstract
Computing-in-memory (CIM) improves the energy efficiency of computing by reducing data movement. In edge devices, CIM accelerators need to support lightweight on-chip training to adapt the model to environmental changes and to ensure edge data security. However, most previous CIM accelerators for edge devices only perform inference, with training carried out in the cloud, because supporting on-chip training incurs considerable area cost and serious performance degradation. In this brief, a CIM accelerator based on emerging nonvolatile memory (NVM) is presented with shared-path transpose read and bit-interleaving weight storage for efficient on-chip training in edge devices. The shared-path transpose read employs a new biasing scheme to eliminate the influence of the body effect on the transpose read, improving both read margin and speed. The bit-interleaving weight storage splits the multi-bit weights into individual bits that are stored in the array alternately, remarkably speeding up the training computation. For 8-bit inputs and weights, evaluation in a 28nm process shows that the proposed accelerator achieves ~3.34/3.06 TOPS/W energy efficiency for feed-forward/back-propagation, 4.6X lower computing latency, and at least 20% smaller chip area compared to the baseline design.
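To make the bit-interleaving idea concrete, the following is a minimal sketch of how 8-bit weights might be split into individual bits and laid out alternately in a memory array. The function names and the specific interleaving order (plane-major, alternating between weights within each bit position) are assumptions for illustration; the brief itself does not specify the exact layout.

```python
def split_into_bitplanes(weights, bits=8):
    # MSB-first bit-planes: planes[0] holds the top bit of every weight,
    # planes[-1] holds the least significant bit of every weight.
    return [[(w >> b) & 1 for w in weights] for b in range(bits - 1, -1, -1)]

def bit_interleaved_layout(weights, bits=8):
    # Flatten plane by plane, so consecutive array cells alternate
    # between different weights at the same bit position
    # (one assumed interleaving order, for illustration only).
    planes = split_into_bitplanes(weights, bits)
    return [bit for plane in planes for bit in plane]

weights = [0b10110001, 0b01001110]
layout = bit_interleaved_layout(weights)  # 16 single-bit cells
```

Storing single bits rather than whole multi-bit weights lets each bit position be read and accumulated independently, which is what allows the training computation to be parallelized across bit positions.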
Keywords
Memristor, computing-in-memory, on-chip training, transpose read, interleaving storage