SP-PIM: A 22.41TFLOPS/W, 8.81Epochs/Sec Super-Pipelined Processing-In-Memory Accelerator with Local Error Prediction for On-Device Learning.

VLSI Technology and Circuits (2023)

Abstract
This paper presents SP-PIM, which demonstrates real-time on-device learning through a holistic, multi-level pipelining scheme enabled by local error prediction. A local error prediction unit makes the training algorithm pipelineable, while power-of-two arithmetic operations and random weights reduce computation overhead and overall external memory access. The double-buffered PIM macro performs both forward propagation and gradient calculation, and dual-sparsity-aware circuits exploit sparsity in both activations and errors. The 5.76 mm² SP-PIM chip, fabricated in a 28 nm process, achieves on-chip model training at 8.81 epochs/s with state-of-the-art area efficiency of 560.6 GFLOPS/mm² and power efficiency of 22.4 TFLOPS/W.
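The abstract does not spell out how power-of-two arithmetic and random weights lower the cost of error prediction. The following is a minimal sketch of the general idea only, not the chip's actual datapath: it assumes each fixed random weight is rounded to the form sign × 2^k, so that every multiplication becomes a bit shift. The function names (`to_power_of_two`, `shift_multiply`, `predict_local_error`) and the integer activation format are hypothetical and chosen for illustration.

```python
import math
import random


def to_power_of_two(w):
    """Round a weight to sign * 2**k and return (sign, k). Zero maps to (0, 0)."""
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    k = round(math.log2(abs(w)))  # nearest exponent in the log domain
    return sign, k


def shift_multiply(x, sign, k):
    """Multiply a non-negative integer x by sign * 2**k using only shifts."""
    if sign == 0:
        return 0
    shifted = x << k if k >= 0 else x >> -k
    return sign * shifted


def predict_local_error(activations, pot_weights):
    """Hypothetical predictor: shift-and-add dot product with fixed weights."""
    return sum(
        shift_multiply(a, sign, k)
        for a, (sign, k) in zip(activations, pot_weights)
    )


# Assumption: the random weights are drawn once and then kept fixed, so they
# never need to be updated or reloaded from external memory.
rng = random.Random(0)
fixed_weights = [to_power_of_two(rng.uniform(-1.0, 1.0)) for _ in range(4)]
predicted = predict_local_error([8, 4, 16, 2], fixed_weights)
```

Because the weights in this sketch are fixed and need only a sign and a small exponent each, a predictor of this kind requires no multipliers and no weight-update traffic, which is one plausible reading of the reduced computation and memory access that the abstract reports.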
Keywords
Training, On-device learning, HW accelerator, PIM