Towards a Truly Integrated Vector Processing Unit for Memory-bound Applications Based on a Cost-competitive Computational SRAM Design Solution

ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS(2022)

引用 0|浏览13
暂无评分
摘要
This article presents Computational SRAM (C-SRAM) solution combining In- and Near-Memory Computing approaches. It allows performing arithmetic, logic, and complex memory operations inside or next to the memory without transferring data over the system bus, leading to significant energy reduction. Operations are performed on large vectors of data occupying the entire physical row of C-SRAM array, leading to high performance gains. We introduce the C-SRAM solution in this article as an integrated vector processing unit to be used by a scalar processor as an energy-efficient and high performing co-processor. We detail the C-SRAM system design on different levels: (i) circuit design and silicon proof of concept, (ii) system interface and instruction set architecture, and (iii) high-level software programming and simulation. Experimental results on two complete memory-bound applications, AES and MobileNetV2, show that the C-SRAM implementation achieves up to 70x timing speedup and 37x energy reduction compared to scalar architecture, and up to 17x timing speedup and 5x energy reduction compared to SIMD architecture.
更多
查看译文
关键词
Computational SRAM,in- and near-memory computing,simulation,AES,MobileNetV2
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要