H3DAtten: Heterogeneous 3-D Integrated Hybrid Analog and Digital Compute-in-Memory Accelerator for Vision Transformer Self-Attention

IEEE Transactions on Very Large Scale Integration (VLSI) Systems (2023)

Abstract
After the success of transformer networks in natural language processing (NLP), the application of transformers to computer vision (CV) has followed suit to deliver unprecedented performance gains on vision tasks, including image recognition and object detection. Multihead self-attention (MHSA) is the key component in transformers, allowing the models to learn the amount of attention paid to each input position. Despite its strong modeling capability, MHSA involves complex operations that make transformers prohibitively costly for hardware deployment. Existing acceleration efforts with conventional hardware platforms are challenged by the memory wall. To alleviate the memory wall problem, compute-in-memory (CIM) is a promising solution that stores all model parameters on-chip in compute-capable memory arrays. The footprint of 2-D CIM designs must, however, expand to accommodate increasingly large model sizes. In this work, we present a heterogeneous 3-D integrated (H3D) accelerator to target the MHSA workloads in vision transformers. H3D allows the proposed H3DAtten architecture to combine the merits of resistive random access memory (RRAM)-based analog CIM (ACIM) in 40 nm and static random access memory (SRAM)-based digital CIM (DCIM) in 16 nm. We perform comprehensive signaling and thermal analyses to examine the effects of 3-D stacking on the accelerator. Compared to iso-capacity 2-D baseline designs, the proposed 5-tier H3DAtten accelerator achieves 8.4x higher compute density without experiencing accuracy loss on the ImageNet-1k dataset.
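
For readers unfamiliar with the workload the accelerator targets, the following is a minimal NumPy sketch of the MHSA computation described above (scaled dot-product attention over multiple heads). It illustrates the arithmetic only and is not the paper's hardware mapping; the dimensions (197 tokens, d_model = 768, 12 heads) are ViT-Base-like assumptions, not values taken from the paper.

import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mhsa(x, w_q, w_k, w_v, w_o, num_heads):
    """x: (seq_len, d_model); w_*: (d_model, d_model) projection weights."""
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    # Linear projections, then split the feature dimension into heads.
    def split(t):
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)

    # Scaled dot-product attention per head: softmax(Q K^T / sqrt(d_head)) V.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)   # (heads, seq, seq)
    attn = softmax(scores, axis=-1)
    out = attn @ v                                          # (heads, seq, d_head)

    # Merge heads and apply the output projection.
    out = out.transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ w_o

# Hypothetical example with ViT-Base-like dimensions.
rng = np.random.default_rng(0)
seq_len, d_model, heads = 197, 768, 12
x = rng.standard_normal((seq_len, d_model)).astype(np.float32)
w = [0.02 * rng.standard_normal((d_model, d_model)).astype(np.float32) for _ in range(4)]
y = mhsa(x, *w, num_heads=heads)
print(y.shape)  # (197, 768)

The dominant cost is the matrix multiplications (the QKV projections, the Q K^T score computation, and the attention-weighted sum), which is why mapping them onto analog and digital CIM arrays, as proposed here, addresses the memory-wall bottleneck.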
Keywords
Artificial intelligence (AI) accelerator, compute-in-memory (CIM), deep learning (DL), heterogeneous 3-D integration (H3D), resistive random access memory (RRAM), vision transformer