Cache-emulated register file: An integrated on-chip memory architecture for high performance GPGPUs.

MICRO-49: The 49th Annual IEEE/ACM International Symposium on Microarchitecture, Taipei, Taiwan, October 2016

Abstract
On-chip memory design is critical to GPGPU performance because the on-chip memory sits between the massive number of threads and the large external memory, serving as a low-latency, high-throughput data communication point. However, the existing on-chip memory hierarchy is inherited from conventional CPU architecture and is often sub-optimal for SIMT (single instruction, multiple threads) execution. In this study, we move beyond the traditional memory hierarchy design and reform the on-chip memory into an integrated architecture with a cache-emulated register file (RF) capability tailored for high-performance GPGPU computing. With lightweight support from the ISA, the compiler, and a modified microarchitecture, this integrated architecture can dynamically emulate a variable-sized RF and a cache in a uniform way. Evaluation results demonstrate that this architecture delivers better performance and energy efficiency with a smaller on-chip memory size. For example, it achieves an average performance improvement of 50% for cache-sensitive applications.
Keywords
cache-emulated register file, integrated on-chip memory architecture, on-chip memory design, CPU architecture, high performance GPGPU computing, microarchitecture