Collaborative Coalescing of Redundant Memory Access for GPU System

2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), 2024

Abstract
GPU-based computing is the primary driver of performance in HPC systems. However, modern GPU systems encounter performance bottlenecks caused by heavy memory access traffic and insufficient network-on-chip (NoC) bandwidth. In this work, we propose a collaborative coalescing mechanism that eliminates redundant memory accesses and boosts GPU system performance. To achieve this, we design a coalescing unit for each memory partition that merges requests from both inter-cluster and intra-cluster streaming multiprocessors (SMs). Additionally, we introduce a hierarchical multicast module that replicates and distributes the coalesced reply messages to multiple destination SMs. Experimental results show that our method achieves a 20.6% performance improvement and a 27.1% reduction in NoC traffic over the baseline.
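The coalescing idea in the abstract can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the paper's hardware design: the class name `CoalescingUnit`, the fixed `window` of cycles, and the `request`/`drain` methods are all hypothetical, and the hierarchical multicast module is reduced to returning a list of destination SM ids per reply.

```python
from collections import OrderedDict


class CoalescingUnit:
    """Toy model of a per-memory-partition coalescing unit (hypothetical)."""

    def __init__(self, window=4):
        self.window = window          # assumed merge window, in cycles
        self.pending = OrderedDict()  # address -> (first_cycle, [sm_ids])
        self.memory_reads = 0         # reads actually sent to memory

    def request(self, cycle, sm_id, address):
        """Record a read; merge it if the same address is already pending."""
        if address in self.pending:
            self.pending[address][1].append(sm_id)
        else:
            self.pending[address] = (cycle, [sm_id])

    def drain(self, cycle):
        """Issue one read per expired address and list its destination SMs."""
        replies = []
        for address in list(self.pending):
            first_cycle, sm_ids = self.pending[address]
            if cycle - first_cycle >= self.window:
                del self.pending[address]
                self.memory_reads += 1
                # One reply, to be replicated to every requesting SM.
                replies.append((address, sorted(set(sm_ids))))
        return replies


unit = CoalescingUnit(window=4)
unit.request(0, sm_id=0, address=0x100)
unit.request(1, sm_id=5, address=0x100)  # redundant access, merged
unit.request(2, sm_id=9, address=0x200)
replies = unit.drain(cycle=4)
```

In this sketch, the two requests for address `0x100` result in a single memory read whose reply is addressed to both SMs, while the request for `0x200` stays pending until its own window expires.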
Keywords
GPU Systems, Redundant Memory, High-performance Computing Systems, Time Window, Energy Conservation, Data Sharing, Memory Control, Output Ports, Input Port, Caching, Hardware Cost, Packet Delivery, Ring Topology, Request Message, Unit Of Replication, Address Space, L2 Cache, Packet Header