DrGPU: A Top-Down Profiler for GPU Applications

Yueming Hao, Nikhil Jain, Rob Van der Wijngaart, Nirmal Saxena, Yuanbo Fan, Xu Liu

ICPE '23: Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering (2023)

Abstract
GPUs have become common in HPC systems to accelerate scientific computing and machine learning applications. Efficiently mapping these applications onto rapidly evolving GPU architectures for high performance is a well-known challenge. Various performance inefficiencies exist in GPU kernels that prevent applications from obtaining bare-metal performance. While existing tools can measure these inefficiencies, they mostly focus on data collection and presentation, requiring significant manual effort to understand the root causes and derive actionable optimizations. Thus, we develop DrGPU, a novel profiler that performs top-down analysis to guide GPU code optimization. As its salient feature, DrGPU leverages hardware performance counters available in commodity GPUs to quantify stall cycles, decompose them into various stall reasons, pinpoint root causes, and provide intuitive optimization guidance. With the help of DrGPU, we are able to analyze important GPU benchmarks and applications and obtain nontrivial speedups: up to 1.77X on V100 and 2.03X on GTX 1650.
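To make the decomposition step in the abstract concrete, the following is a minimal sketch of how measured stall cycles might be broken down into per-reason shares and ranked. It is not DrGPU's implementation: the function name `decompose_stalls`, the stall-reason labels, and the cycle counts are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def decompose_stalls(stall_cycles: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (reason, share of total stall cycles) pairs, largest share first.

    `stall_cycles` maps a stall-reason label to the cycles attributed to it.
    The labels are illustrative, not the counters DrGPU actually reads.
    """
    total = sum(stall_cycles.values())
    if total <= 0:
        # Nothing measured, so there is nothing to decompose.
        return []
    shares = [(reason, cycles / total) for reason, cycles in stall_cycles.items()]
    # Rank so that the dominant stall reason is examined first (top-down).
    return sorted(shares, key=lambda item: item[1], reverse=True)


# Hypothetical measurements for one kernel (made-up numbers).
example = {
    "memory_dependency": 600.0,
    "execution_dependency": 250.0,
    "synchronization": 100.0,
    "instruction_fetch": 50.0,
}

ranked = decompose_stalls(example)
top_reason, top_share = ranked[0]
for reason, share in ranked:
    print(f"{reason}: {share:.1%}")
```

Ranking by share reflects the top-down idea: the largest contributor is investigated first, and only then is it refined into more specific root causes.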
Keywords
DrGPU, profiler, top-down