
GSI: A GPU Stall Inspector to Characterize the Sources of Memory Stalls for Tightly Coupled GPUs

2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2016)

Abstract
In recent years, the power wall has prevented the continued scaling of single-core performance. This has led to the rise of dark silicon and motivated a move toward parallelism and specialization. As a result, energy-efficient, high-throughput GPU cores are increasingly favored for accelerating data-parallel applications. However, the best way to efficiently communicate and synchronize across heterogeneous cores remains an important open research question. Many methods have been proposed to improve the efficiency of heterogeneous memory systems, but current methods for evaluating the performance effects of these innovations are limited in their ability to attribute differences in execution time to sources of latency in the memory system. Performance characterization of tightly coupled CPU-GPU systems is complicated by the high levels of parallelism present in GPU codes. Existing simulation tools provide only coarse-grained metrics, which can obscure the underlying memory system interactions that cause performance differences. In this work, we introduce GPU Stall Inspector (GSI), a method for identifying and visualizing the causes of GPU stalls with a focus on a tightly coupled CPU-GPU memory subsystem. We demonstrate the utility of our approach by evaluating the sources of stalls in several recent architectural innovations for tightly coupled, heterogeneous CPU-GPU systems.
Key words
GSI, GPU stall inspector, memory stalls, power wall, single core performance, energy-efficient high-throughput GPU cores, data-parallel applications, heterogeneous cores, heterogeneous memory systems, performance characterization, parallelism, GPU codes, coarse-grained metrics, CPU-GPU memory subsystem, heterogeneous CPU-GPU systems, graphics processing unit, central processing unit