Effective Resource-Driven Loop Splitting for Large Unstructured Mesh Applications on GPUs

semanticscholar(2015)

Abstract
Unstructured mesh applications are widely used in science and industry for simulating phenomena as diverse as turbomachinery components of jet engines and blood flow in arteries. These are examples of irregular applications that are difficult to optimize for accelerator targets such as GPUs. Loop splitting is a standard technique for optimizing GPU applications. It breaks large, complex parallel loops down into smaller units that perform better because they require less shared memory and fewer registers. In this paper we introduce a general loop splitting methodology for unstructured meshes, which is able to split a complex loop into multiple simpler loops. A given loop can be split in different ways, depending on the loop features and the target GPU hardware. Unlike previous contributions, the introduced technique permits synthesizing alternative implementation strategies without the need to transform the input program. Experiments on a series of complex loops from an industrial CFD code show the efficacy of our solution for both NVIDIA Fermi GPUs and Intel multicore CPUs. The results show that the version obtained after loop splitting always performs better on a GPU than the original version. The opposite result is obtained on the CPU, where the original unsplit version performs better when using large numbers of threads.
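
To make the idea of loop splitting concrete, the following is a minimal sketch in Python and not the methodology of the paper. It assumes a hypothetical toy mesh (the names `edges`, `x`, `w`, `flux`, and `diss` are invented for illustration) and shows one loop over mesh edges that updates two per-node arrays, next to an equivalent version split into two simpler loops. On a GPU each loop would typically map to a kernel, so each split loop carries fewer live values per thread; here the point is only that both versions compute the same result.

```python
# Hypothetical toy mesh: 4 nodes joined in a ring by 4 edges.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]  # each edge joins two node indices
x = [1.0, 2.0, 4.0, 8.0]                  # per-node input data (assumed)
w = [0.5, 0.25, 0.125, 1.0]               # per-edge weights (assumed)


def fused_loop(edges, x, w):
    """One complex loop over edges that updates two per-node arrays."""
    flux = [0.0] * len(x)
    diss = [0.0] * len(x)
    for e, (a, b) in enumerate(edges):
        d = x[b] - x[a]
        f = w[e] * d
        flux[a] += f          # indirect increment through the edge-to-node map
        flux[b] -= f
        s = w[e] * d * d
        diss[a] += s
        diss[b] += s
    return flux, diss


def split_loops(edges, x, w):
    """The same computation as two simpler loops, one per output array."""
    flux = [0.0] * len(x)
    for e, (a, b) in enumerate(edges):
        f = w[e] * (x[b] - x[a])
        flux[a] += f
        flux[b] -= f

    diss = [0.0] * len(x)
    for e, (a, b) in enumerate(edges):
        d = x[b] - x[a]       # recomputed: the cost of splitting
        s = w[e] * d * d
        diss[a] += s
        diss[b] += s
    return flux, diss


if __name__ == "__main__":
    print(fused_loop(edges, x, w) == split_loops(edges, x, w))
```

In this sketch the split version reads the edge data twice and recomputes the difference `d`, which illustrates the kind of trade-off a splitting strategy has to weigh against lower per-loop resource use.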