Towards a more efficient implementation of OpenMP for clusters via translation to global arrays

Parallel Computing (2005)

Abstract
This paper discusses a novel approach to implementing OpenMP on clusters. Traditional approaches rely on Software Distributed Shared Memory systems to handle shared data. We discuss these and then introduce an alternative approach that translates OpenMP to Global Arrays (GA), explaining the basic strategy. GA requires a data distribution. We do not expect the user to supply this; rather, we show how we perform data distribution and work distribution according to the user-supplied OpenMP static loop schedules. An inspector-executor strategy is employed for irregular applications in order to gather information on accesses to potentially non-local data, group non-local data transfers, and overlap communication with local computation. Furthermore, a new directive, INVARIANT, is proposed to provide information about the dynamic scope of data access patterns. This directive helps us generate efficient code for irregular applications under the inspector-executor approach. We also illustrate how to handle some hard cases involving reshaping and strided accesses during the translation. Our experiments show promising results for the corresponding regular and irregular GA codes.
Key words
OpenMP translation, Global Arrays, clusters, software distributed shared memory, data distribution, work distribution, inspector-executor, irregular applications, data access patterns, non-local data transfer, parallel programming languages