Extending OpenMP and OpenSHMEM for Efficient Heterogeneous Computing

2022 IEEE/ACM Parallel Applications Workshop: Alternatives To MPI+X (PAW-ATM), 2022

Abstract
Heterogeneous supercomputing systems are becoming mainstream thanks to their powerful accelerators. However, the accelerators' special memory models and APIs increase development complexity and call for innovative programming-model designs. To address this, OpenMP has added target offloading for portable accelerator programming, and MPI allows transparent send-receive of accelerator memory buffers. Meanwhile, Partitioned Global Address Space (PGAS) languages such as OpenSHMEM are falling behind in heterogeneous computing because their special memory models pose additional challenges. We propose language and runtime interoperability extensions for both OpenMP and OpenSHMEM that enable portable remote access to GPU buffers with minimal code changes. Our modified runtime systems work in coordination to manage accelerator memory, eliminating the need for staging communication buffers. Compared to the standard implementation, our extensions achieve 6x lower point-to-point latency, 1.3x better collective-operation latency, 4.9x higher random-access throughput, and up to 12.5% better performance in strong-scaling configurations.
Keywords
Heterogeneous Computing, LLVM, OpenMP, UCX, OpenSHMEM, Hybrid Programming