Evaluating the Performance of the hipSYCL Toolchain for HPC Kernels on NVIDIA V100 GPUs

IWOCL '20: International Workshop on OpenCL, Munich, Germany, April 2020

Abstract
Future HPC leadership computing systems for the United States Department of Energy will use GPUs to accelerate scientific codes. These systems will use GPUs from various vendors, which places a strong focus on the performance portability of the programming models used by scientific application developers. SYCL, an open C++ standard for heterogeneous computing, is gaining support in the HPC domain, fueling a growing interest in understanding the performance of SYCL toolchains for the various GPU vendors. In this paper, we compare the performance of benchmarks and mini-apps that have both SYCL and native CUDA implementations on an NVIDIA Volta GPU. We use the RAJA Performance Suite to evaluate the performance of the hipSYCL toolchain, followed by a more detailed investigation of the performance of two HPC mini-apps. We find that SYCL kernels compiled directly to CUDA perform at a level competitive with their CUDA counterparts when comparing the straightforward implementations.
Key words
HPC kernels, NVIDIA V100 GPUs, hipSYCL toolchain, performance