Towards Cross-Platform Performance Portability of DNN Models using SYCL

2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), 2020

Abstract
The upcoming deployment of Exascale platforms with a myriad of different architectures and co-processors has prompted the need for a software ecosystem based on open standards that simplifies maintaining HPC applications across different hardware. Applications written for a particular platform should be portable to a different one while keeping performance as close to peak as possible. However, key performance routines in relevant HPC applications are not expected to be performance portable as is, especially common building blocks such as BLAS or DNN. The oneAPI initiative aims to tackle this problem by combining a programming model, SYCL, with a set of interfaces for common building blocks that can be optimized for different hardware vendors. In particular, oneAPI includes the oneDNN performance library, which contains building blocks for deep learning applications and frameworks. Because it uses the SYCL programming model, oneDNN integrates easily with existing SYCL and C++ applications, sharing data and executing collaboratively on devices with the rest of the application. In this paper, we introduce a cuDNN backend for oneDNN, which allows oneAPI applications to run on NVIDIA hardware by taking advantage of existing building blocks from the CUDA ecosystem. We implement relevant neural networks (ResNet-50 and VGG-16) both in native CUDA and in oneAPI with a CUDA backend, and demonstrate that performance portability can be achieved by leveraging existing building blocks for the target hardware.
Keywords
Cross-performance Portability,Deep Neural Network,cuDNN,oneDNN,SYCL,High Performance Computing