Advancing OpenMP Offload Debugging Capabilities in LLVM.

50TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOP PROCEEDINGS - ICPP WORKSHOPS '21 (2021)

Abstract
Debugging an application is famously twice as hard as writing it in the first place. While this sentiment predates modern GPU programming by decades, it is all the more true when the application has to manage computation and memory across different architectures, memory spaces, and execution modes. Any subtle error, whether in the application, the compiler, or the runtime system, can lead to unexpected behavior that is hard to understand from the program output alone. While some tooling solutions for GPU debugging exist, their maturity and usefulness vary greatly between vendors. Furthermore, as OpenMP offloading puts an abstraction layer between the programmer and the underlying hardware, the information from a native GPU driver or debugging tool is not always transferable to the OpenMP programming model. As the OpenMP Tooling [12] (OMPT) and Debug [4] (OMPD) interfaces are still not ready to debug OpenMP offloading code in production, developers have a hard time comprehending the implementation state, the error sources, and the interplay of the OpenMP world with foreign device runtimes, e.g., CUDA. In this work, we present our efforts towards better debug information for OpenMP offloading through the LLVM compiler. Our developer-centric extensions allow us to inspect the state of the OpenMP offloading runtimes and their interaction with the foreign runtimes. In contrast to the partial solutions employed before, we provide the ability to select the reported information through a fine-grained and flexible system without introducing substantial overhead. Furthermore, we present a software-managed stack trace feature for OpenMP offload kernels that works even without the notoriously fragile debug information in the GPU code.
Keywords
LLVM, OpenMP, GPU, accelerator offloading, debugging
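
The abstract mentions selecting the reported information through a fine-grained system with low overhead. One common way to build such a selector is a bitmask of report categories; the sketch below illustrates that general idea only and is not taken from the paper or from LLVM. The names `InfoKind`, `parseInfoMask`, and `InfoReporter`, as well as the category names and bit values, are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical report categories; names and bit values are illustrative only.
enum InfoKind : std::uint32_t {
  INFO_NONE = 0x0,
  INFO_KERNEL_ARGS = 0x1,   // arguments passed to an offloaded kernel
  INFO_DATA_TRANSFER = 0x2, // host <-> device copies
  INFO_MAPPING = 0x4        // changes to the device mapping table
};

// Parse a mask from text such as "0x3" or "5".
// A null pointer or unparsable text selects no category.
inline std::uint32_t parseInfoMask(const char *text) {
  if (!text)
    return INFO_NONE;
  return static_cast<std::uint32_t>(std::strtoul(text, nullptr, 0));
}

class InfoReporter {
public:
  explicit InfoReporter(std::uint32_t mask) : mask_(mask) {}

  // A category is enabled when its bit is set in the mask.
  bool enabled(InfoKind kind) const { return (mask_ & kind) != 0; }

  // Record the message only if its category was selected. The check is a
  // single bitwise AND, so disabled categories cost very little.
  void report(InfoKind kind, const std::string &message) {
    assert(kind != INFO_NONE && "report() needs a concrete category");
    if (!enabled(kind))
      return;
    log_ << message << '\n';
  }

  // Everything reported so far, one message per line.
  std::string log() const { return log_.str(); }

private:
  std::uint32_t mask_;
  std::ostringstream log_;
};
```

In a real runtime the mask would typically come from the environment (for example via `std::getenv` on a variable of the implementer's choosing) and be read once at startup, so that each reporting site only pays for the bit test.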