Judging a Type by its Pointer: Optimizing Virtual Function Calls on GPUs

Semantic Scholar (2020)

Abstract
The programmability of parallel accelerators is a major barrier to their general adoption. Modern, complex software relies heavily on reusable, object-oriented frameworks that use inheritance and virtual functions. Although programming extensions like CUDA [2], OpenCL [17] and OpenACC [1] have expanded the subset of C++ supported on GPUs, efficiently executing object-oriented code still requires significant porting effort for both functionality and performance. To alleviate the functionality problem, we propose the first CPU/GPU allocator that enables objects with virtual functions to be allocated on the CPU and then used on the GPU without programmer intervention. Using both our new allocator and legacy techniques, we perform the first study of virtual function calls and dynamic dispatch on GPUs, identifying a different set of bottlenecks than those observed on CPUs. Decades of work on runtime systems, compilers and architectures for CPUs have improved the execution of object-oriented applications enough to make them commonplace [4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 18, 23, 24]. We seek to do the same for GPUs.