Runtime Code Generation and Data Management for Heterogeneous Computing in Java

Proceedings of the Principles and Practices of Programming on The Java Platform(2015)

引用 15|浏览33
暂无评分
摘要
GPUs (Graphics Processing Unit) and other accelerators are nowadays commonly found in desktop machines, mobile devices and even data centres. While these highly parallel processors offer high raw performance, they also dramatically increase program complexity, requiring extra effort from programmers. This results in difficult-to-maintain and non-portable code due to the low-level nature of the languages used to program these devices. This paper presents a high-level parallel programming approach for the popular Java programming language. Our goal is to revitalise the old Java slogan -- Write once, run anywhere --- in the context of modern heterogeneous systems. To enable the use of parallel accelerators from Java we introduce a new API for heterogeneous programming based on array and functional programming. Applications written with our API can then be transparently accelerated on a device such as a GPU using our runtime OpenCL code generator. In order to ensure the highest level of performance, we present data management optimizations. Usually, data has to be translated (marshalled) between the Java representation and the representation accelerators use. This paper shows how marshal affects runtime and present a novel technique in Java to avoid this cost by implementing our own customised array data structure. Our design hides low level data management from the user making our approach applicable even for inexperienced Java programmers. We evaluated our technique using a set of applications from different domains, including mathematical finance and machine learning. We achieve speedups of up to 500× over sequential and multi-threaded Java code when using an external GPU.
更多
查看译文
关键词
gpgpu,code generation,algorithmic skeletons
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要