Generating efficient FFT GPU code with Lift.

FHPNC@ICFP(2019)

引用 7|浏览33
暂无评分
摘要
The Fast Fourier Transform is a well-known algorithm used in many high-performance applications, ranging from signal processing to convolutional neural networks. In this paper, we encode FFTs by building high-level abstractions based on a set of functional parallel patterns in the Lift language. Abstractions are derived from and closely resemble mathematical definitions for FFTs. We leverage the Lift performance-portable code generator to generate high performing GPU code for FFTs. No FFT-specific patterns are required for this, showing the expressive power of the generic parallel patterns used in Lift. Our experimental results show that our approach achieves performance better than AMD's OpenCL implementation clFFT on an Nvidia GPU. Nvidia's highly optimized cuFFT implementation still performs better on their GPUs.
更多
查看译文
关键词
Code Generation, FFT, GPU, Lift
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要