Calculation of Cross-correlation Function Accelerated by Tensor Cores with TensorFloat-32 Precision on Ampere GPU

Computational Science, ICCS 2022, Part II (2022)

Abstract
The cross-correlation function appears in many fields that work with time-series data, and speeding up its computation is essential given the large volumes of data now being accumulated. The cross-correlation function can be calculated as a matrix-matrix product, so a significant speed-up can be expected from Tensor Cores, the matrix-matrix product acceleration units of the latest NVIDIA Graphics Processing Units (GPUs). In this research, we target a new precision data type called TensorFloat-32, which is available in the Ampere architecture. We develop a fast calculation method that takes into account the characteristics of both the cross-correlation function and Tensor Cores. In a performance measurement modeled on seismic interferometry with actual data, our method achieved 53.56 TFLOPS, which is 5.97 times faster than cuBLAS, a widely used linear algebra library on NVIDIA GPUs. In addition, the accuracy of the results is sufficiently high compared to 64-bit floating-point calculation, indicating that Tensor Core operations using TensorFloat-32 are applicable to scientific calculations. Our proposed method is expected to make it possible to utilize large amounts of data more effectively in many fields.
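The idea that a cross-correlation can be written as a matrix-matrix product can be illustrated with a minimal pure-Python sketch. It is not the method of the paper, whose kernel design is not described in the abstract. The names `to_tf32`, `matmul_nt`, `signals`, `reference`, and `num_lags` are hypothetical, and the sample values are made up for illustration. The sketch assumes several signals are correlated against one reference series at lags 0 to `num_lags - 1`. `to_tf32` emulates TensorFloat-32 inputs by truncating a float32 value to a 10-bit mantissa, which is a simplification of whatever rounding the hardware applies. Products are accumulated in Python's 64-bit floats here, whereas Tensor Cores accumulate in FP32.

```python
import struct


def to_tf32(value):
    """Emulate a TF32 input: keep 10 of float32's 23 mantissa bits (truncation)."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    bits &= 0xFFFFE000  # clear the 13 lowest mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def matmul_nt(a, b, cast=float):
    """Return C = A * B^T, where A is m x n and B is l x n (lists of rows).

    `cast` is applied to every input element before multiplication,
    so passing `to_tf32` emulates reduced-precision inputs.
    """
    return [
        [sum(cast(a_it) * cast(b_jt) for a_it, b_jt in zip(row_a, row_b))
         for row_b in b]
        for row_a in a
    ]


# Hypothetical data: two signals of length n and one longer reference series.
signals = [
    [1.0, 2.0, 3.0, 4.0],
    [0.5, -1.0, 0.25, 2.0],
]
reference = [1.0, 0.0, -1.0, 2.0, 0.5, 3.0]
n = len(signals[0])
num_lags = 3

# Row `lag` of this matrix is the reference series shifted by `lag` samples.
lagged = [reference[lag:lag + n] for lag in range(num_lags)]

# corr[i][lag] = sum over t of signals[i][t] * reference[t + lag]
corr = matmul_nt(signals, lagged)

# Same product with inputs truncated to TF32 precision.
corr_tf32 = matmul_nt(signals, lagged, cast=to_tf32)
```

Each row of `corr` holds the cross-correlation of one signal with the reference at every lag, so all pairs and lags come out of a single matrix product, which is the shape of work a matrix-multiply unit accelerates. With these small sample values the TF32-emulated result matches the full-precision one; real data would generally show small differences.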
Keywords
Cross-correlation function, GPU computing, Tensor Core