DHTS: A Dynamic Hybrid Tiling Strategy for Optimizing Stencil Computation on GPUs

Song Liu, Zengyuan Zhang, Weiguo Wu

IEEE Transactions on Computers (2023)

Abstract
Stencil computation is an important class of computational patterns in scientific computing applications. Loop tiling techniques have been widely studied to accelerate stencil computations on different architectures by exploiting parallelism and data locality. Recent advanced tiling methods enable tile-wise concurrent start-up to improve execution performance. However, such methods statically partition all dimensions of the iteration space into tiles with predetermined complex shapes and sizes, which leads to low thread utilization and poor memory access efficiency on GPUs. In this paper, we present DHTS, a novel dynamic hybrid tiling strategy for stencil computations. DHTS employs static tiling on the outer dimensions to achieve concurrent start-up parallelism, and applies a dynamic rectangular tiling method on the inner dimensions to improve thread utilization and memory access efficiency. By deriving tile size constraints, DHTS adaptively achieves equal-size tile workloads, thereby reducing idle threads and increasing coalesced memory accesses within tiles. We implement the proposed strategy with different complex tile shapes. Experimental results on Titan V and Tesla V100 GPUs show that DHTS effectively improves the execution performance of 2D/3D stencils compared to state-of-the-art tiling methods, with a best-case improvement of 28×.
Keywords
optimizing stencil computation, dynamic hybrid tiling strategy, GPUs