Accelerating Large-Scale CFD Simulations with Lattice Boltzmann Method on a 40-Million-Core Sunway Supercomputer

Proceedings of the 52nd International Conference on Parallel Processing (ICPP 2023)

Abstract
The Lattice Boltzmann Method (LBM) has gained widespread popularity due to its applicability in fluid dynamics, chemical engineering, material science, and other domains. In this work, we present an optimized implementation of the LBM, with a specific focus on achieving superior performance and scalability on advanced heterogeneous systems such as the new Sunway supercomputer. To accomplish this, we employ several techniques, including kernel fusion to enhance temporal and spatial locality, a customized multi-level domain decomposition and data sharing scheme, and pipelining strategies that are tailored to the SW26010-Pro processor. As a result of these optimizations, we have successfully scaled our code to a total of 39,000,000 CPU cores. Our largest simulation, which encompassed over 42 trillion lattice cells, achieved an impressive 67,018 billion lattice cell updates per second (GLUPS), with 82.9% memory bandwidth utilization, and a sustained performance of 28 PFlops. In order to assess the portability of our implementation, we also adapted our code to run on a GPU cluster, utilizing a range of tailored optimization techniques. Our results demonstrated a 191x speedup, along with 83.8% memory bandwidth utilization. Our proposed approach marks a significant milestone in the field of LBM implementations, as it demonstrates unprecedented scalability by effectively utilizing over 39,000,000 cores while maintaining exceptional parallel efficiency and computational performance. This achievement establishes our method as a compelling solution for addressing large-scale computational fluid dynamics challenges on heterogeneous systems.
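The kernel fusion mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a one-dimensional D1Q3 lattice with periodic boundaries and a BGK collision operator, and every name in it (`fused_step`, `equilibrium`, `init_density_bump`, `total_mass`) is hypothetical. It only shows the general idea of merging the streaming and collision steps into a single pass, so that each population is read once and written once per time step.

```python
# Toy D1Q3 lattice Boltzmann sketch (illustrative assumption, not the paper's code).
# Populations: rest, right-moving, left-moving.
C = (0, 1, -1)                          # lattice velocities
W = (2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0)   # lattice weights (speed of sound squared = 1/3)


def equilibrium(rho, u):
    """Second-order equilibrium populations for density rho and velocity u."""
    usq = u * u
    return [
        w * rho * (1.0 + 3.0 * c * u + 4.5 * (c * u) ** 2 - 1.5 * usq)
        for c, w in zip(C, W)
    ]


def fused_step(f_old, tau):
    """One time step with streaming and BGK collision fused into a single sweep.

    f_old[i][x] is population i at cell x. A separate-kernel version would
    sweep memory twice (stream, then collide); here each cell pulls its
    incoming populations, collides them, and writes the result once.
    """
    n = len(f_old[0])
    f_new = [[0.0] * n for _ in C]
    for x in range(n):
        # Streaming (pull form): population i arrives from cell x - C[i].
        local = [f_old[i][(x - C[i]) % n] for i in range(3)]
        # Macroscopic moments of the gathered populations.
        rho = sum(local)
        u = sum(c * fi for c, fi in zip(C, local)) / rho
        # BGK collision: relax toward equilibrium with relaxation time tau.
        feq = equilibrium(rho, u)
        for i in range(3):
            f_new[i][x] = local[i] - (local[i] - feq[i]) / tau
    return f_new


def init_density_bump(n):
    """Fluid at rest with a small density bump in the second quarter of the domain."""
    f = [[0.0] * n for _ in C]
    for x in range(n):
        rho = 1.1 if n // 4 <= x < n // 2 else 1.0
        feq = equilibrium(rho, 0.0)
        for i in range(3):
            f[i][x] = feq[i]
    return f


def total_mass(f):
    """Sum of all populations over all cells."""
    return sum(sum(row) for row in f)


if __name__ == "__main__":
    f = init_density_bump(32)
    print("mass before:", total_mass(f))
    for _ in range(50):
        f = fused_step(f, tau=0.8)
    print("mass after: ", total_mass(f))
```

Because the fused sweep writes into a second array, it trades extra memory for a simple, race-free update; production codes on many-core processors typically combine this idea with blocking and data layouts chosen for the target memory hierarchy, which this sketch does not attempt.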
Keywords
Lattice Boltzmann Method, Sunway Supercomputer, heterogeneous systems, parallel scalability