An Empirical Study of HPC Workloads on Huawei Kunpeng 916 Processor

2019 IEEE 25th International Conference on Parallel and Distributed Systems (ICPADS) (2019)

Abstract
ARM-based server processors have been gaining momentum in high performance computing (HPC). While not designed specifically for HPC, the Huawei Kunpeng 916 processor has 32 ARMv8 cores and is tempting for HPC workloads. However, its potential remains unknown. To thoroughly understand this potential, we conducted a systematic evaluation in three steps using: 1) three well-known benchmarks (HPL, STREAM, and LMbench); 2) three typical scientific kernels (SpMV, N-body, and GEMM); 3) three widely used mini-apps (TeaLeaf, Neutral, and SNAP) and a real-world application, GTC-P. We compared the performance results of Kunpeng 916 with those of Intel Xeon E5-2680v3/4 (Haswell/Broadwell). The evaluation results show that Kunpeng 916 has higher memory bandwidth than the two Intel processors and can thus achieve compelling performance when running memory-bound HPC applications.
Key words
ARM, HPC, Huawei, Kunpeng, Benchmark, Performance Optimization, GTC-P