Chrome Extension
WeChat Mini Program
Use on ChatGLM

Performance analysis and optimization of the JOREK code for many-core CPUs

arXiv: Performance(2018)

Cited 23|Views22
No score
Abstract
This report investigates the performance of the JOREK code on the Intel Knights Landing and Skylake processor architectures. The OpenMP scaling of the matrix construction part of the code was analyzed and improved synchronization methods were implemented. A new switch was implemented to control the number of threads used for the linear equation solver independently from other parts of the code. The matrix construction subroutine was vectorized, and the data locality was also improved. These steps led to a factor of two speedup for the matrix construction.
More
Translated text
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined