An Optimized Error-controlled MPI Collective Framework Integrated with Lossy Compression
2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (2023)
Abstract
With the ever-increasing computing power of supercomputers and the growing
scale of scientific applications, the efficiency of MPI collective
communications turns out to be a critical bottleneck in large-scale distributed
and parallel processing. The large message size in MPI collectives is
particularly concerning because it can significantly degrade the overall
parallel performance. To address this issue, prior research simply applies
off-the-shelf fixed-rate lossy compressors to MPI collectives, leading to
suboptimal performance, limited generalizability, and unbounded errors. In this
paper, we propose a novel solution, called C-Coll, which leverages
error-bounded lossy compression to significantly reduce the message size,
resulting in a substantial reduction in communication cost. The key
contributions are three-fold. (1) We develop two general, optimized
lossy-compression-based frameworks for both types of MPI collectives
(collective data movement as well as collective computation), based on their
particular characteristics. These frameworks not only reduce communication cost
but also preserve data accuracy. (2) We customize SZx, an ultra-fast
error-bounded lossy compressor, to meet the specific needs of collective
communication. (3) We integrate C-Coll into multiple collectives, such as
MPI_Allreduce, MPI_Scatter, and MPI_Bcast, and perform a comprehensive
evaluation based on real-world scientific datasets. Experiments show that our
solution outperforms the original MPI collectives as well as multiple baselines
and related efforts by 1.8-2.7X.
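
To make the term "error-bounded" concrete, the following is a minimal sketch of a pointwise error-bounded compressor applied to a message before it is sent and reversed after it is received. It is not SZx and not the C-Coll implementation: the uniform quantizer, the use of zlib as the lossless back end, and the function names compress_message and decompress_message are all assumptions made for illustration only.

```python
import math
import struct
import zlib


def compress_message(values, error_bound):
    """Toy error-bounded compressor (illustrative only, not SZx).

    Each value is snapped to the nearest multiple of 2 * error_bound,
    so the reconstruction error per element is at most error_bound
    (up to floating-point rounding). The integer codes are then
    deflated with zlib.
    """
    step = 2.0 * error_bound
    codes = [round(v / step) for v in values]
    # Assumes every code fits in a signed 32-bit integer.
    raw = struct.pack(f"<{len(codes)}i", *codes)
    return zlib.compress(raw)


def decompress_message(payload, error_bound):
    """Invert compress_message: inflate, then scale codes back to floats."""
    step = 2.0 * error_bound
    raw = zlib.decompress(payload)
    codes = struct.unpack(f"<{len(raw) // 4}i", raw)
    return [c * step for c in codes]


def demo():
    """Compress a smooth signal and report the size and the worst-case error."""
    error_bound = 1e-3
    message = [math.sin(0.01 * i) for i in range(1000)]
    payload = compress_message(message, error_bound)      # sender side
    received = decompress_message(payload, error_bound)   # receiver side
    max_error = max(abs(a - b) for a, b in zip(message, received))
    original_bytes = 8 * len(message)  # size as 64-bit floats
    return original_bytes, len(payload), max_error


if __name__ == "__main__":
    print(demo())
```

In this sketch each element is guaranteed to stay within the chosen bound, which is what distinguishes error-bounded compression from the fixed-rate approach criticized above. Note that when independently compressed contributions are summed, as in a reduction, the worst-case error of this toy scheme grows with the number of contributions; how C-Coll controls that is described in the paper, not here.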
Key words
Lossy Compression, MPI Collective, Distributed Systems