Preprocessing Pipeline Optimization for Scientific Deep Learning Workloads

2022 IEEE 36th International Parallel and Distributed Processing Symposium (IPDPS 2022)

Abstract
Newly developed machine learning technology promises to profoundly impact high-performance computing, with the potential to significantly accelerate scientific discoveries. However, the performance of scientific machine learning is often constrained by data movement overheads, particularly on existing and emerging hardware-accelerated systems. In this work, we focus on optimizing data movement across storage and memory systems by developing domain-specific data encoders/decoders. These plugins have the dual benefit of significantly reducing communication while enabling efficient decoding on the accelerator hardware. We present a detailed performance analysis of two important scientific learning workloads from cosmology and climate analytics, CosmoFlow and DeepCAM, on the GPU-enabled Summit and Cori supercomputers. Results demonstrate that our optimizations can improve overall performance by up to 10x compared with the default baseline, while preserving convergence behavior. Overall, this methodology can be applied to various machine learning domains and emerging AI technologies.
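The core idea of a domain-specific encoder/decoder pair can be illustrated with a minimal sketch. The code below is an assumption for illustration, not the paper's implementation: it encodes float32 samples by quantizing to half precision and compressing with zlib (shrinking the bytes moved from storage toward the accelerator), and decodes with a lightweight routine of the kind that could run close to the device. The function names `encode_sample`/`decode_sample` are hypothetical.

```python
import random
import struct
import zlib

def encode_sample(sample):
    # Hypothetical domain-specific encoder (illustration only): quantize
    # float values to half precision (2 bytes each vs. 4 for float32),
    # then compress, so fewer bytes cross the storage/memory boundary.
    raw = struct.pack(f"<{len(sample)}e", *sample)
    return zlib.compress(raw)

def decode_sample(blob, n):
    # Lightweight decoder: decompress and unpack n half-precision values.
    # In a real pipeline this step would run on or near the accelerator.
    raw = zlib.decompress(blob)
    return list(struct.unpack(f"<{n}e", raw))

if __name__ == "__main__":
    random.seed(0)
    sample = [random.random() for _ in range(1000)]
    blob = encode_sample(sample)
    decoded = decode_sample(blob, len(sample))
    # Encoded payload is smaller than the 4-byte-per-value float32 baseline,
    # and the round trip stays within half-precision accuracy.
    print(len(blob) < 4 * len(sample))
    print(max(abs(a - b) for a, b in zip(sample, decoded)) < 1e-2)
```

The same plugin pattern generalizes: the encoder runs once at data-preparation time, while the cheap decoder executes inside the training input pipeline, trading a small amount of decode compute for a large reduction in I/O and communication volume.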
Keywords
Deep Neural Network, PyTorch, TensorFlow, Preprocessing, Compression, CosmoFlow, DeepCAM