
pdlADMM: An ADMM-based framework for parallel deep learning training with efficiency

Neurocomputing (2021)

Abstract
The Alternating Direction Method of Multipliers (ADMM) has proven to be a useful alternative to popular gradient-based optimizers and has been successfully applied to train DNN models. However, existing ADMM-based approaches generally do not achieve a good trade-off between rapid convergence and fast training, nor do they support parallel DNN training with multiple GPUs. These drawbacks seriously hinder them from effectively training DNN models on modern GPU computing platforms, which are typically equipped with multiple GPUs. In this paper, we propose pdlADMM, which can effectively train DNNs in a data-parallel manner. The key insight of pdlADMM is that it derives efficient solutions for each sub-problem by comprehensively considering three main factors: computational complexity, convergence, and suitability for parallel computing. As the number of GPUs grows, pdlADMM retains rapid convergence while the computational complexity on each GPU tends to decline. Extensive experiments demonstrate the effectiveness of our proposal. Compared to two other state-of-the-art ADMM-based approaches, pdlADMM converges significantly faster, obtains better accuracy, and achieves very competitive training speed at the same time. (c) 2020 Published by Elsevier B.V.
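For context, data-parallel training is often cast as a global-consensus ADMM problem; the abstract does not spell out pdlADMM's exact sub-problem splitting, so the following is only an illustrative sketch of the standard consensus formulation, with $K$ GPUs each holding a data shard $\mathcal{D}_k$, per-GPU weights $W_k$, a shared consensus model $Z$, scaled duals $U_k$, and penalty $\rho$:

$$\min_{\{W_k\},\,Z}\ \sum_{k=1}^{K} f_k(W_k;\mathcal{D}_k) \quad \text{s.t.}\quad W_k = Z,\ \ k=1,\dots,K,$$

which yields the per-iteration updates

$$W_k^{t+1} = \arg\min_{W_k}\ f_k(W_k;\mathcal{D}_k) + \tfrac{\rho}{2}\,\lVert W_k - Z^{t} + U_k^{t}\rVert_2^2 \quad \text{(solved independently on each GPU)},$$
$$Z^{t+1} = \frac{1}{K}\sum_{k=1}^{K}\bigl(W_k^{t+1} + U_k^{t}\bigr), \qquad U_k^{t+1} = U_k^{t} + W_k^{t+1} - Z^{t+1}.$$

In this form only the $Z$-update needs cross-GPU communication (an average over workers), which illustrates why the per-GPU workload can shrink as $K$ grows while coordination cost stays low.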
Key words
ADMM, DNN, Multi-GPU, Data-parallel, Efficiency