Survey on Large Scale Neural Network Training

Julia Gusak, Daria Cherniuk, Alena Shilova, Alexander Katrutsa, Daniel Bershatsky, Xunyi Zhao, Lionel Eyraud-Dubois, Oleg Shlyazhko, Denis Dimitrov, Ivan Oseledets, Olivier Beaumont

arXiv (2022)

Abstract

Modern Deep Neural Networks (DNNs) require significant memory to store weights, activations, and other intermediate tensors during training. Hence, many models do not fit on a single GPU device or can be trained only with a small per-GPU batch size. This survey provides a systematic overview of the approaches that enable more efficient DNN training. We analyze techniques that save memory and make good use of computation and communication resources on architectures with a single GPU or several GPUs. We summarize the main categories of strategies and compare strategies within and across categories. Along with approaches proposed in the literature, we discuss available implementations.

Keywords

neural, training, survey, network
