Accelerate Model Parallel Deep Learning Training Using Effective Graph Traversal Order in Device Placement

Distributed Applications and Interoperable Systems (DAIS 2022)

Abstract
Modern neural networks require long training times to reach decent performance on massive datasets. One common approach to speeding up training is model parallelization, where a large neural network is split across multiple devices. However, different device placements of the same neural network lead to different training times. Most existing device placement solutions treat the problem as sequential decision-making: they traverse the neural network graph and assign its neurons to different devices. This work studies the impact of neural network graph traversal orders on device placement. In particular, we empirically study how different graph traversal orders of neural networks lead to different device placements, which in turn affect the training time of the neural network. Our experimental results show that the best graph traversal order depends on the type of neural network and the features of its computation graph. We also provide recommendations on choosing effective graph traversal orders in device placement for various neural network families to improve training time in model parallelization.
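To make the core idea concrete, the sketch below shows how the traversal order of a computation graph can change a sequential, greedy device placement. This is a minimal illustration, not the authors' method: the adjacency-list graph format, the BFS/DFS traversal pair, and the round-robin assignment are all assumptions chosen for clarity.

```python
# Minimal sketch (assumptions, not the paper's implementation): two traversal
# orders of the same computation graph fed into the same greedy placer can
# yield different device placements.
from collections import deque

def bfs_order(graph, source):
    """Breadth-first traversal order over the computation graph."""
    seen, order, queue = {source}, [], deque([source])
    while queue:
        node = queue.popleft()
        order.append(node)
        for succ in graph.get(node, []):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return order

def dfs_order(graph, source):
    """Depth-first traversal order over the same graph."""
    seen, order, stack = set(), [], [source]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Push successors in reverse so the first successor is visited first.
        for succ in reversed(graph.get(node, [])):
            stack.append(succ)
    return order

def round_robin_placement(order, num_devices):
    """Greedy sequential placement: the i-th visited node goes to device i mod d."""
    return {node: i % num_devices for i, node in enumerate(order)}

# A small diamond-shaped computation graph: a -> {b, c} -> d
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}

print(round_robin_placement(bfs_order(graph, "a"), 2))  # {'a': 0, 'b': 1, 'c': 0, 'd': 1}
print(round_robin_placement(dfs_order(graph, "a"), 2))  # {'a': 0, 'b': 1, 'd': 0, 'c': 1}
```

Note that node `c` and node `d` land on different devices under the two orders even though the graph and the placer are identical, which is exactly why the choice of traversal order matters for the resulting training time.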
Keywords
Device Placement, Model Parallelization, Deep Learning, Graph Traversal Order