OpTree: An Efficient Algorithm for All-gather Operation in Optical Interconnect Systems

ICC 2023 - IEEE International Conference on Communications (2023)

Abstract
All-gather is one of the most important collective communication primitives in parallel and distributed computation, and it plays an essential role in many high-performance computing (HPC) applications, such as distributed deep learning (DL) with model and hybrid parallelism. Optical interconnection networks can relieve the All-gather communication bottleneck by providing very high bandwidth and reliability for data transfer among distributed nodes. However, most traditional All-gather algorithms are designed for electrical interconnects and fit optical interconnect systems poorly, resulting in poor performance. This paper proposes an efficient scheme, called OpTree, for the All-gather operation on optical interconnect systems. OpTree derives an optimal m-ary tree corresponding to the optimal number of communication stages, which achieves the minimum communication time. We further analyze the number of communication steps that OpTree requires and compare it with that of existing All-gather algorithms. Theoretical results show that OpTree requires far fewer communication steps than existing All-gather algorithms on optical interconnect systems. Simulation results show that OpTree can reduce communication time by 72.21%, 94.30%, and 88.58% compared with three existing All-gather schemes, WRHT, Ring, and NE, respectively.
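To give a feel for why the number of stages matters, the following minimal sketch contrasts the step count of the classic ring All-gather (N - 1 steps for N nodes) with the depth of a generic m-ary tree spanning N nodes. This is not the OpTree cost model: the abstract does not state how OpTree selects m or how it counts steps, and the function names `ring_allgather_steps` and `tree_stages` are hypothetical, introduced here only for illustration.

```python
def ring_allgather_steps(n: int) -> int:
    """Steps of the classic ring all-gather: one block forwarded per step."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n - 1


def tree_stages(n: int, m: int) -> int:
    """Smallest k with m**k >= n, i.e. the depth of an m-ary tree over n nodes.

    Assumption: a generic m-ary tree, not the paper's optimal-tree derivation.
    """
    if n < 1 or m < 2:
        raise ValueError("need n >= 1 and m >= 2")
    stages, reach = 0, 1
    # Each stage multiplies the number of nodes reached by m.
    while reach < n:
        reach *= m
        stages += 1
    return stages


if __name__ == "__main__":
    n = 1024
    print("ring steps:", ring_allgather_steps(n))
    for m in (2, 4, 32):
        print(f"m={m}: stages={tree_stages(n, m)}")
```

For 1024 nodes the ring needs 1023 steps, while the tree depth is 10, 5, and 2 for m = 2, 4, and 32. A larger m shortens the tree but requires more simultaneous transfers per stage, which is the kind of trade-off an optimal choice of m has to balance.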
Keywords
Optical interconnects, All-gather, communication, WDM