MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism
Ruidong Zhu, Ziheng Jiang, Chao Jin, Peng Wu, Cesar A. Stuardo, Dongyang Wang, Xinlei Zhang, Huaping Zhou, Haoran Wei, Yang Cheng, Jianzhe Xiao, Xinyi Zhang, Lingjun Liu, Haibin Lin, Li-Wen Chang, Jianxi Ye, Xiao Yu, Xuanzhe Liu, Xin Jin, Xin Liu. arXiv (2025)
