Towards Pareto Optimal Throughput in Small Language Model Serving
PROCEEDINGS OF THE 2024 4TH WORKSHOP ON MACHINE LEARNING AND SYSTEMS, EUROMLSYS 2024(2024)
关键词
Language Models,Inference Optimization
AI 理解论文
溯源树
样例

生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要
PROCEEDINGS OF THE 2024 4TH WORKSHOP ON MACHINE LEARNING AND SYSTEMS, EUROMLSYS 2024(2024)