Transcending Scaling Laws with 0.1% Extra Compute
Yi Tay, Jason Wei, Hyung Chung, Vinh Tran, David So, Siamak Shakeri, Xavier Garcia, Steven Zheng, Jinfeng Rao, Aakanksha Chowdhery, Denny Zhou, Donald Metzler, Slav Petrov, Neil Houlsby, Quoc Le, Mostafa Dehghani
EMNLP 2023 (2023)
Cited by 89 | Views 633
Keywords: language models, scaling laws, emergent abilities, efficiency, pretraining