Predictable Scale: Part I – Optimal Hyperparameter Scaling Law in Large Language Model Pretraining

Houyi Li, Wenzhen Zheng, Qiufeng Wang, Hanshan Zhang, Zili Wang, Shijie Xuyang, Yuantao Fan, Shuigeng Zhou, Xiangyu Zhang, Daxin Jiang

CoRR (2025)
