Basic Information
Bio
My research focuses on foundational machine learning theory, particularly the generalization theory of large-scale methods. Because foundational theory in large-scale machine learning lags behind empirical algorithm development, I aim to uncover the underlying principles, narrow the gap between theory and practical algorithms, and ultimately guide the design of large-scale algorithms that balance computational efficiency with generalization performance. Specific interests include:
Optimal Generalization Guarantees for Large-Scale ML: Investigating optimal generalization guarantees, relaxing the required assumptions, and enhancing large-scale algorithms such as federated learning, distributed learning, and random features (a minimal random-features sketch appears after this list).
Generalization Theory of Deep Neural Networks: Exploring connections between neural networks and kernel methods (illustrated by the kernel sketch after this list), studying generalization in non-stationary spectral kernel networks, refining current neural network models, and using random matrix theory to understand phenomena in deep networks.
(Future Direction) Fundamental Research on Large Language Models: Delving into the foundational theory of large language models to explain their distinctive capabilities, such as scaling laws (see the toy scaling-law fit after this list), in-context learning, and complex reasoning; improving model architectures for both computational efficiency and performance; and researching the next generation of efficient language models with fewer parameters.
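As a concrete illustration of the random-features direction above, here is a minimal sketch in the spirit of Rahimi and Recht's random Fourier features, showing how random features approximate a Gaussian kernel. This is illustrative only, not code from any specific paper; all function and variable names are invented for the example.

```python
import numpy as np

def random_fourier_features(X, n_features=256, sigma=1.0, seed=0):
    """Map X (n, d) to features whose inner products approximate the
    Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies drawn from the spectral density of the Gaussian kernel.
    W = rng.normal(0.0, 1.0 / sigma, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Sanity check: Z @ Z.T should be close to the exact RBF Gram matrix.
X = np.random.default_rng(1).normal(size=(5, 3))
Z = random_fourier_features(X, n_features=5000)
K = np.exp(-((X[:, None] - X[None]) ** 2).sum(-1) / 2.0)
print(np.abs(Z @ Z.T - K).max())  # small approximation error
```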
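To make the neural-network/kernel connection above concrete, the sketch below compares a very wide random ReLU layer against its infinite-width limit, the order-1 arc-cosine kernel of Cho and Saul, which is one standard way such connections are formalized. Again a sketch for illustration, not code from my papers.

```python
import numpy as np

def arccos_kernel(x, y):
    """Order-1 arc-cosine kernel: the exact value of
    E_w[relu(w @ x) * relu(w @ y)] with w ~ N(0, I), i.e. the limiting
    Gram function of an infinitely wide random ReLU layer."""
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    theta = np.arccos(np.clip(x @ y / (nx * ny), -1.0, 1.0))
    return nx * ny * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2.0 * np.pi)

# Monte Carlo check with a very wide random ReLU layer.
rng = np.random.default_rng(0)
x, y = rng.normal(size=3), rng.normal(size=3)
W = rng.normal(size=(1_000_000, 3))  # one million hidden units
empirical = np.mean(np.maximum(W @ x, 0.0) * np.maximum(W @ y, 0.0))
print(empirical, arccos_kernel(x, y))  # the two numbers agree closely
```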
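And as a toy picture of a scaling law, the following sketch fits a saturating power law L(N) = a * N^(-b) + c, the functional form commonly used in scaling-law studies, to loss-versus-parameter-count data. All numbers here are synthetic, made up purely for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(N, a, b, c):
    """Saturating power law L(N) = a * N**(-b) + c."""
    return a * N ** (-b) + c

# Synthetic loss-vs-parameter-count data (invented for this example).
N = np.array([1e6, 3e6, 1e7, 3e7, 1e8, 3e8, 1e9])
rng = np.random.default_rng(0)
loss = power_law(N, a=50.0, b=0.3, c=1.8) * (1.0 + 0.01 * rng.normal(size=N.size))

params, _ = curve_fit(power_law, N, loss, p0=(10.0, 0.5, 1.0))
print(params)  # recovered (a, b, c), close to the generating values
```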
Papers (68 total)
- AAAI 2024, no. 12 (2024): 13500-13508
- arXiv (2024)
- Applied Acoustics (2024): 110129
- CoRR (2024)
- ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8566-8570 (2024)
- AAAI 2024, no. 20 (2024): 22704-22704
- Digital Signal Processing, no. C (2024): 104530
- AAAI 2024, no. 12 (2024): 13509-13517
- IJCNN, pp. 1-6 (2023)
- J. Mach. Learn. Res. (2023): 141:1-141:39
Data Disclaimer
The data on this page come from open Internet sources, cooperating publishers, and automatic analysis by AI technology. We make no commitments or guarantees regarding the validity, accuracy, correctness, reliability, completeness, or timeliness of the data. If you have any questions, please contact us at report@aminer.cn.