Super Tiny Language Models
CoRR (2024)
Abstract
The rapid advancement of large language models (LLMs) has led to significant
improvements in natural language processing but also poses challenges due to
their high computational and energy demands. This paper introduces a series of
research efforts focused on Super Tiny Language Models (STLMs), which aim to
deliver high performance with significantly reduced parameter counts. We
explore innovative techniques such as byte-level tokenization with a pooling
mechanism, weight tying, and efficient training strategies. These methods
collectively reduce the parameter count by 90% to 95% compared to
traditional models while maintaining competitive performance. This series of
papers will explore various subproblems, including tokenizer-free models,
self-play based training, and alternative training objectives, targeting models
with 10M, 50M, and 100M parameters. Our ultimate goal is to make
high-performance language models more accessible and practical for a wide range
of applications.
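
To make the two most concrete parameter-saving ideas in the abstract tangible, here is a minimal PyTorch sketch of byte-level embeddings with a pooling mechanism plus weight tying between the input embedding and the output head. Everything here is an assumption for illustration: the class name `TinyByteLM`, the fixed-window mean pooling, and all hyperparameters are hypothetical and not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class TinyByteLM(nn.Module):
    """Illustrative sketch (not the paper's model) of two ideas:
    byte-level embeddings pooled into coarser units, and weight tying
    between the input embedding and the output projection."""

    def __init__(self, d_model: int = 256, pool: int = 4, vocab: int = 256):
        super().__init__()
        # A byte vocabulary has only 256 entries, so the embedding table
        # is tiny compared to a 30k-50k subword vocabulary.
        self.byte_embed = nn.Embedding(vocab, d_model)
        # Pooling mechanism (assumed here: mean over fixed-size byte
        # groups); the paper's actual pooling may differ.
        self.pool = pool
        self.backbone = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2,
        )
        # Weight tying: the output head reuses the embedding matrix, so
        # the (vocab x d_model) parameters are counted once, not twice.
        self.head = nn.Linear(d_model, vocab, bias=False)
        self.head.weight = self.byte_embed.weight

    def forward(self, byte_ids: torch.Tensor) -> torch.Tensor:
        # byte_ids: (batch, seq_len); seq_len must be divisible by pool.
        x = self.byte_embed(byte_ids)                             # (B, T, d)
        B, T, D = x.shape
        x = x.view(B, T // self.pool, self.pool, D).mean(dim=2)  # pooled
        x = self.backbone(x)
        return self.head(x)  # logits over the 256-byte vocabulary


model = TinyByteLM()
ids = torch.randint(0, 256, (2, 64))  # batch of 2 sequences, 64 bytes each
logits = model(ids)                   # (2, 16, 256) after 4x pooling
```

Both tricks attack the same cost center: in small models, the embedding and output-projection matrices dominate the parameter count, so shrinking the vocabulary to bytes and sharing one matrix for input and output removes most of that overhead.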