Could We Have Had Better Multilingual LLMs If English Was Not the Central Language?
arXiv (2024)
Abstract
Large Language Models (LLMs) demonstrate strong machine translation
capabilities on languages they are trained on. However, the impact of factors
beyond training data size on translation performance remains a topic of debate,
especially concerning languages not directly encountered during training. Our
study delves into Llama2's translation capabilities. By modeling a linear
relationship between linguistic feature distances and machine translation
scores, we ask whether there are central languages for LLMs that might serve
better than English. Our experiments show that the 7B Llama2 model achieves
BLEU scores above 10 when translating into every language it has seen, which
rarely happens for languages it has not seen. Most translation improvements into
unseen languages come from scaling up the model size rather than instruction
tuning or increasing shot count. Furthermore, our correlation analysis reveals
that syntactic similarity is not the only linguistic factor that strongly
correlates with machine translation scores. Interestingly, we discovered that
under specific circumstances, some languages (e.g., Swedish and Catalan), despite
having significantly less training data, exhibit comparable correlation levels
to English. These insights challenge the prevailing landscape of LLMs,
suggesting that models centered around languages other than English could
provide a more efficient foundation for multilingual applications.
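
As a rough illustration of the analysis the abstract describes, the sketch
below fits a linear model between linguistic feature distances and machine
translation scores and reports their Pearson correlation. This is a minimal
sketch, not the authors' code; the distance and BLEU values are hypothetical
placeholders, and only standard NumPy/SciPy calls are used.

    # Minimal sketch: linear fit and correlation between linguistic feature
    # distances and BLEU scores. All data below are hypothetical placeholders.
    import numpy as np
    from scipy.stats import pearsonr

    # Hypothetical syntactic distances from a candidate central language,
    # paired with hypothetical BLEU scores for translation into each language.
    syntactic_distance = np.array([0.10, 0.25, 0.40, 0.55, 0.70])
    bleu_scores = np.array([32.0, 27.5, 21.0, 14.5, 9.0])

    # Linear model: BLEU ~ a * distance + b (least-squares fit).
    a, b = np.polyfit(syntactic_distance, bleu_scores, deg=1)

    # Pearson correlation between distance and translation quality.
    r, p_value = pearsonr(syntactic_distance, bleu_scores)

    print(f"slope={a:.2f}, intercept={b:.2f}")
    print(f"Pearson r={r:.3f}, p={p_value:.3g}")

Under such a fit, a strongly negative r would indicate that the chosen feature
distance from the central language is predictive of translation quality, which
is the kind of evidence used to compare candidate central languages.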