Cross-lingual Transfer from Large Multilingual Translation Models to Unseen Under-resourced Languages

BALTIC JOURNAL OF MODERN COMPUTING (2022)

Abstract
Low-resource machine translation has been a challenging problem to solve, with the lack of data being a major obstacle to producing good-quality neural machine translation (NMT) systems. However, recent work on large multilingual translation models provides a platform for creating NMT systems of reasonable and usable quality for extremely low-resource languages. We leverage the information in large multilingual translation models by performing cross-lingual transfer learning to extremely low-resource Finno-Ugric languages. Our experiments include seven low-resource languages that are unseen by the original pre-trained translation model and five high-resource languages, previously seen by the model, that have the potential to help during training. We report state-of-the-art results on multiple test sets and translation directions, and we analyze the low-resource languages in smaller language groups in order to trace the source of our improved translation quality.
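As an illustration of the transfer-learning setup described in the abstract, the sketch below continues training a publicly available multilingual NMT model on a toy parallel corpus for a language pair it may treat as unseen. This is a minimal sketch under stated assumptions, not the authors' exact configuration: the model name (facebook/m2m100_418M), the proxy language codes, the learning rate, and the example sentences are all illustrative.

```python
# Minimal sketch of cross-lingual transfer: fine-tune a pre-trained
# multilingual NMT model on new parallel data. All names below are
# illustrative assumptions, not the paper's actual setup.
import torch
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model_name = "facebook/m2m100_418M"  # assumed stand-in for a large multilingual model
tokenizer = M2M100Tokenizer.from_pretrained(model_name)
model = M2M100ForConditionalGeneration.from_pretrained(model_name)

# A common workaround for a language with no dedicated code is to reuse
# the code of a related, seen language; Estonian/Finnish serve as proxies here.
tokenizer.src_lang = "et"
tokenizer.tgt_lang = "fi"

# Hypothetical toy parallel data standing in for the low-resource corpus.
src_sentences = ["Tere hommikust!"]
tgt_sentences = ["Hyvää huomenta!"]

batch = tokenizer(src_sentences, text_target=tgt_sentences,
                  return_tensors="pt", padding=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for step in range(3):  # a few illustrative steps; real fine-tuning runs far longer
    outputs = model(**batch)  # `labels` in the batch triggers cross-entropy loss
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(f"step {step}: loss = {outputs.loss.item():.4f}")
```

In practice, transfer to a genuinely unseen language typically also involves vocabulary adaptation and training on the full low-resource corpus; the proxy-code trick above is only one possible entry point.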
Keywords
multilingual, cross-lingual transfer learning, low-resource, Finno-Ugric languages