Exploring Representational Disparities Between Multilingual and Bilingual Translation Models
arXiv (2023)
Abstract
Multilingual machine translation has proven immensely useful for both
parameter efficiency and overall performance across many language pairs via
complete multilingual parameter sharing. However, some language pairs in
multilingual models can see worse performance than in bilingual models,
especially in the one-to-many translation setting. Motivated by these empirical
differences, we examine the geometric differences in representations from
bilingual models versus those from one-to-many multilingual models.
Specifically, we compute the isotropy of these representations using intrinsic
dimensionality and IsoScore, in order to measure how the representations
utilize the dimensions in their underlying vector space. Using the same
evaluation data for both models, we find that for a given language pair, its
multilingual model decoder representations are consistently less isotropic and
occupy fewer dimensions than comparable bilingual model decoder
representations. Additionally, we show that much of the anisotropy in
multilingual decoder representations can be attributed to modeling
language-specific information, therefore limiting remaining representational
capacity.
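The isotropy measurement described above can be illustrated with a minimal sketch. The function below is a simplified reimplementation of the IsoScore idea (Rudman et al.): reorient a cloud of representation vectors along its principal axes, compare the per-axis variances to a perfectly uniform variance profile, and rescale the resulting defect into a [0, 1] score. This is an illustrative sketch, not the authors' exact evaluation code; the function name and sample data are assumptions for the example.

```python
import numpy as np

def isoscore(points: np.ndarray) -> float:
    """Simplified IsoScore-style isotropy measure (illustrative sketch).

    `points` is a (num_points, dim) array of representation vectors.
    Returns a score in [0, 1]: 1 means all dimensions carry equal
    variance (isotropic), 0 means a single dimension dominates.
    """
    n = points.shape[1]
    # Variances along the principal axes = eigenvalues of the covariance
    # matrix (clipped, since tiny negative values can appear numerically).
    cov = np.cov(points - points.mean(axis=0), rowvar=False)
    var = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    # Normalize the variance vector to length sqrt(n), so a perfectly
    # isotropic cloud maps to the all-ones vector.
    var_norm = np.sqrt(n) * var / np.linalg.norm(var)
    # Isotropy defect: normalized distance from the all-ones vector.
    delta = np.linalg.norm(var_norm - np.ones(n)) / np.sqrt(2 * (n - np.sqrt(n)))
    # Rescale so the final score lies in [0, 1].
    k = (n - delta**2 * (n - np.sqrt(n))) ** 2 / n
    return (k - 1) / (n - 1)

# Isotropic Gaussian uses all 10 dimensions; a rank-1 cloud uses only one.
rng = np.random.default_rng(0)
iso = isoscore(rng.normal(size=(5000, 10)))
aniso = np.zeros((5000, 10))
aniso[:, 0] = rng.normal(size=5000)
```

Applied to decoder hidden states, a lower score for the multilingual model than for its bilingual counterpart on the same evaluation data would reflect the paper's finding that multilingual decoder representations occupy fewer effective dimensions.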