Enhancing Document-level Translation of Large Language Model via Translation Mixed-instructions
CoRR (2024)
Abstract
Existing large language models (LLMs) for machine translation are typically
fine-tuned on sentence-level translation instructions and achieve satisfactory
performance at the sentence level. However, when applied to document-level
translation, these models face a significant challenge, particularly when
dealing with documents containing over 512 tokens. This challenge arises from
the issue of sentence-level coverage, where subsequent sentences in the
document remain untranslated. As a result, the document-level translation
capability of LLMs fine-tuned on sentence-level translation instructions is
significantly limited. We conjecture that the primary cause of LLMs' weak
document-level translation performance is the absence of document-to-document
mapping ability. To address the issue, we propose an approach that combines
sentence-level and document-level translation instructions of varying lengths
to fine-tune LLMs. Our proposed translation mixed-instructions enable LLMs
(Llama-2 7B and 13B) to maintain consistent translation performance from the
sentence level to documents containing as many as 2048 tokens. Extensive
experimental results show that the proposed approach significantly enhances the
document-level translation capabilities of LLMs on 10 language pairs,
effectively mitigating the sentence-level coverage issue in document-level
translation. Experiments on discourse phenomena further show that our
document-level translation approach significantly improves translation quality,
in terms of both BLEU score and discourse coherence.
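
The mixing of sentence-level and document-level instructions described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the prompt template, the whitespace token count, the greedy chunking, and the names `chunk_document` and `build_mixed_instructions` are all hypothetical, and the token budgets (512, 1024, 2048) are chosen only to echo the lengths mentioned in the abstract.

```python
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)

# Hypothetical prompt template; the abstract does not specify the wording.
TEMPLATE = "Translate the following text from {src} to {tgt}.\n{text}"


def count_tokens(text: str) -> int:
    # Assumption: whitespace splitting stands in for the model's tokenizer.
    return len(text.split())


def chunk_document(doc: Sequence[Pair], max_tokens: int) -> List[List[Pair]]:
    """Greedily group consecutive sentence pairs so that the source side of
    each chunk stays within max_tokens. A single sentence longer than the
    budget becomes a chunk on its own."""
    chunks: List[List[Pair]] = []
    current: List[Pair] = []
    used = 0
    for src, tgt in doc:
        n = count_tokens(src)
        if current and used + n > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append((src, tgt))
        used += n
    if current:
        chunks.append(current)
    return chunks


def build_mixed_instructions(
    docs: Sequence[Sequence[Pair]],
    src_lang: str,
    tgt_lang: str,
    budgets: Sequence[int] = (512, 1024, 2048),  # assumed length buckets
) -> List[Dict[str, str]]:
    examples: List[Dict[str, str]] = []
    for doc in docs:
        # Sentence-level instructions: one example per aligned pair.
        for src, tgt in doc:
            prompt = TEMPLATE.format(src=src_lang, tgt=tgt_lang, text=src)
            examples.append({"prompt": prompt, "response": tgt, "level": "sentence"})
        # Document-level instructions at several length budgets.
        seen = set()  # avoid emitting the same chunk for two budgets
        for budget in budgets:
            for chunk in chunk_document(doc, budget):
                if len(chunk) < 2:
                    continue  # already covered by the sentence-level examples
                src_text = " ".join(s for s, _ in chunk)
                if src_text in seen:
                    continue
                seen.add(src_text)
                tgt_text = " ".join(t for _, t in chunk)
                prompt = TEMPLATE.format(src=src_lang, tgt=tgt_lang, text=src_text)
                examples.append({"prompt": prompt, "response": tgt_text, "level": "document"})
    return examples
```

In a real setup the token count would come from the fine-tuned model's own tokenizer, and the resulting examples would typically be shuffled before fine-tuning so that the model sees both granularities throughout training.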