Automatic translation memory cleaning

Machine Translation(2017)

引用 3|浏览67
暂无评分
摘要
We address the problem of automatically cleaning a translation memory (TM) by identifying problematic translation units (TUs). In this context, we treat as “problematic TUs” those containing useless translations from the point of view of the user of a computer-assisted translation tool. We approach TM cleaning both as a supervised and as an unsupervised learning problem. In both cases, we take advantage of Translation Memory open-source purifier, an open-source TM cleaning tool also presented in this paper. The two learning paradigms are evaluated on different benchmarks extracted from MyMemory, the world’s largest public TM. Our results indicate the effectiveness of the supervised approach in the ideal condition in which labelled training data is available, and the viability of the unsupervised solution for challenging situations in which training data is not accessible.
更多
查看译文
关键词
Translation memories,Machine learning,Data cleaning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要