A maximum entropy based reordering model for Mongolian-Chinese SMT with morphological information

IALP (2014)

Abstract
Word-order differences between Mongolian and Chinese and the scarcity of parallel corpora are the main problems in Mongolian-Chinese statistical machine translation (SMT). We propose a method that uses morphological information as features in a maximum entropy based phrase reordering model for Mongolian-Chinese SMT. Exploiting Mongolian morphology, we add Mongolian stems and affixes as phrase boundary information and use a maximum entropy model to predict the reordering of neighboring blocks. To some extent, this alleviates the effect of data sparseness on reordering. We further add part-of-speech (POS) tags as features in the reordering model. Experiments show that the approach outperforms a maximum entropy model that uses only boundary-word information, with a maximum improvement of 0.8 BLEU over the baseline.
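The idea of predicting block order from boundary-word stems, affixes, and POS tags can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes two orientation labels (`straight` and `inverted`), a hypothetical feature template, and placeholder tokens (`s1`, `+A1`, `V`, and so on) in place of real Mongolian analyses. The classifier is a small multinomial logistic regression, which is the standard form of a maximum entropy model, trained with plain gradient ascent.

```python
import math
from collections import defaultdict

LABELS = ("straight", "inverted")  # assumed orientation classes


def boundary_features(left_block, right_block):
    """Turn the boundary words of two neighboring blocks into feature strings.

    Each block is a dict with hypothetical keys 'stem', 'affix', and 'pos'
    describing its source-side boundary word.
    """
    feats = []
    for side, block in (("L", left_block), ("R", right_block)):
        feats.append(f"{side}.stem={block['stem']}")
        feats.append(f"{side}.affix={block['affix']}")
        feats.append(f"{side}.pos={block['pos']}")
    return feats


def predict_proba(weights, feats):
    """Return p(label | feats) under the log-linear model."""
    scores = {y: sum(weights[(f, y)] for f in feats) for y in LABELS}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}


def train(examples, epochs=200, lr=0.1, l2=0.01):
    """Fit weights by stochastic gradient ascent on the L2-regularized log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in examples:
            probs = predict_proba(weights, feats)
            for f in feats:
                for y in LABELS:
                    grad = (1.0 if y == gold else 0.0) - probs[y]
                    weights[(f, y)] += lr * (grad - l2 * weights[(f, y)])
    return weights


def block(stem, affix, pos):
    return {"stem": stem, "affix": affix, "pos": pos}


# Toy training data with placeholder tokens: every stem is seen only once,
# mimicking sparse lexical evidence, while affix and POS recur.
toy = [
    (boundary_features(block("s1", "+A1", "V"), block("s2", "+A0", "N")), "inverted"),
    (boundary_features(block("s3", "+A1", "V"), block("s4", "+A0", "N")), "inverted"),
    (boundary_features(block("s5", "+A2", "N"), block("s6", "+A0", "N")), "straight"),
    (boundary_features(block("s7", "+A2", "N"), block("s8", "+A0", "V")), "straight"),
]

model = train(toy)

# Query with stems never seen in training: only affix and POS carry signal.
query = boundary_features(block("s9", "+A1", "V"), block("s10", "+A0", "N"))
print(predict_proba(model, query))
```

In this toy setup the unseen stems contribute nothing, so the prediction rests on the affix and POS features alone. That is the sense in which morphological features can soften data sparseness: they recur across many surface words even when the words themselves are rare.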
Keywords
BLEU score, boundary words information, morphological, data sparseness, part-of-speech, phrase reordering model, reordering prediction, morphological information, language translation, POS, phrase boundary information, Mongolian-Chinese SMT, Mongolian-Chinese statistical machine translation, maximum entropy methods, parallel corpus, natural language processing, Mongolian affix, maximum entropy, reordering, machine translation, Mongolian stem