Syntactic Parsing Of Clause Constituents For Statistical Machine Translation

International Journal of Computational Science and Engineering (2018)

Abstract
In linguistics, the clause is regarded as the basic unit of grammar, a structure that sits between a chunk and a sentence. Clause constituents are therefore an important kind of linguistically valid syntactic phrase. This paper uses a conditional random fields (CRFs) model to recognise English clause constituents together with their syntactic functions, and tests their effect on machine translation by applying this syntactic information to an English-Chinese phrase-based statistical machine translation (PBSMT) system, evaluated on a business-domain corpus. Clause constituents are classified into six main kinds: subject, predicate, complement, adjunct, residues of predicate, and residues of complement. Results show that our rich-feature CRFs model achieves an F-measure of 93.31%, a precision of 93.26%, and a recall of 93.04%. This source-language syntactic knowledge is then combined with the NiuTrans phrasal SMT system, which slightly improves English-Chinese translation accuracy.
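The abstract does not list the features used by the CRFs model, so the following is only a minimal sketch of how per-token features for clause-constituent labelling might be assembled. The label names, the feature names, and the one-token context window are all assumptions made for illustration; they are not taken from the paper.

```python
# Hypothetical tags for the six clause-constituent kinds named in the abstract.
# The tag names themselves are an assumption, not the paper's notation.
LABELS = ["SUBJ", "PRED", "COMP", "ADJ", "PRED_RES", "COMP_RES"]


def token_features(tokens, pos_tags, i):
    """Build a feature dict for token i using a one-token context window."""
    feats = {
        "word.lower": tokens[i].lower(),
        "pos": pos_tags[i],
        "is_first": i == 0,
        "is_last": i == len(tokens) - 1,
    }
    # Left-context features, only when a previous token exists.
    if i > 0:
        feats["prev.word.lower"] = tokens[i - 1].lower()
        feats["prev.pos"] = pos_tags[i - 1]
    # Right-context features, only when a following token exists.
    if i < len(tokens) - 1:
        feats["next.word.lower"] = tokens[i + 1].lower()
        feats["next.pos"] = pos_tags[i + 1]
    return feats


def sentence_features(tokens, pos_tags):
    """Return one feature dict per token, in sentence order."""
    return [token_features(tokens, pos_tags, i) for i in range(len(tokens))]
```

A list of such feature dicts per sentence, paired with one label per token, is the input shape that common CRF toolkits expect, which is why the sketch stops at feature extraction and leaves training to whichever toolkit is chosen.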
Keywords
syntactic parsing, clause constituents, phrase-based statistical machine translation, PBSMT