Neural Machine Translation for Sinhala-English Code-Mixed Text.

RANLP(2021)

Abstract
Code-mixing has become a common method of communication among multilingual speakers. Much of the social media content in multilingual societies is written in code-mixed text. However, most current translation systems do not convert code-mixed text to a standard language, and most user-written code-mixed content on social media remains unprocessed because linguistic resources such as parallel corpora are unavailable. This paper proposes a Neural Machine Translation (NMT) model to translate Sinhala-English code-mixed (SECM) text into the Sinhala language. Sri Lankan social media sites contain SECM text more frequently than text in the standard languages. Because few resources are available for SECM text, a parallel corpus of SECM sentences and Sinhala sentences is created. The proposed model combines an Encoder-Decoder framework built from LSTM units with the teacher forcing algorithm. The sentences translated by the model are evaluated using the BLEU (Bilingual Evaluation Understudy) metric. Our model achieved a remarkable BLEU score for the translation.
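The abstract names teacher forcing without describing it. As a rough illustration only, and not the authors' implementation, the sketch below shows the usual data preparation for it: during training the decoder receives the reference target sentence shifted right by one position, so at each step it is conditioned on the correct previous token rather than on its own prediction. The function name `teacher_forcing_pairs`, the `<sos>`/`<eos>` markers, and the placeholder tokens are all assumptions made for this example.

```python
def teacher_forcing_pairs(target_tokens, sos="<sos>", eos="<eos>"):
    """Build the decoder input and expected output for one target sentence.

    Hypothetical helper: with teacher forcing, the decoder input is the
    reference sentence prefixed with a start marker, and the expected
    output is the same sentence followed by an end marker.
    """
    decoder_input = [sos] + list(target_tokens)
    decoder_target = list(target_tokens) + [eos]
    return decoder_input, decoder_target


# Placeholder tokens standing in for a tokenized Sinhala reference sentence.
tokens = ["w1", "w2", "w3"]
dec_in, dec_out = teacher_forcing_pairs(tokens)

# At each step the decoder is given the reference token from the previous
# position and is trained to predict the token at the current position.
for step, (given, expected) in enumerate(zip(dec_in, dec_out)):
    print(step, given, expected)
```

At inference time no reference sentence exists, so the decoder would instead be fed its own previous prediction, starting from the start marker.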
Keywords
translation, Sinhala-English, code-mixed