Automatic back transliteration of Romanized Bengali (Banglish) to Bengali

Iran J. Comput. Sci. (2022)

Abstract
Back transliteration of Romanized Bengali to Bengali is the process of converting Bengali text written in the Latin alphabet back into the Bengali script. It is often done to improve readability for Bengali speakers, typically with a simple rule-based system or an interactive transliteration tool. There are many ways to back transliterate Romanized Bengali to Bengali, but most of them are either grapheme based or phoneme based. This paper introduces a pipeline that uses nine open-source back transliteration tools to automatically back transliterate Romanized Bengali to Bengali. The pipeline consists of seven steps: (1) processing the Romanized Bengali input; (2) acquiring human transliterations for performance comparison; (3) employing the transliteration tools; (4) generating candidate transliterations; (5) post-processing the candidate transliterations; (6) selecting the best candidate transliteration; and (7) evaluating the quality of the transliterations with several performance metrics. Experimental results on 1000 Romanized Bengali texts show that our approach produced the highest BLEU-1 score of 81.28, BLEU-2 score of 60.75, BLEU-3 score of 44.45, and BLEU-4 score of 30.46, and the lowest average Word Error Rate and Word Information Lost of 29.21 and 43.68, respectively. In terms of recall, we achieved a ROUGE-L score of 0.7190.
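As an illustration of the evaluation step, the Word Error Rate mentioned in the abstract can be sketched as the word-level edit distance between a candidate transliteration and the human reference, divided by the number of reference words. The following is a minimal sketch under stated assumptions, not the paper's implementation: the function name `word_error_rate` is hypothetical, tokenization is assumed to be plain whitespace splitting, and the result is a fraction rather than a percentage.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; whitespace tokenization is an
    assumption, not something the abstract specifies.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Identical strings give an error rate of zero.
    print(word_error_rate("আমি ভাত খাই", "আমি ভাত খাই"))
```

Multiplying the returned fraction by 100 gives a percentage, and averaging over all evaluated texts gives a corpus-level figure.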
Keywords
Transliteration, Back transliteration, Romanized Bengali, Banglish, GPT-3, Google Translate, TextRank, Bengali phonetic parser