Tunisian Arabic Chat Alphabet Transliteration Using Probabilistic Finite State Transducers

INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY (2019)

Abstract
The Internet plays a growing role in Tunisians' lives, especially since the 2011 revolution. Tunisian Internet users increasingly use social networks, blogs, and similar platforms, where they favor the Tunisian Arabic chat alphabet, a Latin-script way of writing Tunisian Arabic. However, few tools have been developed for processing Tunisian Arabic in this context. In this paper, we propose a transliteration system from the Tunisian Arabic chat alphabet to Tunisian Arabic, based on weighted finite state transducers and using a Tunisian Arabic lexicon, aebWordNet (aeb is the ISO 639-3 code for Tunisian Arabic), and a Tunisian Arabic morphological analyzer. Weighted finite state transducers allow us to model the transcription behavior of Tunisian Internet users when they write chat alphabet texts; this writing has no standard format but follows a regular relation. The system then uses aebWordNet and the morphological analyzer to validate the generated transliterations. Our approach achieves good results compared with existing Arabic chat alphabet-to-Arabic transliteration tools such as EiKtub.
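The pipeline described in the abstract (weighted rewriting of chat-alphabet symbols, followed by lexicon validation) can be illustrated with a minimal Python sketch. This is not the authors' system and does not use a real finite state transducer toolkit: the rule table `RULES`, its probabilities, and the one-word `LEXICON` are hypothetical stand-ins for the learned transducer weights and for aebWordNet, and morphological analysis is omitted.

```python
# Hypothetical weighted rewrite rules (assumption, not from the paper):
# Latin chat-alphabet symbol -> list of (Arabic output, probability).
RULES = {
    "k": [("ك", 1.0)],
    "t": [("ت", 0.8), ("ط", 0.2)],
    "a": [("ا", 0.5), ("", 0.5)],  # a short vowel is often left unwritten
    "b": [("ب", 1.0)],
    "ch": [("ش", 1.0)],
    "3": [("ع", 1.0)],
    "7": [("ح", 1.0)],
}

# Toy stand-in for a lexicon such as aebWordNet (assumption).
LEXICON = {"كتاب"}


def candidates(word):
    """Enumerate every (arabic, probability) rewriting of `word`."""
    if not word:
        return [("", 1.0)]
    results = []
    for src, outputs in RULES.items():
        if word.startswith(src):
            # Rewrite the matched prefix, then recurse on the remainder.
            for tail, p_tail in candidates(word[len(src):]):
                for out, p in outputs:
                    results.append((out + tail, p * p_tail))
    return results


def transliterate(word, lexicon=LEXICON):
    """Rank candidates by probability and keep only those found in the lexicon."""
    scored = {}
    for arabic, p in candidates(word):
        # Different rule paths may yield the same string; sum their probabilities.
        scored[arabic] = scored.get(arabic, 0.0) + p
    ranked = sorted(scored.items(), key=lambda kv: -kv[1])
    return [(arabic, p) for arabic, p in ranked if arabic in lexicon]


if __name__ == "__main__":
    print(transliterate("ktab"))
```

Under these toy rules, the input `ktab` yields four candidate spellings, and only the one present in the lexicon survives validation. A word containing a symbol with no rule produces no candidates at all.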
Keywords
Tunisian Arabic chat alphabet, Tunisian Arabic, transliteration, aebWordNet, Tunisian Arabic morphological analyzer, weighted finite state transducer