Deep learning for Ethiopian Ge'ez script optical character recognition

2018 Tenth International Conference on Advanced Computational Intelligence (ICACI), 2018

Abstract
Ethiopic, or Ge'ez, is a Semitic language that was historically spoken in the Horn of Africa, in Ethiopia, but nowadays it is used only as the language of liturgy in the Ethiopian and Eritrean Orthodox Tewahdo Church. Much of the history, science and art of Ethiopia has been documented in this language, and almost all of these documents are found in the church. In this paper, Ge'ez Optical Character Recognition (OCR) using a Convolutional Neural Network and a feed-forward Multi-Layer Perceptron is proposed. The work was carried out in two steps. First, a Multi-Layer Perceptron neural network was developed to compare four widely used Ethiopic typefaces: Ethiopian Jiret, Ebrima, Nyala and Abyssinica SIL. Images of 26 Ge'ez script characters written in these typefaces were given to the neural network for training, and scanned documents were then given as input to check which typeface's characters are recognized and classified. As a result, characters written in the Ethiopian Jiret typeface were recognized very well. In the second step, many images of strings were generated in the Ethiopian Jiret typeface using a standard Python codec registry, and these images were then used as input to compare the Convolutional Neural Network with the Multi-Layer Perceptron. The results are promising and suggest a direction for future work.
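The string-generation step can be illustrated with a minimal sketch. It is not the authors' code: the five-character alphabet, the helper names `make_strings` and `encode_labels`, and the choice of UTF-8 are all assumptions made for illustration. It draws random Ethiopic strings, looks up a codec in Python's standard codec registry, and derives integer class labels. Rendering each string to an image would additionally need an imaging library and the typeface file, so that step is left out.

```python
import codecs
import random
import unicodedata

# Assumption: a small illustrative alphabet of first-order Ethiopic syllables
# (HA, LA, MA, RA, SA), not the paper's actual 26-character set.
ALPHABET = [chr(cp) for cp in (0x1200, 0x1208, 0x1218, 0x1228, 0x1230)]


def make_strings(n, length, seed=0):
    """Return n random strings of the given length drawn from ALPHABET."""
    rng = random.Random(seed)
    return ["".join(rng.choice(ALPHABET) for _ in range(length)) for _ in range(n)]


def encode_labels(text):
    """Map each character to its integer class index in ALPHABET."""
    index = {ch: i for i, ch in enumerate(ALPHABET)}
    return [index[ch] for ch in text]


# Look up the UTF-8 codec in the standard codec registry.
utf8 = codecs.lookup("utf-8")

samples = make_strings(n=3, length=4)
for s in samples:
    raw, consumed = utf8.encode(s)  # (bytes, number of characters consumed)
    names = [unicodedata.name(ch) for ch in s]
    print(s, len(raw), encode_labels(s), names[0])
```

In a full pipeline, each generated string would be drawn onto an image in the chosen typeface, and the label list would serve as the training target for the classifier.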
Keywords
optical character recognition,convolutional neural network,feed forward,back propagation,Ge'ez