Robust Arabic and Pashto Text Detection in Camera-Captured Documents Using Deep Learning Techniques

IEEE Access (2023)

Abstract
In Document Image Analysis (DIA), the primary objective is to transform image data into a format that machines can readily interpret. Within a DIA-based system, layout analysis plays a crucial pre-processing role in identifying and extracting precise, error-free textual segments. However, document images in the Pashto language have not been explored so far. Pashto text detection in camera-captured documents is challenging due to variations in image quality, lighting conditions, complex backgrounds, the unavailability of labeled documents, cursiveness, shape-context dependency, multiple scripts per image, and language-specific layouts. This research examines the case of Pashto and Arabic text and contributes in two respects. First, it introduces a real dataset of 1080 images of Pashto documents captured with a handheld camera. Second, it examines deep-learning-based classifiers that can perform layout analysis and detect Pashto and Arabic text in each document. For layout classification, we used the deep learning models Single-Shot Detector (SSD), YOLOv5, and YOLOv7. Baseline results are obtained using 30% of the images as a test set, with a mean average precision (mAP) of 84.51% for SSD, 88.50% for YOLOv5, and 91.30% for YOLOv7. The proposed methods have the potential to contribute to various applications, such as document analysis, information retrieval, and translation, for Pashto and Arabic language users.
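To make the evaluation split concrete, the following minimal sketch holds out 30% of 1080 images as a test set. The abstract does not say how the split was drawn, so the random shuffle, the seed, the `split_dataset` helper, and the `img_XXXX.jpg` file names are all assumptions for illustration, not the authors' procedure.

```python
import random


def split_dataset(image_ids, test_fraction=0.30, seed=0):
    """Shuffle image identifiers and split them into (train, test) lists.

    Assumption: a simple seeded random split; the paper's actual
    splitting procedure is not described in the abstract.
    """
    ids = list(image_ids)
    rng = random.Random(seed)  # local RNG so the split is reproducible
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]


# Hypothetical file names standing in for the 1080 camera-captured images.
image_ids = [f"img_{i:04d}.jpg" for i in range(1080)]
train_ids, test_ids = split_dataset(image_ids)
print(len(train_ids), len(test_ids))  # 756 324
```

Under these assumptions, 324 images form the test set and the remaining 756 are available for training and validation.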
Keywords
Layout,Text analysis,Electronic mail,Cameras,Deep learning,Annotations,Classification algorithms,Document image analysis,Pashto,Arabic,CNN,text detection,dataset,deep learning models