Localization of Pashto Text in the Video Frames Using Deep Learning

Advances in Cybersecurity, Cybercrimes, and Smart Emerging Technologies (2023)

Abstract
Object detection remains an attractive and challenging task for the computer vision research community. Along with other objects, researchers have tried to detect text in images and videos. Earlier approaches used handcrafted features for this purpose. These features have low discriminative power, which leads to poor performance of the underlying machine learning model. Adding more features to boost discriminative ability increases the dimensionality of the data, and the performance of conventional machine learning usually falls as dimensionality grows. Deep learning can learn features by itself, which is known as representation learning, and it also performs better on high-dimensional data due to its data-hungry nature. A deep neural network is an end-to-end system that is fully automated and does not require any handcrafting. Text in Arabic, Urdu, and a few other languages has previously been detected in videos, but those works mostly relied on handcrafted features for localization, which perform poorly on high-dimensional data. Pashto, being a superset of Arabic, Urdu, and Persian, has remained unattended by researchers. The contribution of this work is twofold: (i) dataset generation and annotation, and (ii) the use of a deep learning model for Pashto text localization. Since this is pioneering work on Pashto text localization, no comparison with the state of the art is conducted. We obtained good results, with an IoU above 80% and a recall of 0.98.
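The abstract reports IoU and recall but does not say how they were computed. As an illustration only, the sketch below shows the usual definitions for axis-aligned boxes. The box format `(x_min, y_min, x_max, y_max)`, the function names `iou` and `recall`, and the 0.5 matching threshold are assumptions made for this example, not details taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped to zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def recall(ground_truth, predictions, threshold=0.5):
    """Fraction of ground-truth boxes overlapped by some prediction with IoU >= threshold.

    Simplification (assumed for this sketch): one prediction may match several
    ground-truth boxes; stricter protocols enforce one-to-one matching.
    """
    if not ground_truth:
        return 0.0
    matched = sum(
        1 for gt in ground_truth
        if any(iou(gt, p) >= threshold for p in predictions)
    )
    return matched / len(ground_truth)
```

For example, `iou((0, 0, 10, 10), (5, 0, 15, 10))` gives an overlap of 50 over a union of 150, that is, one third.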
Keywords
Localization,Pashto,Deep learning,YOLO,Darknet