Recognition of Off-line Handwritten Uyghur Words Using Bayesian Networks with Grapheme Nodes

SN Comput. Sci. (2020)

Abstract
This study proposes a new algorithm that recognizes off-line handwritten Uyghur words by constructing a word-level Bayesian network (BN) whose nodes are graphemes. First, we build a Uyghur script grapheme library according to the rules and morphological structure of Uyghur. The library comprises three categories: main graphemes, affix graphemes, and dot graphemes. Second, we segment word images into grapheme sequences by extracting, segmenting, and clustering the individual strokes. Third, we design grapheme-specific feature extractors and classifiers so that small differences between similar words can be detected and identified. Finally, we construct a hierarchical matching model over graphemes, conjoined segments, and words using a discrete BN. The BN infers word categories from grapheme features, calculates the confidence of each inference, and integrates the grapheme recognition results with prior word-formation information to obtain the final word recognition result. In experiments on a database of 12,500 samples, with a total of 58 trained grapheme categories, the algorithm achieves a word recognition rate of 91.65%. These results indicate that the proposed algorithm not only provides a high word recognition rate by effectively avoiding character over-segmentation errors, but also requires only a small and fully predictable number of training categories, which makes it easy to extend.
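
The final step, in which grapheme recognition results are combined with word-formation priors, can be illustrated with a minimal sketch. It is not the authors' model: it assumes a much simpler two-level discrete network in which a word node determines its grapheme sequence and each observed grapheme label depends only on the true grapheme through a classifier confusion table. The function name `word_posterior`, the grapheme labels `A`, `B`, `C`, the two-word lexicon, and all probabilities are hypothetical values chosen for illustration.

```python
from math import prod

def word_posterior(observed, lexicon, prior, confusion):
    """Return P(word | observed grapheme labels) over the lexicon.

    Assumed model (illustrative only): P(word) * product over positions
    of P(observed label | true grapheme), normalized over candidate words.
    """
    scores = {}
    for word, graphemes in lexicon.items():
        # Only words with the same number of graphemes can match.
        if len(graphemes) != len(observed):
            continue
        likelihood = prod(
            confusion[g].get(o, 0.0) for g, o in zip(graphemes, observed)
        )
        scores[word] = prior[word] * likelihood
    total = sum(scores.values())
    if total == 0:
        return {}  # no lexicon word explains the observation
    return {word: score / total for word, score in scores.items()}

# Hypothetical lexicon: each word is a sequence of grapheme labels.
lexicon = {"w1": ["A", "B"], "w2": ["A", "C"]}
# Hypothetical word-formation prior.
prior = {"w1": 0.5, "w2": 0.5}
# Hypothetical classifier confusion table: P(observed label | true grapheme).
confusion = {
    "A": {"A": 0.9, "B": 0.05, "C": 0.05},
    "B": {"B": 0.8, "C": 0.2},
    "C": {"C": 0.7, "B": 0.3},
}

posterior = word_posterior(["A", "B"], lexicon, prior, confusion)
best_word = max(posterior, key=posterior.get)
confidence = posterior[best_word]  # posterior of the top word as a confidence score
```

In this sketch the confidence of an inference is simply the normalized posterior of the best-scoring word; the paper's hierarchical model additionally has an intermediate level for conjoined segments, which is omitted here.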
Keywords
Computer application, Text recognition, Uyghur language, Off-line handwritten words, Bayesian network, Grapheme