Improving Handwriting Recognition for Historical Documents Using Synthetic Text Lines.

IGS (2022)

Abstract
Automatic handwriting recognition for historical documents is a key element for making our cultural heritage available to researchers and the general public. However, current approaches based on machine learning require a considerable amount of annotated learning samples to read ancient scripts and languages. Producing such ground truth is a laborious and time-consuming task that often requires human experts. In this paper, to cope with a limited amount of learning samples, we explore the impact of using synthetic text line images to support the training of handwriting recognition systems. For generating text lines, we consider lineGen, a recent GAN-based approach, and for handwriting recognition, we consider HTR-Flor, a state-of-the-art recognition system. Different meta-learning strategies are explored that schedule the addition of synthetic text line images to the existing real samples. In an experimental evaluation on the well-known Bentham dataset as well as the newly introduced Bullinger dataset, we demonstrate a significant improvement of the recognition performance when combining real and synthetic samples.
Keywords
handwriting recognition, historical documents, text, lines