A comparison of sequential and combined approaches for named entity recognition in a corpus of handwritten medieval charters

2020 17th International Conference on Frontiers in Handwriting Recognition (ICFHR) (2020)

Abstract
This paper introduces a new corpus of multilingual medieval handwritten charter images, annotated with full transcriptions and named entities. The corpus is used to compare two approaches to named entity recognition in historical document images in several languages. The first is the more commonly used sequential approach, which applies handwritten text recognition (HTR) and then named entity recognition (NER). The second is a combined approach that simultaneously transcribes the text line image and extracts the entities. Experiments on the charter corpus in Latin, Early New High German and Old Czech for name, date and location recognition show that the combined approach performs better.
Keywords
Named entity recognition, handwritten text recognition, historical document processing, multilingualism