AI-assisted digitalisation of historical documents

29TH CIPA SYMPOSIUM DOCUMENTING, UNDERSTANDING, PRESERVING CULTURAL HERITAGE. HUMANITIES AND DIGITAL TECHNOLOGIES FOR SHAPING THE FUTURE, VOL. 48-M-2(2023)

Abstract
Preserving historical archival heritage involves not only physical measures to safeguard these valuable texts but also providing for their digital preservation. However, merely digitising manuscripts and codexes is not enough. A further step is needed: the digitalisation of their content, i.e. the verbatim transcription of scanned texts. This process enables the accurate preservation of their textual content, making it easier to search for information and conduct further analyses. With the help of artificial intelligence, particularly Deep Neural Networks (DNNs), automatic handwriting recognition can be performed. In this study, we employed a Convolutional Recurrent Neural Network (CRNN), an established type of DNN, to determine the minimum amount of labelled data required to automatically transcribe five different historical datasets that vary in language and time period. The results show that a Character Error Rate (CER) lower than 10% can be achieved with just a few hundred labelled text lines in almost all cases.
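The abstract reports results as a Character Error Rate (CER) but does not spell out how it is computed. The sketch below assumes the common definition: the character-level edit distance between the reference transcription and the model output, divided by the number of reference characters. Pooling edits and characters over all lines before dividing is also an assumption, and the function names `levenshtein` and `character_error_rate` are illustrative rather than taken from the paper.

```python
from typing import Sequence


def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn `hyp` into `ref` (classic dynamic programming)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # reference character missing from hypothesis
                curr[j - 1] + 1,     # extra character in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """CER pooled over all text lines (assumed convention):
    total edit distance divided by total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars
```

Under this definition, a CER below 10% means that fewer than one character edit is needed per ten reference characters, on average, to correct the automatic transcription.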
Keywords
Historical Documents, Handwriting, Digitisation, Digitalisation, Cultural Heritage, Preservation