Automatic recognition system for document digitization in nuclear power plants

Nuclear Engineering and Design(2022)

Abstract
With the increasing number of data-driven models in nuclear applications, large volumes of numerical data are required to accurately model and predict the health status of a plant component. However, many historical operation logs that contain useful information are not fully utilized because there is no systematic approach to digitizing them. To overcome this issue, this study proposes an automatic pipeline for extracting information from handwritten tabular documents collected from nuclear power plants. In our pipeline, we first denoise scanned documents with morphological operations, and then extract the relevant parts of individual pages using both traditional computer vision and neural network methods. Handwriting recognition is then applied to obtain text and numbers. Because the most challenging step is cropping only the relevant information, the main focus of our paper is detecting tables and cells in scanned handwritten documents. We evaluate the efficiency and accuracy of our proposed method on handwritten operational reports obtained from a real-world case study. The results demonstrate the high accuracy and practicality of our proposed method.
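The denoising and table-detection steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binarized page (1 = ink, 0 = background) and uses a common traditional technique, morphological opening with long horizontal and vertical line elements, to keep ruled table lines while discarding speckle noise and short handwriting strokes. All function names and the minimum line length `k` are hypothetical choices made for illustration.

```python
from typing import List

Image = List[List[int]]  # binary image: 1 = ink, 0 = background (assumed input format)


def erode_row(row: List[int], k: int) -> List[int]:
    # Erosion with a 1 x k line element anchored at its left end:
    # a pixel survives only if it starts a run of k ink pixels.
    n = len(row)
    return [1 if x + k <= n and all(row[x:x + k]) else 0 for x in range(n)]


def dilate_row(row: List[int], k: int) -> List[int]:
    # Dilation with the reflected element: a pixel is set if any of the
    # k pixels ending at it (inclusive) is set in the eroded row.
    n = len(row)
    return [1 if any(row[max(0, x - k + 1):x + 1]) else 0 for x in range(n)]


def open_horizontal(img: Image, k: int) -> Image:
    # Opening = erosion followed by dilation; keeps horizontal ink runs of length >= k.
    return [dilate_row(erode_row(row, k), k) for row in img]


def open_vertical(img: Image, k: int) -> Image:
    # Reuse the horizontal opening on the transposed image, then transpose back.
    transposed = [list(col) for col in zip(*img)]
    opened = open_horizontal(transposed, k)
    return [list(row) for row in zip(*opened)]


def table_line_mask(img: Image, k: int) -> Image:
    # Union of long horizontal and long vertical strokes: a rough table-line mask.
    horizontal = open_horizontal(img, k)
    vertical = open_vertical(img, k)
    return [[a | b for a, b in zip(hr, vr)] for hr, vr in zip(horizontal, vertical)]


if __name__ == "__main__":
    page = [
        [1, 1, 1, 1, 1, 1],
        [1, 0, 0, 0, 0, 0],
        [1, 0, 0, 1, 0, 0],  # the isolated 1 stands in for speckle noise
        [1, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1],
    ]
    for row in table_line_mask(page, k=4):
        print(row)
```

In a practical setting the same idea is usually applied with an image-processing library on full-resolution scans, and the resulting line mask is then used to locate cell boundaries before handwriting recognition; the choice of `k` would depend on scan resolution and table size.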
Keywords
Nuclear power plants, Computer vision, Image processing, Neural networks, Table detection, Handwritten documents