Optical Character Recognition for Audio-Visual Broadcast Transcription System

2020 11th IEEE International Conference on Cognitive Infocommunications (CogInfoCom), 2020

Abstract
This paper investigates the use of optical character recognition (OCR) in an audio-visual broadcast transcription system. Characters were recognized from video frames with the open-source OCR program Tesseract. From version 4 onward, the OCR in this program is based on recurrent neural networks (RNNs) and post-processes the text with a bigram language model. Even so, the recognized text contains a number of errors: in some images the text is detected but recognized incorrectly, and in others it is not detected at all. We designed and tested image pre-processing and text post-processing methods to reduce OCR errors. The word error rate (WER) was reduced from 29.4% to 15.4%.
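The abstract reports results as word error rate but does not say how it was computed. The sketch below uses the common definition, assumed here rather than taken from the paper: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the OCR output, divided by the number of reference words. The function name `word_error_rate` and the sample strings are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Assumes whitespace tokenization and case-sensitive matching; the paper's
    own normalization rules are not stated in the abstract.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_error_rate("breaking news from prague", "breaking new from prague"))
```

Under this definition, the reported drop from 29.4% to 15.4% is an absolute reduction of 14.0 percentage points, or a relative reduction of about 48%.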
Key words
OCR, Broadcast Transcription, Text Post-processing