Scene text recognition in multiple frames based on text tracking

ICME (2014)

Citations: 24 | Views: 55
Abstract
Text signage, as a visual indicator in natural scenes, plays an important role in navigation and notification in daily life. Most previous scene text extraction methods operate on a single scene image. In this paper, we propose a multi-frame scene text recognition method that tracks text regions in a video captured by a moving camera. The main contributions of this paper are as follows. First, we present a framework for scene text recognition in multiple frames, based on a feature representation of scene text characters (STC) for character prediction and a conditional random field (CRF) model for word configuration. Second, the STC feature representation is built from densely sampled SIFT descriptors and Fisher Vector. Third, we collect a dataset for text information extraction from natural scene videos. Our proposed multi-frame scene text recognition is more compatible with image/video-based mobile applications. The experimental results demonstrate that STC prediction and word configuration in multiple frames based on text tracking significantly improve the performance of scene text recognition.
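The idea of combining character predictions along a text track with a sequence model for word configuration can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes per-frame character probabilities are already available (the numbers below are invented), fuses them by averaging log-probabilities over the tracked frames, and decodes with Viterbi on a linear chain as a simplified stand-in for the CRF. The names `fuse_tracked_scores`, `viterbi`, and `decode_word`, and the bigram bonus values, are hypothetical.

```python
import math

def fuse_tracked_scores(frames):
    """Average log-probabilities per character position over all frames of one track.

    frames[f][p][k] is the (assumed) probability of label k at position p in frame f.
    """
    n_frames, n_pos, n_labels = len(frames), len(frames[0]), len(frames[0][0])
    fused = []
    for p in range(n_pos):
        row = []
        for k in range(n_labels):
            total = sum(math.log(max(frame[p][k], 1e-12)) for frame in frames)
            row.append(total / n_frames)
        fused.append(row)
    return fused

def viterbi(unary, pairwise):
    """Best label sequence on a linear chain: unary[p][k] plus pairwise[k_prev][k]."""
    n_labels = len(unary[0])
    best = list(unary[0])
    back = []
    for p in range(1, len(unary)):
        new_best, pointers = [], []
        for k in range(n_labels):
            prev = max(range(n_labels), key=lambda j: best[j] + pairwise[j][k])
            new_best.append(best[prev] + pairwise[prev][k] + unary[p][k])
            pointers.append(prev)
        best = new_best
        back.append(pointers)
    k = max(range(n_labels), key=lambda j: best[j])
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    return path[::-1]

def decode_word(frames, labels, pairwise):
    """Fuse scores across the tracked frames, then decode the word."""
    path = viterbi(fuse_tracked_scores(frames), pairwise)
    return "".join(labels[k] for k in path)

labels = ["A", "C", "T"]
# Three tracked frames of the same word; the second is blurred (invented scores).
frames = [
    [[0.1, 0.8, 0.1], [0.7, 0.2, 0.1], [0.1, 0.1, 0.8]],
    [[0.2, 0.6, 0.2], [0.2, 0.5, 0.3], [0.2, 0.2, 0.6]],
    [[0.1, 0.7, 0.2], [0.6, 0.1, 0.3], [0.1, 0.2, 0.7]],
]
no_prior = [[0.0] * 3 for _ in range(3)]
# Hypothetical bigram bonuses favouring "CA" and "AT".
bigram_prior = [[0.0, 0.0, 0.5], [0.5, 0.0, 0.0], [0.0, 0.0, 0.0]]

single_frame_word = decode_word([frames[1]], labels, no_prior)
multi_frame_word = decode_word(frames, labels, bigram_prior)
print(single_frame_word, multi_frame_word)
```

In this toy input the blurred frame alone misreads the middle character, while the scores fused over the whole track recover the intended word. A real system would replace the invented probabilities with classifier outputs on STC features and learn the pairwise terms from data.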
Keywords
feature representation, text information extraction, scene text character, tracking text regions, text tracking, image-based mobile applications, natural scene videos, multiframe based scene text recognition method, multiframe scene text recognition, multiple frames, video dataset of scene text, text detection, scene text recognition, conditional random field model, feature representation of scene text character, video-based mobile applications, text signage, Fisher vector, text extraction, SIFT descriptors, accuracy, feature extraction, predictive models, vectors, encoding, trajectory