Weakly Supervised Text Attention Network for Generating Text Proposals in Scene Images

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) (2017)

Abstract
Detection and recognition of textual information in scene images are useful but challenging tasks. Numerous methods have been proposed to solve the problem, and the best recent results are attained by methods based on deep neural networks. Training such networks requires large amounts of data annotated at the bounding-box or pixel level. Producing such annotations requires a great deal of labor, which can be expensive and time-consuming. In this paper we explore the use of a weakly supervised deep neural network for generating text proposals in natural scene images. The network accepts multi-scale inputs and is trained to perform whole-image binary classification, deciding whether an image contains text or not. After training, the network has learned powerful discriminative features that are capable of distinguishing text from other objects. To locate the text, a text confidence score map is generated from the feature maps of the top two convolutional layers by extracting the class activation map. The value of each pixel in the score map denotes the confidence that the pixel belongs to text. By applying a threshold, the score map is converted to a binary mask map, whose foreground regions are probable text areas. Maximally Stable Extremal Regions (MSERs) are then extracted from these probable text areas and aggregated into groups. By processing these groups, text proposals are obtained. Experimental results show that, without using any bounding-box or pixel-level annotation, the algorithm achieves a recall rate comparable to that of some fully supervised methods on the ICDAR 2013 focused text dataset and the ICDAR 2015 incidental text dataset.
Keywords
scene text, weak supervision, convolutional neural network, object proposals, spatial pyramid pooling
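
The score-map and thresholding steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the standard class activation map formulation (a weighted sum of feature maps using the classifier weights of the text class), uses a single convolutional layer rather than the top two, and the function names, array shapes, and the 0.5 threshold are hypothetical choices made for illustration.

```python
import numpy as np


def class_activation_map(features, class_weights):
    """Weighted sum of feature maps for one class.

    features:      array of shape (C, H, W), conv feature maps (assumed layout)
    class_weights: array of shape (C,), classifier weights for the text class
    """
    # Contract over the channel axis to get one (H, W) score map.
    return np.tensordot(class_weights, features, axes=([0], [0]))


def normalize(score_map):
    """Rescale a score map linearly to the range [0, 1]."""
    lo, hi = score_map.min(), score_map.max()
    if hi == lo:
        # A constant map carries no localization signal.
        return np.zeros_like(score_map, dtype=float)
    return (score_map - lo) / (hi - lo)


def binary_mask(score_map, threshold=0.5):
    """Foreground (True) marks pixels whose confidence reaches the threshold."""
    return score_map >= threshold


# Toy example: two 2x2 feature maps, each firing at a different location.
features = np.array([
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
])
class_weights = np.array([1.0, 0.5])

scores = normalize(class_activation_map(features, class_weights))
mask = binary_mask(scores, threshold=0.5)
```

In a full pipeline the mask would be upsampled to the input image size before region extraction; the MSER extraction and grouping stages are omitted here because the abstract does not specify their details.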