Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
CVPR '14: Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition (2014)
Abstract
Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012, achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Since we combine region proposals with CNNs, we call our method R-CNN: Regions with CNN features. We also present experiments that provide insight into what the network learns, revealing a rich hierarchy of image features. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.
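The "more than 30% relative" figure is easy to misread as 30 percentage points. As a rough illustration only, the sketch below works out what the stated numbers imply: a mAP of 53.3% with a relative gain of at least 30% puts the previous best below about 41%. The function names are hypothetical, and the abstract does not state the baseline score; the bound is derived purely from the two figures quoted above.

```python
def relative_improvement(new_map: float, old_map: float) -> float:
    """Relative gain of new_map over old_map as a fraction (0.30 means 30%)."""
    if old_map <= 0:
        raise ValueError("old_map must be positive")
    return (new_map - old_map) / old_map


def implied_baseline_bound(new_map: float, min_relative_gain: float) -> float:
    """Largest previous score consistent with a gain of at least min_relative_gain."""
    return new_map / (1.0 + min_relative_gain)


R_CNN_MAP = 53.3   # mAP on VOC 2012, as stated in the abstract
MIN_GAIN = 0.30    # "more than 30% relative", as stated in the abstract

# Upper bound on the previous best mAP; an inference, not a reported number.
bound = implied_baseline_bound(R_CNN_MAP, MIN_GAIN)
print(f"previous best mAP below about {bound:.1f}%")
```

Read this way, the reported gain corresponds to an absolute improvement of at least roughly 12 mAP points.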
Keywords
image segmentation, neural nets, object detection, R-CNN, auxiliary task, bottom-up region proposal, canonical PASCAL VOC dataset, detection algorithm, domain-specific fine-tuning, high-capacity convolutional neural network, image features, labeled training data, low-level image feature, mAP, mean average precision, object detection performance, performance boost, rich feature hierarchy, segment objects, semantic segmentation, source code, supervised pre-training