Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

CVPR '14 Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition(2014)

引用 36337|浏览2178
暂无评分
摘要
Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Since we combine region proposals with CNNs, we call our method R-CNN: Regions with CNN features. We also present experiments that provide insight into what the network learns, revealing a rich hierarchy of image features. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.
更多
查看译文
关键词
image segmentation,neural nets,object detection,R-CNN,auxiliary task,bottom-up region proposal,canonical PASCAL VOC dataset,detection algorithm,domain-specific fine-tuning,high-capacity convolutional neural network,image features,labeled training data,low-level image feature,mAP,mean average precision,object detection performance,performance boost,rich feature hierarchy,segment objects,semantic segmentation,source code,supervised pretraining
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要