UniDetector: Towards Universal Object Detection with Heterogeneous Supervision.

IEEE Transactions on Pattern Analysis and Machine Intelligence (2024)

Abstract
In this paper, we formally address universal object detection, which aims to detect every category in every scene. The dependence on human annotations, the limited visual information, and the novel categories in the open world severely restrict the universality of detectors. We propose UniDetector, a universal object detector that recognizes an enormous number of categories in the open world. The critical points of UniDetector are: 1) it leverages images from multiple sources and heterogeneous label spaces in training through image-text alignment, which guarantees sufficient information for universal representations; 2) it involves heterogeneous supervision training, which alleviates the dependence on the limited supply of fully labeled images; 3) it generalizes to the open world easily while keeping the balance between seen and unseen classes; 4) it further promotes generalization to novel categories through our proposed decoupled training manner and probability calibration. These contributions allow UniDetector to detect over 7k categories, the largest measurable size so far, with only about 500 classes participating in training. UniDetector exhibits strong zero-shot ability on large-vocabulary datasets: it surpasses supervised baselines by more than 5% without seeing any corresponding images. On 13 detection datasets with various scenes, UniDetector also achieves state-of-the-art performance with only 3% of the training data.
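To make points 1) and 4) concrete, the following is a minimal sketch of classification by image-text alignment followed by a prior-based probability calibration. The abstract does not spell out either computation, so everything here is an assumption for illustration: the function names (`class_scores`, `calibrate`), the temperature, the exponent `gamma`, the toy 2-D embeddings, and the choice to estimate each class prior as the mean score over all regions. It should not be read as the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def class_scores(region_emb, text_embs, temperature=0.1):
    """Score one region against every category name embedding.

    Assumed form: sigmoid of temperature-scaled cosine similarity.
    Because categories are represented by text embeddings, new category
    names can be added at inference time without retraining.
    """
    return [
        1.0 / (1.0 + math.exp(-cosine(region_emb, t) / temperature))
        for t in text_embs
    ]

def calibrate(scores_per_region, gamma=0.5, eps=1e-8):
    """Divide each class score by (estimated class prior) ** gamma.

    Assumed prior: the mean score of that class over all regions.
    Classes that are predicted often are damped relative to rare ones.
    """
    num_regions = len(scores_per_region)
    num_classes = len(scores_per_region[0])
    priors = [
        sum(s[j] for s in scores_per_region) / num_regions
        for j in range(num_classes)
    ]
    return [
        [s[j] / (priors[j] + eps) ** gamma for j in range(num_classes)]
        for s in scores_per_region
    ]

# Toy data (hypothetical): two category text embeddings, three region embeddings.
text_embs = [[1.0, 0.0], [0.0, 1.0]]          # e.g. "cat", "dog"
regions = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]

raw = [class_scores(r, text_embs) for r in regions]
calibrated = calibrate(raw)
```

In this toy setup the first category is predicted more often than the second, so its estimated prior is larger and its scores are scaled up by a smaller factor. That is the intended effect of this kind of calibration: reducing the bias toward frequently predicted (typically seen) classes.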
Key words
Open World and Universal Object Detection, Heterogeneous Label Spaces, Heterogeneous Supervision Learning