Million-scale Object Detection with Large Vision Model

arxiv(2023)

引用 0|浏览84
暂无评分
摘要
Over the past few years, there has been growing interest in developing a broad, universal, and general-purpose computer vision system. Such a system would have the potential to solve a wide range of vision tasks simultaneously, without being restricted to a specific problem or data domain. This is crucial for practical, real-world computer vision applications. In this study, we focus on the million-scale multi-domain universal object detection problem, which presents several challenges, including cross-dataset category label duplication, label conflicts, and the need to handle hierarchical taxonomies. Furthermore, there is an ongoing challenge in the field to find a resource-efficient way to leverage large pre-trained vision models for million-scale cross-dataset object detection. To address these challenges, we introduce our approach to label handling, hierarchy-aware loss design, and resource-efficient model training using a pre-trained large model. Our method was ranked second in the object detection track of the Robust Vision Challenge 2022 (RVC 2022). We hope that our detailed study will serve as a useful reference and alternative approach for similar problems in the computer vision community. The code is available at https://github.com/linfeng93/Large-UniDet.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要