Scale-Balanced Real-Time Object Detection With Varying Input-Image Resolution

IEEE Transactions on Circuits and Systems for Video Technology (2023)

Abstract
Current object-detection methods often perform poorly on small-scale objects. Using relatively high-resolution input images can remedy this, but it usually degrades performance on large-scale objects. We define this problem as the imbalance of detection performance for multi-scale objects when the resolution of input images varies. In addition, high-resolution images consume significant computational resources and slow down inference. In this paper, we propose an object-detection method for multi-scale objects that is friendly to varying input resolutions. We analyze in detail why the detection of large-scale objects degrades as input-image resolution increases, and propose a novel lightweight bidirectional feature-flow module to enhance multi-scale object detection in high-resolution images, especially for large-scale objects. The proposed approach also eases the computational resource consumption and inference-speed impairment caused by high-resolution images. Additionally, a decoupled detection head is designed to further improve performance by separating the classification and regression sub-tasks, and an adaptive feature-fusion module is designed to better fuse different feature levels. The proposed scheme alleviates the negative effects of using high-resolution input images and achieves an excellent balance between inference speed and precision. Experiments on the MS COCO dataset show that the scheme achieves 44.6 AP at 42.6 FPS and 47 AP at 26.7 FPS, showing significant advantages over the methods to which it is compared.
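To make the reported speed/precision trade-off concrete, the two operating points in the abstract can be converted into approximate per-image latencies. The sketch below is illustrative arithmetic only: it assumes one image is processed at a time, so that latency is simply the reciprocal of the frame rate, which the abstract does not state. The names `OPERATING_POINTS`, `latency_ms`, and `compare` are hypothetical and do not come from the paper.

```python
# Operating points reported in the abstract (MS COCO), stored as (AP, FPS).
# The dictionary keys are illustrative labels, not names used by the authors.
OPERATING_POINTS = {
    "faster": (44.6, 42.6),
    "more_accurate": (47.0, 26.7),
}


def latency_ms(fps: float) -> float:
    """Per-image latency in milliseconds implied by a throughput in FPS.

    Assumption: images are processed one at a time, so latency = 1000 / FPS.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def compare(points: dict) -> dict:
    """Compare the two operating points: AP gained versus latency added."""
    ap_fast, fps_fast = points["faster"]
    ap_acc, fps_acc = points["more_accurate"]
    extra_ms = latency_ms(fps_acc) - latency_ms(fps_fast)
    ap_gain = ap_acc - ap_fast
    return {
        "latency_fast_ms": latency_ms(fps_fast),
        "latency_accurate_ms": latency_ms(fps_acc),
        "extra_latency_ms": extra_ms,
        "ap_gain": ap_gain,
        "ap_gain_per_extra_ms": ap_gain / extra_ms,
    }


if __name__ == "__main__":
    for name, value in compare(OPERATING_POINTS).items():
        print(f"{name}: {value:.2f}")
```

Under that assumption, the faster setting corresponds to roughly 23.5 ms per image and the more accurate one to roughly 37.5 ms, so the additional 2.4 AP costs about 14 ms of extra latency per image.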
Keywords
Deep convolution neural network (CNN), object detection, multi-scale features fusion