Improving RGB-infrared object detection with cascade alignment-guided transformer

Maoxun Yuan, Xiaorong Shi, Nan Wang, Yinyan Wang, Xingxing Wei

Information Fusion (2024)

Abstract
The integration of multispectral data in object detection, especially visible and infrared images, has attracted considerable attention recently. Complementary information from visible (RGB) and infrared (IR) images can ameliorate the challenges posed by variable lighting conditions, making such image pairs an invaluable resource in many tasks, including RGB-IR object detection, RGB-IR semantic segmentation, and RGB-IR crowd counting. However, existing methods still suffer from weak misalignment and fusion imprecision problems, both of which present significant challenges for accurate object detection. In this paper, our primary focus is to solve these problems in RGB-IR object detection tasks. Specifically, we first propose a Translation-Scale-Rotation Alignment (TSRA) module to align the features of the two modalities from region proposals. Based on the aligned region features, we introduce a Complementary Fusion Transformer (CFT) module to capture the complementary features. These two modules can be coupled in a unified Region of Interest (RoI) detection head called Cascade Alignment-Guided Transformer (CAGT) to obtain robust fused features. Finally, based on CAGT, a region feature alignment and fusion detector called CAGTDet is constructed for RGB-IR object detection. Comprehensive experiments on the aerial DroneVehicle dataset show that our method effectively mitigates the impact of these two issues, yielding robust detection results. Moreover, to evaluate the generalization of our method, we also perform experiments on natural images sampled from the KAIST multispectral pedestrian dataset. The results show that our method surpasses other state-of-the-art methods.
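As a rough illustration of what aligning a region proposal by translation, scale, and rotation can mean geometrically, the sketch below applies a set of offsets to a rotated box. This is not the TSRA module from the paper: the `RotatedBox` type, the `apply_tsr_offsets` function, and the offset parametrization (size-relative shifts, log-scale factors, an additive angle) are assumptions borrowed from common box-regression conventions, and in a real detector the offsets would be predicted by a network rather than given by hand.

```python
import math
from dataclasses import dataclass


@dataclass
class RotatedBox:
    """Hypothetical rotated region proposal: center, size, angle in radians."""
    cx: float
    cy: float
    w: float
    h: float
    angle: float


def apply_tsr_offsets(box, dx, dy, dw, dh, dtheta):
    """Apply translation, scale, and rotation offsets to a box.

    Assumed parametrization (not taken from the paper):
    - dx, dy shift the center in units of the box width and height,
    - dw, dh rescale the size through exp(), so 0.0 means "unchanged",
    - dtheta is added to the box angle.
    """
    return RotatedBox(
        cx=box.cx + dx * box.w,
        cy=box.cy + dy * box.h,
        w=box.w * math.exp(dw),
        h=box.h * math.exp(dh),
        angle=box.angle + dtheta,
    )


# Example: shift a proposal from one modality so it better matches the other.
rgb_proposal = RotatedBox(cx=10.0, cy=20.0, w=4.0, h=2.0, angle=0.0)
aligned = apply_tsr_offsets(rgb_proposal, dx=0.5, dy=-1.0, dw=0.0, dh=0.0, dtheta=0.1)
print(aligned)
```

With these offsets the center moves by half a box width in x and one box height in y, the size is unchanged, and the box is rotated by 0.1 radians.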
Key words
RGBT object detection, Multi-modal learning, Cross-modal alignment, RGBT pedestrian detection