
Cross-Modal Feature Fusion and Interaction Strategy for CNN-Transformer-Based Object Detection in Visual and Infrared Remote Sensing Imagery

Jinyan Nie, He Sun, Xu Sun, Li Ni, Lianru Gao

IEEE Geoscience and Remote Sensing Letters (2024)

Abstract
Because visible and infrared images are complementary, fusing the two modalities has become a favored way to improve object detection accuracy in remote sensing. However, several problems remain. Most existing algorithms focus too heavily on local information and ignore long-range information when extracting features from each modality. In addition, coarse weighted fusion strategies do not fully utilize the information from the different modalities, and the fusion structure ignores the importance of intermodal information exchange. To tackle these problems, a cross-modal feature fusion and interaction strategy for convolutional neural network (CNN)-transformer-based object detection in visual and infrared remote sensing imagery is proposed. We adopt a parallel structure to extract the features of each modality separately. In both the visible and infrared modalities, convolutional layers and transformer encoders are cascaded to extract both local and long-range information. The cross-modal feature fusion and interaction module (CFFIM) uses attention mechanisms to jointly fuse features from the different modalities at the same scale, improving the diversity of the fused features, and the feature interaction enables visible and infrared information to be shared. Experiments on the VEDAI dataset demonstrate the effectiveness of the proposed scheme compared with other state-of-the-art algorithms.
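
The abstract does not specify how the CFFIM is built, so the following is only a minimal sketch of the general idea it describes: features from the two modalities at the same scale exchange information through attention and are then fused. It assumes plain scaled dot-product cross-attention with no learned projections, a residual addition for the interaction step, and channel-wise concatenation for the fusion step. All function names (`cross_attention`, `interact_and_fuse`) are hypothetical and are not taken from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Each query token attends over all tokens of the other modality.

    Assumption: keys and values are the raw tokens of the other modality
    (no learned projection matrices), which keeps the sketch self-contained.
    """
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in keys_values
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, keys_values)) for j in range(dim)]
        )
    return attended


def interact_and_fuse(visible, infrared):
    """Hypothetical interaction-then-fusion step for same-scale token lists."""
    # Interaction: each modality receives context from the other (residual add).
    vis_ctx = cross_attention(visible, infrared)
    ir_ctx = cross_attention(infrared, visible)
    vis_out = [[a + b for a, b in zip(t, c)] for t, c in zip(visible, vis_ctx)]
    ir_out = [[a + b for a, b in zip(t, c)] for t, c in zip(infrared, ir_ctx)]
    # Fusion: concatenate the two modalities channel-wise, token by token.
    fused = [v + i for v, i in zip(vis_out, ir_out)]
    return vis_out, ir_out, fused


if __name__ == "__main__":
    visible_tokens = [[1.0, 0.0], [0.0, 1.0]]
    infrared_tokens = [[0.5, 0.5], [0.5, 0.5]]
    _, _, fused_tokens = interact_and_fuse(visible_tokens, infrared_tokens)
    print(fused_tokens)
```

In this sketch both modalities must provide the same number of tokens with the same feature dimension, which mirrors the "same scale" condition in the abstract. A real detector would use learned query, key, and value projections and multi-head attention rather than the raw tokens used here.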
Keywords
Feature extraction, Object detection, Transformers, Visualization, Convolution, Sun, Remote sensing, Feature fusion, Vision transformer, Visual and infrared remote sensing imagery