MFINet: A Novel Zero-Shot Remote Sensing Scene Classification Network Based on Multimodal Feature Interaction.

Xiaomeng Tan, Bobo Xi, Haitao Xu, Yunsong Li, Changbin Xue, Jocelyn Chanussot

IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. (2024)

Abstract
Zero-shot classification models aim to recognize image categories that are not included in the training phase by learning seen scenes with semantic information. This approach is particularly useful in remote sensing since it can identify previously unseen classes. However, most zero-shot remote sensing scene classification approaches focus on matching visual and semantic features, while disregarding the importance of visual feature extraction, especially regarding local-global joint information. Furthermore, the visual and semantic relationships have not been thoroughly investigated due to the separate analysis of these features. To address these issues, we propose a novel zero-shot remote sensing scene classification network based on multimodal feature interaction (MFINet). Specifically, the MFINet deploys hybrid image feature extraction networks, combining convolutional neural networks and an improved Transformer, to capture local discriminant information and long-range contextual information, respectively. Notably, we design a cross-modal feature fusion (CMFF) module to facilitate the multimodal feature interaction, thereby enhancing relevant information in both the visual and semantic domains. Extensive experiments are conducted on the public zero-shot remote sensing scene dataset, and the results consistently demonstrate that our proposed MFINet outperforms the state-of-the-art methods across various seen/unseen category ratios.
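As a rough illustration of the visual-semantic matching step that the abstract says most zero-shot approaches rely on, the sketch below assigns an image embedding to the unseen class whose semantic vector is most similar under cosine similarity. It is not the MFINet architecture or its CMFF module; the function names, class names, and toy vectors are all hypothetical, and a real system would obtain the embeddings from trained visual and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_predict(visual_embedding, class_semantics):
    """Return the class whose semantic vector best matches the image embedding.

    class_semantics maps class name -> semantic vector (e.g. a word embedding).
    The classes here are assumed to be unseen during training.
    """
    return max(
        class_semantics,
        key=lambda name: cosine(visual_embedding, class_semantics[name]),
    )


# Hypothetical toy data: 3-dimensional semantic vectors for unseen scene classes.
unseen_classes = {
    "airport": [0.9, 0.1, 0.0],
    "forest": [0.0, 0.2, 0.9],
    "harbor": [0.5, 0.8, 0.1],
}

# Hypothetical image embedding, already projected into the semantic space.
image_embedding = [0.1, 0.3, 0.8]

prediction = zero_shot_predict(image_embedding, unseen_classes)
print(prediction)
```

Because the classifier only compares against semantic vectors, adding a new unseen class needs no retraining, only a new entry in the dictionary.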
Key words
Zero-shot learning, remote sensing scene classification, improved Transformer, cross-modal feature fusion