Prior Knowledge-Guided Transformer for Remote Sensing Image Captioning

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING (2023)

Abstract
Remote sensing image (RSI) captioning aims to generate meaningful and grammatically accurate sentences for RSIs. However, in comparison to natural image captioning, RSI captioning encounters additional challenges due to the unique characteristics of RSIs. The first challenge arises from the abundance of objects present in these images. As the number of objects increases, it becomes increasingly difficult to determine the main focus of the description. Moreover, the objects in RSIs often share similar appearances, which further complicates the generation of accurate descriptions. To overcome these challenges, we propose a prior knowledge-guided transformer (PKG-Transformer) for RSI captioning. First, scene-level and object-level features are extracted in a multilevel feature extraction (MFE) module. To further refine and enhance the extracted multilevel features, we introduce a feature enhancement (FE) module. This module combines graph neural networks and attention mechanisms to capture the correlations and differences between different objects or scene regions. Moreover, we propose a prior knowledge augmented attention (PKA) mechanism that establishes relationships between objects and scene regions in order to select the objects most relevant to each scene region. This attention mechanism is integrated into the transformer structure, where it supplies prior knowledge that guides the caption generation process. Extensive experiments on three RSI captioning datasets show that the proposed method outperforms the baseline methods. The code will be publicly available at https://github.com/One-paper-luck/PKG-Transformer
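The abstract does not give the PKA formulation, so the following is only an illustrative sketch of one common way to inject prior knowledge into attention: adding a bias term to the attention logits before the softmax. The function name `prior_biased_attention`, the additive form of the prior, and all shapes are assumptions made for illustration. They are not taken from the paper or its repository.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def prior_biased_attention(scene, objects, prior):
    """Scene regions attend over object features (hypothetical sketch).

    scene:   (R, d) scene-region features used as queries
    objects: (O, d) object features used as keys and values
    prior:   (R, O) additive bias on the logits (assumed form of the prior)
    """
    d = scene.shape[-1]
    logits = scene @ objects.T / np.sqrt(d) + prior
    weights = softmax(logits, axis=-1)   # each row sums to 1
    return weights @ objects, weights

rng = np.random.default_rng(0)
scene = rng.normal(size=(4, 8))      # 4 scene regions, 8-dim features
objects = rng.normal(size=(6, 8))    # 6 detected objects

no_prior = np.zeros((4, 6))
prior = np.zeros((4, 6))
prior[:, 2] = 5.0                    # hypothetical prior favouring object 2

_, w_plain = prior_biased_attention(scene, objects, no_prior)
attended, w_prior = prior_biased_attention(scene, objects, prior)
```

With a positive bias on one object, every scene region places more attention weight on that object than it would without the prior, which is the selection effect the abstract describes in general terms.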
Keywords
Feature extraction, Transformers, Remote sensing, Task analysis, Iron, Decoding, Convolutional neural networks, Image captioning, prior knowledge, remote sensing, transformer