Capturing Global and Local Information in Remote Sensing Visual Question Answering.

IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2022

Abstract
Much current research on remote sensing imagery aims to extract image content but ignores interaction with the images. A remote sensing visual question answering (VQA) system automatically produces an answer from the content of an input image and a related question arising in a remote sensing application. Global scenes and salient local objects are both essential in most remote sensing applications, which motivated us to design a VQA model that can handle the change in scale from global scenes to local targets in remote sensing imagery. In this paper, we present a new model, Global-Local Visual Question Answer (GLVQA), and a dataset of global and local question-answer pairs for remote sensing images. GLVQA integrates features from different scales to improve performance on both global and local questions. Evaluated on our GLVQA dataset, GLVQA achieves a validation accuracy of 83.6%, which suggests that the method has potential for remote sensing VQA applications.