A multimodal fusion framework for urban scene understanding and functional identification using geospatial data

International Journal of Applied Earth Observation and Geoinformation (2024)

Abstract
Urban scene understanding and functional identification are essential for accurately characterizing spatial structure and optimizing city layouts during rapid urbanization. Multimodal data are important for recognizing the distribution patterns of urban functions and revealing their internal details. Previous studies have focused primarily on remote sensing imagery and points-of-interest (POI) data, overlooking the role of building characteristics in determining the functions of urban scenes; they are also limited in how they mine and fuse multimodal features. To address these challenges, this study proposes a multimodal fusion framework that integrates remote sensing imagery, POIs, and building footprints for urban scene understanding and functional mapping. The framework employs a dual-branch model that extracts visual semantic features from the remote sensing imagery and socioeconomic features from auxiliary data such as POIs and building footprints. A branch attention module is designed to assign weights to the dual-branch features, and a multiscale feature fusion module is introduced to extract and combine multiscale features through modal interaction. Experiments in Beijing and Chengdu validate the effectiveness of the proposed framework, with overall accuracies of 90.04% and 92.07% and kappa coefficients of 0.881 and 0.895, respectively. This study provides empirical evidence to support accurate urban planning and further promote sustainable urban development. The source code is available at: https://github.com/sssuchen/MMFF.
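The abstract does not detail the branch attention module, so the following is only a minimal NumPy sketch of the general idea it names: each branch produces a feature vector, each branch is scored, and a softmax over the two scores yields weights for fusing the branches. The dimension `d`, the shared scoring projection `W`, and the random features are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Assumed common feature size after each branch's projection head.
d = 16
visual_feat = rng.standard_normal((1, d))  # imagery branch (assumed features)
socio_feat = rng.standard_normal((1, d))   # POI / building-footprint branch (assumed)

# Branch attention sketch: score each branch with a shared projection,
# then softmax over the two branch scores to get fusion weights.
W = rng.standard_normal((d, 1)) * 0.1      # hypothetical scoring weights
scores = np.concatenate([visual_feat @ W, socio_feat @ W], axis=1)  # shape (1, 2)
weights = softmax(scores, axis=1)          # one weight per branch, summing to 1

# Weighted sum of the two branch features.
fused = weights[:, 0:1] * visual_feat + weights[:, 1:2] * socio_feat
```

In practice the weights would be learned end-to-end; the point of the sketch is only that fusion reduces to a convex combination of the two branch features.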
Keywords
Urban scene understanding, Urban function, Multimodal data, Remote sensing