Resolution invariant urban scene classification using Multiview learning paradigm

Digital Signal Processing (2023)

Abstract
Urban scene classification is an interesting area in computer vision. The task involves classifying a scene from a pair of aerial-view and ground-view images. Existing approaches include single-view and multiview methods, and multiview approaches have been shown to be more robust than single-view ones. However, most existing multiview approaches neglect the disparity in resolution between the two views: aerial-view images are captured with sophisticated high-resolution remote sensing devices, while ground-view images are captured from closer perspectives at lower resolution. This paper proposes a multiview scene classification (MuSC) model that accounts for the resolution disparity between the views. MuSC introduces a Fourier convolution network (FCN) that is robust to resolution variation in the cross-view images. The FCN is designed to extract local features (in the spatial domain) and global features (in the spectral domain). MuSC has a two-stage classifier. The first stage trains a discriminative view-specific network for each view and classifies each view separately; in addition, the outputs of the view-specific networks are projected into a unified subspace, where mutual agreement between them is incentivized through contrastive learning. The second stage integrates the predictions from the view-specific networks and trains a unified classifier for the final prediction. This integration encourages cross-view complementarity. MuSC is evaluated on two datasets, AiRound and CV-BrCT, in several experiments with different settings. The results demonstrate that MuSC outperforms existing state-of-the-art models.
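
The abstract does not give the FCN architecture, so the following is only an illustrative sketch of the general idea of pairing a local (spatial-domain) branch with a global (spectral-domain) branch. All names here (`local_branch`, `global_branch`, `fourier_block`, `spectral_weight`) are hypothetical, the weights are random rather than learned, and summing the two branches is an assumption, not the authors' design.

```python
import numpy as np

def local_branch(x, kernel):
    """Spatial-domain branch: zero-padded 'same' 2-D cross-correlation with a small kernel."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(x, ((ph, ph), (pw, pw)))  # zero padding by default
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def global_branch(x, spectral_weight):
    """Spectral-domain branch: real 2-D FFT, per-frequency scaling, inverse FFT."""
    spectrum = np.fft.rfft2(x)
    return np.fft.irfft2(spectrum * spectral_weight, s=x.shape)

def fourier_block(x, kernel, spectral_weight):
    """Hypothetical fusion: sum of local and global responses (an assumption)."""
    return local_branch(x, kernel) + global_branch(x, spectral_weight)

rng = np.random.default_rng(0)
H, W = 8, 8
kernel = rng.normal(size=(3, 3))
# rfft2 keeps W // 2 + 1 frequency columns for a real-valued input of width W
spectral_weight = (rng.normal(size=(H, W // 2 + 1))
                   + 1j * rng.normal(size=(H, W // 2 + 1)))

# A single-pixel impulse shows how far each branch spreads information.
impulse = np.zeros((H, W))
impulse[0, 0] = 1.0

local_out = local_branch(impulse, kernel)
global_out = global_branch(impulse, spectral_weight)
block_out = fourier_block(impulse, kernel, spectral_weight)

local_reach = np.count_nonzero(np.abs(local_out) > 1e-12)
global_reach = np.count_nonzero(np.abs(global_out) > 1e-12)
print("pixels touched by local branch:", local_reach)
print("pixels touched by global branch:", global_reach)
```

In this toy setting the spatial kernel only influences pixels adjacent to the impulse, whereas the per-frequency scaling can influence every pixel of the feature map. That image-wide reach of a single spectral operation is the sense in which spectral-domain features are "global".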
Keywords
Multiview learning, Scene classification, Aerial and ground data, Fourier convolution