HeightFormer: A Multilevel Interaction and Image-Adaptive Classification-Regression Network for Monocular Height Estimation with Aerial Images

Zhan Chen, Yidan Zhang, Xiyu Qi, Yongqiang Mao, Xin Zhou, Lei Wang, Yunping Ge

Remote Sensing (2024)

Abstract
Height estimation has long been a pivotal topic in measurement and remote sensing, and monocular height estimation offers wide-ranging data sources and convenient deployment. This paper addresses two challenges in existing monocular height estimation methods: the difficulty of simultaneously achieving high-quality instance-level height estimation and edge reconstruction, and high computational complexity. It presents a comprehensive solution for monocular height estimation in remote sensing, termed HeightFormer, which combines multilevel interactions with image-adaptive classification-regression. HeightFormer features the Multilevel Interaction Backbone (MIB) and the Image-adaptive Classification-regression Height Generator (ICG). The MIB supplements the fixed sample grid in the CNN of the conventional backbone network with tokens of different interaction ranges. It is complemented by a pixel-, patch-, and feature map-level hierarchical interaction mechanism, designed to relay spatial geometry information across different scales and to introduce a global receptive field that enhances the quality of instance-level height estimation. The ICG dynamically generates a height partition for each image and reframes the traditional regression task as a coarse-to-fine classification-regression refinement, which significantly mitigates the innate ill-posedness of the problem and drastically improves edge sharpness. Finally, experiments on the Vaihingen and Potsdam datasets show that the proposed method surpasses existing techniques.
Keywords
monocular height estimation,multilevel interaction,local attention
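The abstract describes the ICG as generating a height partition per image and recasting regression as classification-regression, but it does not give formulas. The sketch below is therefore only one plausible reading, in the style of adaptive-bin depth estimators: per-image bin widths define bin centers, each pixel is softly classified over the bins, and the height is the probability-weighted average of the centers. The function names, the softmax normalization, and the height range arguments are assumptions made for illustration, and the coarse-to-fine refinement is not modeled.

```python
import numpy as np

def adaptive_bin_centers(width_logits, h_min, h_max):
    """Hypothetical helper: map per-image width logits to bin centers in [h_min, h_max]."""
    w = np.exp(width_logits - width_logits.max())
    w = w / w.sum()  # normalized bin widths that sum to 1
    edges = h_min + (h_max - h_min) * np.concatenate(([0.0], np.cumsum(w)))
    return 0.5 * (edges[:-1] + edges[1:])  # midpoint of each bin

def heights_from_bins(pixel_logits, centers):
    """Hypothetical helper: soft classification over bins, then expected bin center.

    pixel_logits has shape (K, H, W), centers has shape (K,); returns (H, W).
    """
    z = pixel_logits - pixel_logits.max(axis=0, keepdims=True)
    p = np.exp(z)
    p /= p.sum(axis=0, keepdims=True)  # per-pixel softmax over the K bins
    return np.tensordot(centers, p, axes=(0, 0))

# Toy example: 4 equal-width bins over an assumed 0 to 8 m height range.
centers = adaptive_bin_centers(np.zeros(4), 0.0, 8.0)
height_map = heights_from_bins(np.zeros((4, 2, 2)), centers)
```

Because the output is a weighted average of image-specific centers, the prediction stays continuous even though each pixel is classified over a discrete partition, which is the usual motivation for this family of formulations.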