Robust Depth Estimation Based on Parallax Attention for Aerial Scene Perception

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS (2024)

Abstract
Given precalibrated image pairs, stereo matching aims to infer scene depth in real time, which has important research value for high-precision 3-D reconstruction of the Earth's surface, autonomous driving, and unmanned aerial vehicle (UAV) navigation. Cost volume-based stereo matching methods construct a cascaded cost volume in a coarse-to-fine manner and apply 3-D convolution to capture feature-matching correspondences and infer the disparity map, which achieves competitive performance. However, existing methods have difficulty handling jitter regions where the disparity changes, and direct disparity regression easily leads to overfitting during cost volume regularization. To alleviate these two problems, this work proposes an end-to-end disparity estimation network based on the Transformer. Its specific improvements are as follows. 1) A Transformer-based cross-view feature interaction module is introduced to exchange global context information between the two views. 2) A parallax attention mechanism is designed to impose global geometric constraints along the epipolar line, improving the reliability of feature matching. 3) Focal loss is applied to train the disparity classification model, emphasizing one-hot supervision in ambiguous regions. Comprehensive experiments on the public Sceneflow, KITTI2015, and ETH3D datasets and the aerial WHU dataset validate that the proposed work can effectively enhance the performance of disparity estimation.
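To make improvement 3) concrete, the sketch below applies the standard focal loss to a single pixel whose disparity is treated as a class label over discrete disparity bins. It is a minimal illustration under stated assumptions, not the formulation used in the paper: the abstract does not give the loss equation, so the form -(1 - p_t)^gamma * log(p_t), the default gamma = 2, the four-bin logits, and the names `softmax` and `focal_loss` are all hypothetical choices made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of per-bin scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target_bin, gamma=2.0):
    """Focal loss for one pixel with a one-hot disparity target.

    Assumed form (standard focal loss, not confirmed by the abstract):
        FL = -(1 - p_t) ** gamma * log(p_t)
    where p_t is the predicted probability of the ground-truth bin.
    With gamma = 0 this reduces to ordinary cross-entropy.
    """
    p_t = softmax(logits)[target_bin]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Hypothetical logits over 4 disparity bins; the true bin is index 1.
confident = [0.0, 8.0, 0.0, 0.0]  # sharp peak at the true bin
ambiguous = [0.0, 1.0, 0.9, 0.0]  # two bins compete for the match

loss_confident = focal_loss(confident, target_bin=1)
loss_ambiguous = focal_loss(ambiguous, target_bin=1)
ce_ambiguous = focal_loss(ambiguous, target_bin=1, gamma=0.0)

print(loss_confident, loss_ambiguous, ce_ambiguous)
```

The factor (1 - p_t)^gamma shrinks the loss of pixels that are already matched with high confidence, so the gradient is dominated by ambiguous pixels where several disparity bins compete. This is one plausible reading of "emphasize one-hot supervision in ambiguous regions".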
Keywords
Costs,Feature extraction,Estimation,Transformers,Task analysis,Convolution,Training,Disparity estimation,parallax attention,stereo matching,transformer