Semantic scene segmentation for indoor autonomous vision systems: leveraging an enhanced and efficient U-NET architecture

Thu A. N. Le, Nghi V. Nguyen, Nguyen T. Nguyen, Nhi Q. P. Le, Nam N. N. Nguyen, Hoang N. Tran

Multimedia Tools and Applications (2024)

Abstract
Advancements in indoor autonomous vision systems (IAVSs) underscore the need to bridge the gap between their capabilities and human perception of real-world scenes. This paper introduces a novel semantic segmentation framework called EADFL-UNet, based on the U-Net architecture. It incorporates EfficientNetB3 as the encoder for improved feature extraction and employs a super attention block, integrating attention gate (AG) and spatial and channel squeeze-and-excitation (scSE) mechanisms, to refine segmentation by prioritizing relevant regions and features. Additionally, a modified loss function merging Dice loss (DL) and Class-Balanced Weights Focal loss (CBW-FL) addresses data imbalance, especially in liver segmentation and indoor environments. Evaluations on the NYUv2 dataset and augmented datasets compared the performance of EADFL-UNet against various U-Net encoder configurations, demonstrating its superiority. Further analysis focused on integrating attention blocks at different stages of the U-Net architecture, revealing significant improvements in segmentation accuracy. Even without depth information, the proposed method outperforms conventional structures by 10
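To make the loss design concrete, the following is a minimal PyTorch sketch of a combined Dice + class-balanced focal objective in the spirit of the DL + CBW-FL loss described above. It is not the authors' implementation: the class-balanced weights follow the common "effective number of samples" formulation, and the mixing factor alpha, focusing parameter gamma, and function names (class_balanced_weights, eadfl_loss) are assumptions made for illustration.

```python
# Sketch of a Dice + class-balanced focal loss, assumed to approximate the
# DL + CBW-FL combination in the abstract; hyperparameters are placeholders.
import torch
import torch.nn.functional as F


def class_balanced_weights(samples_per_class: torch.Tensor, beta: float = 0.999) -> torch.Tensor:
    """Per-class weights proportional to (1 - beta) / (1 - beta^n_c), normalized to sum to C."""
    effective_num = 1.0 - torch.pow(beta, samples_per_class.float())
    weights = (1.0 - beta) / effective_num
    return weights / weights.sum() * len(samples_per_class)


def dice_loss(logits: torch.Tensor, targets: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Multi-class soft Dice loss. logits: (N, C, H, W); targets: (N, H, W) class indices."""
    num_classes = logits.shape[1]
    probs = torch.softmax(logits, dim=1)
    one_hot = F.one_hot(targets, num_classes).permute(0, 3, 1, 2).float()
    dims = (0, 2, 3)
    intersection = (probs * one_hot).sum(dims)
    cardinality = probs.sum(dims) + one_hot.sum(dims)
    dice = (2.0 * intersection + eps) / (cardinality + eps)
    return 1.0 - dice.mean()


def cbw_focal_loss(logits, targets, weights, gamma: float = 2.0) -> torch.Tensor:
    """Focal loss whose cross-entropy term is weighted by class-balanced weights."""
    log_probs = F.log_softmax(logits, dim=1)
    probs = log_probs.exp()
    ce = F.nll_loss(log_probs, targets, weight=weights, reduction="none")   # (N, H, W)
    pt = probs.gather(1, targets.unsqueeze(1)).squeeze(1)                    # prob of true class
    return ((1.0 - pt) ** gamma * ce).mean()


def eadfl_loss(logits, targets, samples_per_class, alpha: float = 0.5) -> torch.Tensor:
    """Hypothetical combination: alpha * Dice + (1 - alpha) * class-balanced focal."""
    weights = class_balanced_weights(samples_per_class).to(logits.device)
    return alpha * dice_loss(logits, targets) + (1.0 - alpha) * cbw_focal_loss(logits, targets, weights)
```

In practice, samples_per_class would be the pixel counts of each semantic class in the training split (e.g., of NYUv2), so rare classes receive larger weights in the focal term while the Dice term guards region overlap.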
Keywords
Segmentation, Attention mechanisms, Convolutional neural networks (CNN), EfficientNet