Semantic scene segmentation for indoor autonomous vision systems: leveraging an enhanced and efficient U-NET architecture
Multimedia Tools and Applications (2024)
Abstract
Advancements in indoor autonomous vision systems (IAVSs) underscore the need to bridge the gap between their capabilities and human perception of real-world scenes. This paper introduces a novel semantic segmentation framework, EADFL-UNet, based on the U-Net architecture. It uses EfficientNetB3 as the encoder for improved feature extraction and adds a super attention block, which combines an attention gate (AG) with spatial and channel squeeze-and-excitation (scSE) mechanisms, to refine segmentation by prioritizing relevant regions and features. In addition, a modified loss function that merges Dice loss (DL) with Class-Balanced Weights Focal loss (CBW-FL) addresses data imbalance, especially in liver segmentation and indoor environments. Experiments on the NYUv2 dataset and augmented datasets compared EADFL-UNet with various U-Net encoder configurations and demonstrated its superiority. Further analysis examined the integration of attention blocks at different stages of the U-Net architecture and revealed significant improvements in segmentation accuracy. The proposed method, even without depth information, outperforms conventional structures by 10
Keywords
Segmentation, Attention mechanisms, Convolutional neural networks (CNN), EfficientNet
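
The abstract describes a loss that merges Dice loss with a class-balanced focal loss to counter class imbalance, but it does not give the formula. The following is a minimal NumPy sketch of one way such a combination could look, not the authors' implementation. The weighting scheme (the "effective number of samples" heuristic), the mixing factor `lam`, the focusing parameter `gamma`, and all function names are assumptions introduced for illustration.

```python
import numpy as np


def class_balanced_weights(class_counts, beta=0.999):
    """Per-class weights from pixel counts (assumed 'effective number' heuristic)."""
    counts = np.asarray(class_counts, dtype=np.float64)
    effective_num = 1.0 - np.power(beta, counts)
    weights = (1.0 - beta) / np.maximum(effective_num, 1e-12)
    # Rescale so the weights sum to the number of classes (mean weight of 1).
    return weights / weights.sum() * len(counts)


def dice_loss(probs, onehot, eps=1e-6):
    """Soft Dice loss averaged over classes; inputs have shape (N, H, W, C)."""
    axes = (0, 1, 2)
    intersection = np.sum(probs * onehot, axis=axes)
    denom = np.sum(probs, axis=axes) + np.sum(onehot, axis=axes)
    dice = (2.0 * intersection + eps) / (denom + eps)
    return float(1.0 - dice.mean())


def cb_focal_loss(probs, onehot, weights, gamma=2.0, eps=1e-7):
    """Focal loss with a per-class weight applied to each pixel's true class."""
    p_true = np.clip(np.sum(probs * onehot, axis=-1), eps, 1.0)
    w_true = np.sum(weights * onehot, axis=-1)
    per_pixel = -w_true * (1.0 - p_true) ** gamma * np.log(p_true)
    return float(per_pixel.mean())


def combined_loss(probs, onehot, weights, lam=0.5, gamma=2.0):
    """Hypothetical mix: lam * Dice + (1 - lam) * class-balanced focal."""
    return lam * dice_loss(probs, onehot) + (1.0 - lam) * cb_focal_loss(
        probs, onehot, weights, gamma=gamma
    )


if __name__ == "__main__":
    labels = np.array([[[0, 1], [2, 1]]])      # one 2x2 label map, 3 classes
    onehot = np.eye(3)[labels]                 # shape (1, 2, 2, 3)
    uniform = np.full(onehot.shape, 1.0 / 3)   # an uninformative prediction
    w = class_balanced_weights([1, 2, 1])
    print("uniform prediction loss:", combined_loss(uniform, onehot, w))
    print("perfect prediction loss:", combined_loss(onehot, onehot, w))
```

In this sketch the Dice term rewards region overlap per class, while the focal term down-weights easy pixels and up-weights rare classes. How the paper actually balances the two terms is not stated in the abstract.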