BMFNet: Bifurcated multi-modal fusion network for RGB-D salient object detection

Chenwang Sun, Qing Zhang, Chenyu Zhuang, Mingqian Zhang

Image and Vision Computing (2024)

Abstract
Although deep learning-based RGB-D salient object detection methods have achieved impressive results in recent years, some issues remain to be addressed, including multi-modal fusion and multi-level aggregation. In this paper, we propose a bifurcated multi-modal fusion network (BMFNet) to address these two issues cooperatively. First, we design a multi-modal feature interaction (MFI) module to fully capture the complementary information between the RGB and depth features by leveraging channel attention and spatial attention. Second, unlike the widely used layer-by-layer progressive fusion, we adopt a bifurcated fusion strategy for all the multi-level unimodal and cross-modal features to effectively reduce the gaps between features at different levels. For intra-group feature aggregation, a multi-modal feature fusion (MFF) module is designed to integrate the intra-group multi-modal features and produce a low-level or high-level saliency feature. For inter-group aggregation, a multi-scale feature learning (MFL) module is introduced to exploit the contextual interactions between different scales and boost fusion performance. Experimental results on five public RGB-D datasets demonstrate the effectiveness and superiority of our proposed network. The code and prediction maps will be available at https://github.com/ZhangQing0329/BMFNet
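The abstract says the MFI module combines channel attention and spatial attention to exchange information between RGB and depth features, but gives no equations. The following is only a rough, parameter-free sketch of that general idea and not the authors' MFI module: the function names (`channel_attention`, `spatial_attention`, `cross_modal_interaction`), the use of pooling followed by a sigmoid in place of learned layers, the residual connections, and the additive fusion are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function, mapping values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat):
    """Per-channel gate of shape (C, 1, 1) for a feature map of shape (C, H, W).

    Assumption: global average pooling followed by a sigmoid. Real modules
    usually insert learned layers between the pooling and the sigmoid.
    """
    pooled = feat.mean(axis=(1, 2))           # (C,)
    return sigmoid(pooled)[:, None, None]     # (C, 1, 1)


def spatial_attention(feat):
    """Per-location gate of shape (1, H, W) for a feature map of shape (C, H, W).

    Assumption: channel-wise average and max maps are summed and squashed.
    Real modules usually apply a learned convolution to these maps.
    """
    avg_map = feat.mean(axis=0)               # (H, W)
    max_map = feat.max(axis=0)                # (H, W)
    return sigmoid(avg_map + max_map)[None, :, :]


def cross_modal_interaction(rgb, depth):
    """Hypothetical cross-modal exchange: each modality is re-weighted by
    attention computed from the other one, with a residual connection."""
    rgb_out = rgb + rgb * channel_attention(depth) * spatial_attention(depth)
    depth_out = depth + depth * channel_attention(rgb) * spatial_attention(rgb)
    fused = rgb_out + depth_out               # simple additive fusion (assumption)
    return rgb_out, depth_out, fused


# Toy features standing in for one backbone level: 8 channels, 16x16 resolution.
rng = np.random.default_rng(0)
rgb = rng.standard_normal((8, 16, 16))
depth = rng.standard_normal((8, 16, 16))

rgb_out, depth_out, fused = cross_modal_interaction(rgb, depth)
print(fused.shape)
```

In this sketch the gates broadcast over the feature map, so each modality keeps its original shape. A trainable version would replace the pooling-plus-sigmoid gates with learned layers.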
Keywords
RGB-D salient object detection, Cross-modal fusion, Multi-modal integration, Multi-level aggregation