EISNet: A Multi-Modal Fusion Network for Semantic Segmentation with Events and Images

IEEE Transactions on Multimedia (2024)

Abstract
Bio-inspired event cameras record a scene as sparse and asynchronous “events” by detecting per-pixel brightness changes. Such cameras show great potential in challenging scene understanding tasks, benefiting from the imaging advantages of high dynamic range and high temporal resolution. Considering the complementarity between event and standard cameras, we propose a multi-modal fusion network (EISNet) to improve semantic segmentation performance. The key challenges of this topic lie in (i) how to encode event data to represent accurate scene information and (ii) how to fuse multi-modal complementary features by considering the characteristics of the two modalities. To solve the first challenge, we propose an Activity-Aware Event Integration Module (AEIM) to convert event data into frame-based representations with high-confidence details via scene activity modeling. To tackle the second challenge, we introduce the Modality Recalibration and Fusion Module (MRFM) to recalibrate modality-specific representations and then aggregate multi-modal features at multiple stages. MRFM learns to generate modality-oriented masks to guide the merging of complementary features, achieving adaptive fusion. Based on these two core designs, our proposed EISNet adopts an encoder-decoder transformer architecture for accurate semantic segmentation using events and images. Experimental results show that our model outperforms state-of-the-art methods by a large margin on event-based semantic segmentation datasets. The code is publicly available at https://github.com/bochenxie/EISNet.
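The two ideas named in the abstract, turning an event stream into a frame and merging two modalities under a learned mask, can be illustrated with a minimal NumPy sketch. This is not the authors' AEIM or MRFM: the polarity-count frame, the function names `events_to_frame` and `gated_fusion`, and the linear mask projection with weights `w_img` and `w_evt` are all assumptions chosen for illustration. The actual modules are in the linked repository.

```python
import numpy as np

def events_to_frame(xs, ys, ps, height, width):
    """Accumulate events into a 2-channel count frame (illustrative only).

    xs, ys: integer pixel coordinates of each event.
    ps: polarity of each event (> 0 positive, <= 0 negative).
    """
    frame = np.zeros((2, height, width), dtype=np.float32)
    # Channel 0 counts positive events, channel 1 counts negative events.
    chan = (np.asarray(ps) <= 0).astype(np.int64)
    # np.add.at handles repeated indices, so events at one pixel accumulate.
    np.add.at(frame, (chan, np.asarray(ys), np.asarray(xs)), 1.0)
    return frame

def gated_fusion(img_feat, evt_feat, w_img, w_evt):
    """Blend two (C, H, W) feature maps with a per-pixel sigmoid mask.

    w_img, w_evt: hypothetical (C,) projection weights that would be
    learned in a real network; here they are plain arrays.
    """
    # Project each modality over the channel axis to get (H, W) mask logits.
    logits = np.tensordot(w_img, img_feat, axes=1) + np.tensordot(w_evt, evt_feat, axes=1)
    mask = 1.0 / (1.0 + np.exp(-logits))
    # The mask weights the image features, its complement the event features.
    return mask * img_feat + (1.0 - mask) * evt_feat

# Tiny usage example with three events on a 2x2 sensor.
frame = events_to_frame([0, 1, 1], [0, 0, 0], [1, -1, -1], height=2, width=2)
fused = gated_fusion(np.ones((2, 2, 2)), frame, np.zeros(2), np.zeros(2))
```

With all-zero projection weights the mask is 0.5 everywhere, so the output is the plain average of the two inputs. A trained network would instead learn weights that favor whichever modality is more reliable at each pixel.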
Key words
Event camera, multi-modal fusion, attention mechanism, semantic segmentation