SDTFusion: A split-head dense transformer based network for infrared and visible image fusion

Shan Pang, Hongtao Huo, Xiaowen Liu, Bowen Zheng, Jing Li

Infrared Physics & Technology (2024)

Abstract
Most current deep-learning-based image fusion methods rely heavily on convolutional operations for feature extraction. Recently, some Transformer-based image fusion models have emerged. However, most of them design complex attention mechanisms and still rely heavily on convolutions for local feature modeling. To address these limitations, this paper proposes a novel and simple split-head dense Transformer based infrared and visible image fusion network, termed SDTFusion. It consists of three parts: the feature extraction module, the inter-gating fusion module, and the reconstruction module. In particular, the feature extraction module is a pure Transformer network in which an interactive split-head attention mechanism models the uni-modal and cross-modal long-range dependencies and promotes cross-modal information extraction. Dense connections between Transformer blocks facilitate the reuse of feature maps. In the fusion module, the inter-gating mechanism is formulated as the element-wise product of cross-modal information, which preserves prominent infrared brightness and distinct visible details. Moreover, a learnable detail injection module built on a cross-attention mechanism injects fine-grained bi-modal information into multiple layers of the reconstruction module. Extensive experiments on three benchmark datasets show that SDTFusion achieves strong fusion performance compared with nine state-of-the-art methods. In addition, its strong results on semantic segmentation and object detection show the advantage of the framework in promoting downstream visual tasks.
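The abstract describes the inter-gating mechanism only as an element-wise product of cross-modal information and gives no formula. The sketch below is therefore an illustration under stated assumptions, not the authors' implementation: it assumes each modality's features are scaled element-wise by a sigmoid gate computed from the other modality, and that the two gated results are summed. The names `inter_gate`, `ir_feat`, and `vis_feat` are hypothetical, and plain Python lists stand in for feature maps.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real value to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def inter_gate(ir: list[float], vis: list[float]) -> list[float]:
    """Hypothetical inter-gating fusion (assumed form, not from the paper).

    Each infrared feature is multiplied element-wise by a gate derived
    from the visible feature at the same position, and vice versa.
    The two gated values are then summed.
    """
    if len(ir) != len(vis):
        raise ValueError("feature vectors must have the same length")
    return [a * sigmoid(b) + b * sigmoid(a) for a, b in zip(ir, vis)]


# Toy flattened feature vectors standing in for infrared and visible features.
ir_feat = [2.0, 0.0, -1.0]
vis_feat = [0.0, 3.0, -1.0]
fused = inter_gate(ir_feat, vis_feat)
```

In this assumed form the fusion is symmetric in its two inputs, so neither modality is privileged; a real network would typically compute the gates from learned projections rather than from the raw features.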
Keywords
Infrared image, Visible image, Image fusion, Transformer, Inter-gating mechanism