
Visible-Infrared Person Re-Identification via Cross-Modality Interaction Transformer

IEEE Transactions on Multimedia (2023)

Abstract
Visible-infrared person re-identification (VI Re-ID) aims to match images of the same person captured by visible and infrared cameras. Transformer architectures have been applied successfully to VI Re-ID. However, previous Transformer-based methods were designed mainly to capture global content within a single modality, and could not simultaneously perceive semantic information shared between the two modalities from a global perspective. To address this problem, we propose a novel framework, the cross-modality interaction Transformer (CMIT). CMIT models spatial and sequential features in a way that captures long-range dependencies, and it explicitly improves feature discriminability by exchanging information across modalities, which helps produce modality-invariant representations. Specifically, CMIT uses a cross-modality attention mechanism to enrich the representation of each patch token through interaction with the patch tokens of the other modality, and it aggregates the local features of a CNN with the global information of a Transformer to mine salient feature representations. Furthermore, we propose a modality-discriminative (MD) loss that learns the latent consistency between modalities, encouraging intra-modality compactness within each class and inter-modality separation between classes. Extensive experiments on two benchmarks demonstrate that our approach outperforms state-of-the-art methods.
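
The cross-modality attention idea described in the abstract can be illustrated with a generic cross-attention sketch: each visible patch token forms a query against the infrared patch tokens, and vice versa, so every token is enriched with information from the other modality. This is not the authors' implementation. It is a minimal single-head NumPy sketch under assumptions: the function names (`cross_attention`, `cross_modality_interaction`, `init_params`), the residual connection, the scaled dot-product form, and the random projection weights are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention (illustrative only).

    queries: (n_q, dim) patch tokens of one modality
    context: (n_c, dim) patch tokens of the other modality
    Returns the enriched query tokens and the attention weights.
    """
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    # Assumed residual connection: keep the original token, add the
    # information gathered from the other modality.
    return queries + weights @ v, weights


def init_params(dim, rng):
    """Random projection matrices, one (W_q, W_k, W_v) triple per direction."""
    def triple():
        return tuple(rng.normal(scale=dim ** -0.5, size=(dim, dim)) for _ in range(3))
    return {"vis_from_ir": triple(), "ir_from_vis": triple()}


def cross_modality_interaction(vis_tokens, ir_tokens, params):
    """Let each modality attend to the other (hypothetical helper)."""
    vis_out, vis_w = cross_attention(vis_tokens, ir_tokens, *params["vis_from_ir"])
    ir_out, ir_w = cross_attention(ir_tokens, vis_tokens, *params["ir_from_vis"])
    return vis_out, ir_out, vis_w, ir_w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 8
    params = init_params(dim, rng)
    vis = rng.normal(size=(6, dim))  # 6 visible patch tokens
    ir = rng.normal(size=(4, dim))   # 4 infrared patch tokens
    vis_out, ir_out, vis_w, ir_w = cross_modality_interaction(vis, ir, params)
    print(vis_out.shape, ir_out.shape, vis_w.shape, ir_w.shape)
```

In this sketch the two directions use separate projection weights, and the token counts of the two modalities need not match. Whether CMIT shares weights, uses multiple heads, or combines the result with CNN features in a particular way is not stated in the abstract.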
Key words
Cross-modality attention mechanism, image representations, visible-infrared person re-identification, visual transformer