Contrastive Mean Teacher for Intra-camera Supervised Person Re-Identification

Xun Gong, Xuan Tan, Yang Xiang

IEEE Transactions on Circuits and Systems for Video Technology (2024)

Intra-camera supervised (ICS) person re-identification (Re-ID) assumes that identity labels are annotated independently within each camera, so no inter-camera association between identities is available. Recently, several ICS methods have achieved strong results by training in two stages: intra-camera learning followed by inter-camera learning. However, in the intra-camera learning stage, these methods focus only on pedestrian features within each camera, which increases the feature variance of the same person across different cameras. In the inter-camera learning stage, lighting variations and background shifts introduce significant noise into the pseudo-labels generated from feature similarity, and unassociated outlier samples are not fully utilized. To address these issues, we propose a Contrastive Mean Teacher (CMT) framework that combines the Mean Teacher paradigm with contrastive learning. Specifically, by conducting intra-camera and inter-camera learning simultaneously, we fully leverage both the predefined intra-camera labels and the inter-camera associated labels, which allows the model to learn pedestrian features effectively across cameras. Moreover, the teacher model provides more stable predictions, which helps establish better inter-camera associations and improves the model’s generalization. Finally, we design a background filtering module that uses attention mechanisms to guide instance normalization, further reducing the variation in identity features caused by lighting and background changes. We validate our method on three large-scale person re-identification datasets, and the results show that it outperforms all existing ICS methods. In particular, it achieves state-of-the-art accuracy of 88.9% mAP and 95.8% Rank-1 on the challenging Market1501 benchmark with ResNet-50, even surpassing state-of-the-art fully supervised methods.
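As a rough illustration of the Mean Teacher paradigm mentioned above, the teacher in such frameworks is commonly maintained as an exponential moving average (EMA) of the student's weights, which is what tends to make its predictions more stable. The sketch below shows only that generic update and is not taken from the paper: the function name `ema_update`, the dictionary-of-lists weight layout, and the momentum value are all assumptions made for illustration.

```python
def ema_update(teacher, student, momentum=0.999):
    """Return new teacher weights as an EMA of the student weights.

    `teacher` and `student` map parameter names to lists of floats.
    The layout and the default momentum are illustrative assumptions,
    not values reported by the paper.
    """
    if teacher.keys() != student.keys():
        raise ValueError("teacher and student must share parameter names")
    updated = {}
    for name, t_values in teacher.items():
        s_values = student[name]
        if len(t_values) != len(s_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # teacher <- momentum * teacher + (1 - momentum) * student
        updated[name] = [
            momentum * t + (1.0 - momentum) * s
            for t, s in zip(t_values, s_values)
        ]
    return updated
```

In a training loop of this kind, the student would be updated by gradient descent on each batch and `ema_update` would be called afterwards, so the teacher changes slowly and smooths out noisy student updates.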
Keywords
Intra-camera supervision, Mean Teacher, Contrastive learning, Person re-identification