Chrome Extension
WeChat Mini Program
Use on ChatGLM

Multi-granularity Separation Network for Text-Based Person Retrieval with Bidirectional Refinement Regularization

ICMR '23: Proceedings of the 2023 ACM International Conference on Multimedia Retrieval(2023)

Cited 0|Views31
No score
Abstract
Text-based person retrieval is one of the fundamental tasks in the field of computer vision, which aims to retrieve the most relevant pedestrian image from all the candidates according to textual descriptions. Such a cross-modal retrieval task could be challenging since it requires one to properly select distinguishing clues and perform cross-modal alignments. To achieve cross-modal alignments, most previous works focus on different inter-modal constraints while overlooking the influence of intra-modal noise, yielding sub-optimal retrieved results in certain cases. To this end, we propose a novel framework termed Multi-granularity Separation Network with Bidirectional Refinement Regularization (MSN-BRR) to tackle the problem. The framework consists of two components: (1) Multi-granularity Separation Network, which extracts the multi-grained discriminative textual and visual representations at local and global semantic levels. (2) Bidirectional Refinement Regularization, which alleviates the influence of intra-modal noise and facilitates the proper alignments between the visual and textual representations. Extensive experiments on two widely used benchmarks, i.e., CUHK-PEDES and ICFG-PEDES show that our MSN-BRR method outperforms current state-of-the-art methods.
More
Translated text
Key words
Refinement Regularization, Text-Based, Multi-granularity
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined