Multi-level Part-aware Feature Disentangling for Text-based Person Search

2023 IEEE International Conference on Multimedia and Expo (ICME 2023)

Abstract
Text-based person search is an important sub-task of cross-modality image retrieval that aims to retrieve images of a person of interest given a textual description. The large information gap between the image and text modalities makes this task challenging. Recent methods adopt locally aligned feature learning strategies, but they do not sufficiently mine fine-grained local information. Accordingly, we explore a Multi-level Part-aware Feature Disentangling (MPFD) framework that extracts visual and textual representations more fully from multiple perspectives. Specifically, we introduce a Textual Part-aware Matching (TPM) module into the existing baseline to disentangle local features that carry detailed information from both visual and textual part-aware aspects. In addition, to fuse multiple local features and improve the discriminability of global features, we propose a Multi-level Feature Integration (MFI) module that is capable of perceiving the relations between features. We carry out extensive experiments on the CUHK-PEDES and ICFG-PEDES datasets to verify the proposed framework, and the results demonstrate that MPFD performs favorably against state-of-the-art methods.
Keywords
Image retrieval, Cross-modality, Representation learning, Person search