Context-aware relation enhancement and similarity reasoning for image-text retrieval

IET Computer Vision (2024)

Abstract
Image-text retrieval is a fundamental yet challenging task that aims to bridge the semantic gap between heterogeneous data and achieve precise measurement of semantic similarity. Fine-grained alignment between cross-modal features plays a key role in many successful methods. Nevertheless, existing methods cannot effectively exploit intra-modal information to enhance feature representation, and they lack powerful similarity reasoning for producing a precise similarity score. To tackle these issues, a context-aware Relation Enhancement and Similarity Reasoning model, called RESR, is proposed, which conducts both intra-modal relation enhancement and inter-modal similarity reasoning while considering global-context information. For intra-modal relation enhancement, a novel context-aware graph convolutional network is introduced to enhance local feature representations by exploiting relation and global-context information. For inter-modal similarity reasoning, local and global similarity features are extracted through bidirectional alignment of image and text, and similarity reasoning is performed over these multi-granularity similarity features. Finally, the refined local and global similarity features are adaptively fused to produce a precise similarity score. Experimental results show that the model outperforms several state-of-the-art approaches, achieving average improvements of 2.5% and 6.3% in R@sum on the Flickr30K and MS-COCO datasets.
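The abstract's two core operations can be illustrated with a minimal sketch: a context-aware graph-convolution step that refines local features using their pairwise relations gated by a global-context vector, and an adaptive fusion of local and global similarity scores. This is an illustrative reconstruction from the abstract alone; the relation matrix, sigmoid gate, and the fusion weight `alpha` are assumptions, not the paper's exact formulation.

```python
import numpy as np

def context_aware_gcn(features, global_ctx, temperature=1.0):
    """Hedged sketch of one context-aware graph-convolution step.

    features:   (n, d) local features (image regions or words)
    global_ctx: (d,)   global-context vector (e.g. mean-pooled features)
    """
    # Relation matrix from pairwise dot-product affinity, softmax per row
    sim = features @ features.T / temperature            # (n, n)
    rel = np.exp(sim - sim.max(axis=1, keepdims=True))
    rel /= rel.sum(axis=1, keepdims=True)

    # Gate each node by its agreement with the global context (assumption:
    # a sigmoid of the dot product stands in for the paper's context term)
    gate = 1.0 / (1.0 + np.exp(-(features @ global_ctx)))  # (n,)

    # Residual update: aggregate relation-weighted neighbours, scaled by gate
    return features + (rel @ features) * gate[:, None]

def fuse_similarities(local_sim, global_sim, alpha=0.5):
    """Adaptive fusion of local and global similarity scores.
    `alpha` is a placeholder for a learned fusion weight."""
    return alpha * local_sim + (1.0 - alpha) * global_sim
```

In the full model the fusion weight would be predicted from the similarity features themselves rather than fixed; the sketch only shows the data flow, not the learned components.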
Keywords
image retrieval,multimedia systems