DenseBert4Ret: Deep bi-modal for image retrieval

Zafran Khan, Bushra Latif, Joonmo Kim, Hong Kook Kim, Moongu Jeon

Information Sciences (2022)

Abstract

•Bi-modal content-based image retrieval (CBIR) that incorporates the visual features of images and the semantics of text.
•DenseNet is used to generate image features for the database images and for the query image supplied by the user.
•BERT is used to generate the text embeddings.
•A deep learning based technique is used to obtain the joint representation of the image and text modalities.
•The proposed model is tested on a real-world dataset, and ablation studies are conducted.
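The retrieval step implied by the highlights above can be sketched roughly as follows. This is only an illustrative sketch, not the authors' model: the short vectors stand in for DenseNet image features and BERT text embeddings, and the fusion (concatenating L2-normalized vectors) is an assumed placeholder for the learned joint representation described in the paper. The names `fuse`, `retrieve`, and the toy database entries are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(image_feat, text_feat):
    """Assumed fusion: concatenate the normalized image and text vectors.

    The paper learns the joint representation with a deep network;
    concatenation is used here only to keep the sketch self-contained.
    """
    return l2_normalize(l2_normalize(image_feat) + l2_normalize(text_feat))


def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query, database, top_k=3):
    """Rank database items by similarity of their fused vector to the query."""
    q = fuse(*query)
    scored = [(name, cosine(q, fuse(img, txt)))
              for name, (img, txt) in database.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for (image feature, text embedding) pairs.
database = {
    "red_dress": ([0.9, 0.1, 0.0], [0.8, 0.2]),
    "blue_dress": ([0.1, 0.9, 0.0], [0.7, 0.3]),
    "green_shoe": ([0.0, 0.1, 0.9], [0.1, 0.9]),
}
query = ([0.85, 0.15, 0.0], [0.75, 0.25])
results = retrieve(query, database, top_k=2)
for name, score in results:
    print(name, round(score, 3))
```

In a real system the placeholder vectors would be replaced by features from a pretrained DenseNet and embeddings from a pretrained BERT, and the fusion function by the trained joint-representation network.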

Keywords

Text-based/Content-based image retrieval, Multi-modal image retrieval, DenseNet, BERT, Image and text feature extraction, Joint representation of multi-modal features, Deep learning, Computer vision, NLP
