Unsupervised Cross-Modal Retrieval by Coupled Dual Generative Adversarial Networks.

International Conference on Web-Age Information Management (2020)

Abstract
Textual-visual cross-modal retrieval has become a hot research topic in both the computer vision and natural language processing communities. However, existing deep cross-modal hashing methods either rely on large amounts of labeled information or are unable to learn an accurate correlation between different modalities. In this paper, we address the unsupervised cross-modal retrieval problem using a novel framework called coupled dual generative adversarial networks (CDGAN). This framework consists of two cycle networks: a text-to-image-to-text (t2t) network and an image-to-text-to-image (i2i) network. The t2t network learns the relation among an original text, the generated image, and the generated text using the similarity between original and generated image-text pairs, and the i2i network learns the relation among an original image, the generated text, and the generated image. Therefore, two groups of mixed image-text hash codes are learned in this framework. Furthermore, our proposed CDGAN seamlessly couples these two cycle networks with a generative adversarial mechanism so that the hash codes can be optimized simultaneously. Extensive experiments show that our framework can match images and sentences with complex content well, and it achieves state-of-the-art cross-modal retrieval results on two popular benchmark datasets.
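Below is a minimal, illustrative sketch (not the authors' code) of the dual-cycle structure the abstract describes, assuming PyTorch, toy feature dimensions, and a 64-bit hash code; encoder/generator names, dimensions, and losses are assumptions for illustration, and the adversarial discriminators are omitted.

```python
# Illustrative sketch of one cycle (t2t) of a coupled dual-cycle hashing scheme.
# All dimensions and module names are assumptions, not the paper's implementation.
import torch
import torch.nn as nn

HASH_BITS = 64                  # assumed hash code length
IMG_DIM, TXT_DIM = 4096, 300    # assumed image/text feature dimensions

class Encoder(nn.Module):
    """Maps a modality feature to a relaxed (tanh) hash code."""
    def __init__(self, in_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU(),
                                 nn.Linear(512, HASH_BITS), nn.Tanh())
    def forward(self, x):
        return self.net(x)

class Generator(nn.Module):
    """Generates a feature of the other modality from a hash code."""
    def __init__(self, out_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(HASH_BITS, 512), nn.ReLU(),
                                 nn.Linear(512, out_dim))
    def forward(self, h):
        return self.net(h)

enc_txt, enc_img = Encoder(TXT_DIM), Encoder(IMG_DIM)
gen_img, gen_txt = Generator(IMG_DIM), Generator(TXT_DIM)

# t2t cycle: text -> generated image -> reconstructed text
txt = torch.randn(8, TXT_DIM)        # batch of text features
h_t = enc_txt(txt)                   # hash code of the original text
fake_img = gen_img(h_t)              # generated image feature
h_t2i = enc_img(fake_img)            # hash code of the generated image
txt_rec = gen_txt(h_t2i)             # reconstructed text feature

# A cycle-consistency loss ties the original and reconstructed features, and a
# code-agreement loss ties the two hash codes; an adversarial loss (omitted)
# would additionally match the generated and real feature distributions.
cycle_loss = nn.functional.mse_loss(txt_rec, txt)
code_loss = nn.functional.mse_loss(h_t2i, h_t)
print(cycle_loss.item(), code_loss.item())
```

The i2i cycle would mirror this sketch starting from image features, and coupling the two cycles lets both groups of hash codes be optimized jointly, as the abstract states.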
Key words
dual, cross-modal