
Visual Paraphrase Generation with Key Information Retained

ACM Transactions on Multimedia Computing, Communications, and Applications (2023)

Abstract
The visual paraphrase generation task aims to rewrite an original sentence that describes an image into a new paraphrase that expresses the same meaning in a different form. Existing studies mainly extract two semantic vectors, one representing the entire image and one representing the entire original sentence, and generate the paraphrase from them. However, such whole-image and whole-sentence vectors can cause the model to lose focus on key objects in the original sentence, so that it alters key object information and generates semantically inconsistent sentences. In this article, we propose an object-level paraphrase generation model, which generates paraphrases by adjusting the permutation of key objects and modifying their associated descriptions. To adjust the permutation of key objects, an object-sorting module produces a new object sequence from the key object information and the original sentence. Then, a sequence generation module generates the paraphrase step by step, following the order of the new object sequence. Each generation step attends to the image features associated with a different key object, so that the generated descriptions differ from the original ones. Furthermore, we use a semantic discriminator module to keep the generated paraphrase semantically close to the original sentence. Specifically, the loss function of the discriminator penalizes an excessive distance between the paraphrase and the original sentence. Extensive experiments on the MS COCO dataset show that the proposed model outperforms the baselines.
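The abstract does not give the form of the discriminator loss. As a rough illustration only, the idea of penalizing excessive distance could be sketched as a margin-based penalty on the distance between two sentence embeddings. The cosine distance, the `margin` value, and the function names below are assumptions made for this sketch, not the authors' formulation.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def distance_penalty(original_vec, paraphrase_vec, margin=0.2):
    """Hypothetical hinge-style penalty.

    The penalty is zero while the paraphrase embedding stays within
    `margin` of the original embedding, and grows linearly once the
    distance exceeds that margin. The margin of 0.2 is an arbitrary
    illustrative choice.
    """
    return max(0.0, cosine_distance(original_vec, paraphrase_vec) - margin)
```

Under this sketch, a paraphrase whose embedding matches the original incurs no penalty, while one whose embedding is orthogonal to the original incurs a penalty of one minus the margin. The margin leaves room for the differences in wording that a paraphrase is supposed to have.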
Key words
Multimodal, visual paraphrase generation, VisualBERT