A Multimodal Fusion Scene Graph Generation Method Based on Semantic Description

2022 IEEE 8th International Conference on Cloud Computing and Intelligent Systems (CCIS)(2022)

Abstract
For the scene graph generation task, a multimodal fusion method based on semantic descriptions is proposed to address the long-tail distribution of relationships and the low frequency of high-level semantic interactions in the dataset. First, object detection and relationship inference are performed on the image to construct an image scene graph. Second, the semantic descriptions are parsed by a pre-trained scene graph parser into semantic scene graphs. Finally, the two scene graphs are explicitly aligned and the information on nodes and edges is updated, yielding a fused scene graph with more comprehensive coverage and more accurate semantic interaction information.
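The fusion step described above can be sketched in miniature. The following is an illustrative assumption, not the paper's implementation: scene graphs are represented as sets of node labels plus (subject, predicate, object) triples, nodes are aligned by matching labels, and when both graphs relate the same node pair, the predicate from the semantic (description-derived) graph is kept, reflecting the paper's goal of recovering higher-level semantic interactions.

```python
# Illustrative sketch only: fuse an image scene graph with a semantic scene
# graph by aligning nodes on their labels and merging relation triples.
# The alignment rule (label match) and conflict rule (semantic graph wins)
# are assumptions for illustration, not the method described in the paper.

def fuse_scene_graphs(image_graph, semantic_graph):
    """Each graph is (nodes, edges): a set of node labels and a set of
    (subject, predicate, object) triples over those labels."""
    img_nodes, img_edges = image_graph
    sem_nodes, sem_edges = semantic_graph

    # Align: nodes with the same label are treated as the same entity.
    fused_nodes = img_nodes | sem_nodes

    # Index edges by (subject, object) so conflicting predicates can be resolved.
    fused = {(s, o): p for (s, p, o) in img_edges}
    for s, p, o in sem_edges:
        fused[(s, o)] = p  # description-derived predicate overrides visual inference

    fused_edges = {(s, p, o) for (s, o), p in fused.items()}
    return fused_nodes, fused_edges


# Toy example: the image graph sees only spatial proximity ("near"),
# while the description supplies the semantic interaction ("riding").
image = ({"man", "horse", "hat"},
         {("man", "near", "horse"), ("man", "wearing", "hat")})
semantic = ({"man", "horse"},
            {("man", "riding", "horse")})

nodes, edges = fuse_scene_graphs(image, semantic)
print(sorted(nodes))   # all entities from both graphs
print(sorted(edges))   # "near" replaced by the more informative "riding"
```

The point of the sketch is the update rule: the fused graph keeps the broader node coverage of the image graph while letting the description-derived edges sharpen low-frequency interaction predicates.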
Key words
Scene graph, Semantic description, Multimodal fusion