MyGO: Discrete Modality Information as Fine-Grained Tokens for Multi-modal Knowledge Graph Completion
arXiv (2024)
Abstract
Multi-modal knowledge graphs (MMKG) store structured world knowledge
containing rich multi-modal descriptive information. To overcome their inherent
incompleteness, multi-modal knowledge graph completion (MMKGC) aims to discover
unobserved knowledge from given MMKGs, leveraging both structural information
from the triples and multi-modal information of the entities. Existing MMKGC
methods usually extract multi-modal features with pre-trained models and employ
a fusion module to integrate multi-modal features with triple prediction.
However, this often results in a coarse handling of multi-modal data,
overlooking the nuanced, fine-grained semantic details and their interactions.
To tackle this shortfall, we introduce a novel framework, MyGO, to process, fuse,
and augment the fine-grained modality information from MMKGs. MyGO tokenizes
multi-modal raw data as fine-grained discrete tokens and learns entity
representations with a cross-modal entity encoder. To further augment the
multi-modal representations, MyGO incorporates fine-grained contrastive
learning to highlight the specificity of the entity representations.
Experiments on standard MMKGC benchmarks reveal that our method surpasses 20 of
the latest models, underlining its superior performance. Code and data are
available at https://github.com/zjukg/MyGO
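
The contrastive step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact objective, so this assumes a generic in-batch InfoNCE loss over two views of each entity representation; the function name `info_nce`, the `temperature` value, and the toy vectors are all hypothetical and are not taken from the MyGO code.

```python
import math


def info_nce(anchors, positives, temperature=0.5):
    """In-batch InfoNCE loss (an assumed stand-in, not MyGO's stated objective).

    Row i of `positives` is the positive example for row i of `anchors`;
    every other row of `positives` serves as a negative.
    """
    def normalize(vec):
        # Scale to unit length so the dot product is a cosine similarity.
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]

    total = 0.0
    for i, a_i in enumerate(a):
        logits = [sum(x * y for x, y in zip(a_i, p_j)) / temperature for p_j in p]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability of picking the true positive.
        total += log_denominator - logits[i]
    return total / len(a)


# Two toy entities, each with two views of its representation.
view_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce(view_a, [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = info_nce(view_a, [[0.0, 1.0], [1.0, 0.0]])
```

In this sketch the loss is low when each entity's two views agree and high when they are mismatched, which is the sense in which a contrastive term can make entity representations more distinctive.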