HiGPT: Heterogeneous Graph Language Model
CoRR (2024)
Abstract
Heterogeneous graph learning aims to capture complex relationships and
diverse relational semantics among entities in a heterogeneous graph to obtain
meaningful representations for nodes and edges. Recent advancements in
heterogeneous graph neural networks (HGNNs) have achieved state-of-the-art
performance by considering relation heterogeneity and using specialized message
functions and aggregation rules. However, existing frameworks for heterogeneous
graph learning have limitations in generalizing across diverse heterogeneous
graph datasets. Most of these frameworks follow the "pre-train" and "fine-tune"
paradigm on the same dataset, which restricts their capacity to adapt to new
and unseen data. This raises the question: "Can we generalize heterogeneous
graph models to be well-adapted to diverse downstream learning tasks with
distribution shifts in both node token sets and relation type heterogeneity?"
To tackle these challenges, we propose HiGPT, a general large graph model with a
Heterogeneous graph instruction-tuning paradigm. Our framework enables learning
from arbitrary heterogeneous graphs without the need for any fine-tuning
process from downstream datasets. To handle distribution shifts in
heterogeneity, we introduce an in-context heterogeneous graph tokenizer that
captures semantic relationships in different heterogeneous graphs, facilitating
model adaptation. We incorporate a large corpus of heterogeneity-aware graph
instructions into HiGPT, enabling the model to effectively comprehend
complex relation heterogeneity and distinguish between various types of graph
tokens. Furthermore, we introduce the Mixture-of-Thought (MoT) instruction
augmentation paradigm to mitigate data scarcity by generating diverse and
informative instructions. Through comprehensive evaluations, our proposed
framework demonstrates exceptional generalization performance.