Reliable Data Distillation on Graph Convolutional Network

SIGMOD/PODS '20: International Conference on Management of Data, Portland, OR, USA, June 2020

Abstract
Graph Convolutional Network (GCN) is a widely used method for learning from graph-based data. However, it does not exploit unlabeled data to its full potential, which limits its performance. Given pseudo labels for the unlabeled data, a GCN can benefit from this extra supervision. Building on knowledge distillation and ensemble learning, many methods use a teacher-student architecture to better exploit the unlabeled data and thus improve prediction. However, these methods introduce unnecessary training costs and bias the student model heavily when the teacher's predictions are unreliable. Moreover, the final ensemble gains are limited by the low diversity of the combined models. Therefore, we propose Reliable Data Distillation, a reliable-data-driven semi-supervised GCN training method. By defining node reliability and edge reliability in a graph, we make better use of high-quality data and improve graph representation learning. Furthermore, accounting for both data reliability and data importance, we propose a new ensemble learning method for GCNs and a novel Self-Boosting SSL Framework that combines the above optimizations. Finally, our extensive evaluation of Reliable Data Distillation on real-world datasets shows that our approach outperforms state-of-the-art methods on semi-supervised node classification tasks.
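To make the teacher-student idea in the abstract concrete, the sketch below shows confidence-thresholded pseudo-label distillation for a GCN, assuming PyTorch. This is a minimal illustration of the general technique, not the paper's actual implementation: the `GCNLayer`/`GCN` classes, the `distill_step` helper, and the threshold `tau` standing in for the paper's node-reliability criterion are all assumptions introduced here.

```python
# Minimal sketch (NOT the paper's implementation) of teacher-student
# pseudo-label training for a GCN. Assumes PyTorch; names such as
# GCNLayer, distill_step, and tau are illustrative, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GCNLayer(nn.Module):
    """One graph convolution: H' = A_hat @ H @ W, with A_hat a normalized adjacency."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, a_hat, h):
        return a_hat @ self.linear(h)

class GCN(nn.Module):
    """Two-layer GCN producing per-node class logits."""
    def __init__(self, in_dim, hid_dim, n_classes):
        super().__init__()
        self.gc1 = GCNLayer(in_dim, hid_dim)
        self.gc2 = GCNLayer(hid_dim, n_classes)

    def forward(self, a_hat, x):
        return self.gc2(a_hat, F.relu(self.gc1(a_hat, x)))

def distill_step(teacher, student, a_hat, x, y, labeled_mask, optimizer, tau=0.9):
    """One training step: keep only high-confidence teacher pseudo-labels
    on unlabeled nodes (a simple stand-in for 'reliable' nodes), then
    supervise the student on labeled plus pseudo-labeled nodes."""
    with torch.no_grad():
        probs = F.softmax(teacher(a_hat, x), dim=1)
        conf, pseudo = probs.max(dim=1)
        # Reliable unlabeled nodes: teacher confidence above threshold tau.
        reliable = (conf > tau) & ~labeled_mask

    logits = student(a_hat, x)
    loss = F.cross_entropy(logits[labeled_mask], y[labeled_mask])
    if reliable.any():
        loss = loss + F.cross_entropy(logits[reliable], pseudo[reliable])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this simplified view, raising `tau` trades pseudo-label coverage for reliability; the paper's contribution is a more principled notion of node and edge reliability than a single confidence cutoff.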
Keywords
Graph Convolutional Network, Knowledge Distillation, Ensemble Learning, Semi-Supervised Learning