Transferring Inter-Class Correlation for Teacher–Student frameworks with flexible models

Knowledge-Based Systems (2022)

Abstract
The Teacher–Student (T–S) framework is widely utilized in classification tasks, through which the performance of one neural network (the student) can be improved by transferring knowledge from another trained neural network (the teacher). As the transferred knowledge depends on the capacities and structures of the teacher and student networks, how to define knowledge effectively remains an open question. To address this issue, we design a novel and flexible form of transferred knowledge, the Self-Attention based Inter-Class Correlation (ICC) map, which reveals the correlation between every pair of classes in a mini-batch. Based on the ICC map, we propose a T–S framework, Inter-Class Correlation Transfer (ICCT), in which knowledge from a teacher with a higher, equal, or lower capacity than the student can benefit the student's training process. ICCT can be applied flexibly to heterogeneous network structures in T–S pairs and exhibits excellent compatibility with existing frameworks that transfer hidden-layer knowledge. Notably, our analysis of ICCT demonstrates that students comprehensively learn the teacher's knowledge in conjunction with their own understanding, rather than mimicking the teacher's knowledge entirely. Extensive experiments are conducted on the CIFAR-10, CIFAR-100, and ILSVRC2012 image classification datasets in different T–S application scenarios with different network structures. The results demonstrate that ICCT improves the student's performance and outperforms other state-of-the-art T–S frameworks.
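The abstract does not give the exact formulation, but the idea of a self-attention-based inter-class correlation map computed over a mini-batch can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: it assumes class prototypes are formed by averaging per-class features in the batch, that the ICC map is a scaled dot-product similarity passed through a row-wise softmax, and that the transfer loss is a KL divergence from the teacher's map to the student's.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def icc_map(features, labels, num_classes):
    """Hypothetical ICC map: average the mini-batch features of each class
    into a prototype, then apply a self-attention-style scaled dot-product
    similarity between prototypes, normalized row-wise with softmax.
    Assumes every class appears at least once in the mini-batch."""
    d = features.shape[1]
    protos = np.stack([features[labels == c].mean(axis=0)
                       for c in range(num_classes)])
    scores = protos @ protos.T / np.sqrt(d)   # C x C class-pair similarity
    return softmax(scores, axis=1)            # row-stochastic ICC map

def icc_transfer_loss(student_map, teacher_map, eps=1e-8):
    """Row-wise KL divergence from the teacher's ICC map to the student's,
    used here as a stand-in for the transfer objective."""
    return float(np.sum(teacher_map * (np.log(teacher_map + eps)
                                       - np.log(student_map + eps))))
```

Because the map is computed over class pairs rather than raw activations, its shape depends only on the number of classes, which is what allows the sketch above to compare a teacher and student with heterogeneous architectures or capacities.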
Keywords
Teacher–Student framework, Model distillation, Transferring knowledge, Self-Attention mechanism