Hierarchical knowledge amalgamation with dual discriminative feature alignment

Information Sciences (2022)

Abstract
Heterogeneous Knowledge Amalgamation (HKA) algorithms attempt to learn a versatile and lightweight student neural network from multiple pre-trained heterogeneous teachers. They encourage the student not only to produce the same predictions as the teachers but also to imitate each teacher's features separately in a learned Common Feature Space (CFS) by using Maximum Mean Discrepancy (MMD). However, there is no theoretical guarantee of the out-of-distribution robustness of teacher models in the CFS, which can cause feature representations to overlap when unknown-category samples are mapped. Furthermore, global-alignment MMD can easily result in negative transfer because it considers neither class-level alignment nor the relationships among all teachers. To overcome these issues, we propose a Dual Discriminative Feature Alignment (DDFA) framework, consisting of a Discriminative Centroid Clustering Strategy (DCCS) and a Joint Group Feature Alignment method (JGFA). DCCS promotes the class-separability of the teachers' features to alleviate the overlap issue. Meanwhile, JGFA decouples the complex discrepancy among teachers and the student at both the category and group levels, extending MMD to align the features discriminatively. We test our model on a range of benchmarks and demonstrate that the learned student is robust and even outperforms its teachers in most cases.
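As background for the alignment loss the abstract refers to, the following is a minimal sketch of the empirical Maximum Mean Discrepancy (MMD) between two feature batches with an RBF kernel. It illustrates only the standard global MMD that DDFA extends, not the paper's class- or group-level variants; the bandwidth choice and NumPy implementation are illustrative assumptions, not details from the paper.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    # Pairwise RBF kernel matrix: k(x, y) = exp(-gamma * ||x - y||^2)
    sq_dists = (np.sum(a**2, axis=1)[:, None]
                + np.sum(b**2, axis=1)[None, :]
                - 2.0 * a @ b.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def mmd2(x, y, gamma=1.0):
    # Biased empirical estimate of squared MMD between samples x and y:
    # mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())

rng = np.random.default_rng(0)
same = mmd2(rng.normal(0, 1, (200, 8)), rng.normal(0, 1, (200, 8)))
diff = mmd2(rng.normal(0, 1, (200, 8)), rng.normal(3, 1, (200, 8)))
print(same < diff)  # a larger distribution gap yields a larger MMD
```

Minimizing such a term pulls the student's features toward a teacher's distribution globally; the abstract's point is that doing this without class-level structure can align mismatched classes and cause negative transfer.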
Keywords
Heterogeneous knowledge amalgamation, Hierarchical common feature learning, Domain adaptation, Maximum mean discrepancy, Knowledge distillation