Hierarchical knowledge amalgamation with dual discriminative feature alignment

Information Sciences (2022)

Abstract
Heterogeneous Knowledge Amalgamation (HKA) algorithms attempt to learn a versatile and lightweight student neural network from multiple pre-trained heterogeneous teachers. They encourage the student not only to produce the same predictions as the teachers but also to imitate each teacher's features separately in a learned Common Feature Space (CFS) by using Maximum Mean Discrepancy (MMD). However, there is no theoretical guarantee of the out-of-distribution robustness of teacher models in the CFS, which can cause feature representations to overlap when unknown-category samples are mapped. Furthermore, global-alignment MMD can easily result in negative transfer because it considers neither class-level alignment nor the relationships among all teachers. To overcome these issues, we propose a Dual Discriminative Feature Alignment (DDFA) framework, consisting of a Discriminative Centroid Clustering Strategy (DCCS) and a Joint Group Feature Alignment method (JGFA). DCCS promotes the class-separability of the teachers' features to alleviate the overlap issue. Meanwhile, JGFA decouples the complex discrepancy among teachers and the student at both the category and group levels, extending MMD to align the features discriminatively. We test our model on a range of benchmarks and demonstrate that the learned student is robust and even outperforms its teachers in most cases.
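As background for the alignment loss the abstract refers to, the following is a minimal sketch of the empirical Maximum Mean Discrepancy (MMD) between two feature batches with an RBF kernel. It illustrates only the standard global MMD that DDFA extends, not the paper's class- or group-level variants; the bandwidth choice and NumPy implementation are illustrative assumptions, not details from the paper.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    # Pairwise RBF kernel matrix: k(x, y) = exp(-gamma * ||x - y||^2)
    sq_dists = (np.sum(a**2, axis=1)[:, None]
                + np.sum(b**2, axis=1)[None, :]
                - 2.0 * a @ b.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def mmd2(x, y, gamma=1.0):
    # Biased empirical estimate of squared MMD between samples x and y:
    # mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())

rng = np.random.default_rng(0)
same = mmd2(rng.normal(0, 1, (200, 8)), rng.normal(0, 1, (200, 8)))
diff = mmd2(rng.normal(0, 1, (200, 8)), rng.normal(3, 1, (200, 8)))
print(same < diff)  # a larger distribution gap yields a larger MMD
```

Minimizing such a term pulls the student's features toward a teacher's distribution globally; the abstract's point is that doing this without class-level structure can align mismatched classes and cause negative transfer.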
Keywords
Heterogeneous knowledge amalgamation, Hierarchical common feature learning, Domain adaptation, Maximum mean discrepancy, Knowledge distillation