Ambiguity-aware robust teacher (ART): Enhanced self-knowledge distillation framework with pruned teacher network

Pattern Recognition (2023)

Abstract
Self-knowledge distillation (self-KD) methods, which use the student model itself as the teacher model instead of a large and complex teacher model, are currently a subject of active study. Since most previous self-KD approaches relied on the knowledge of a single teacher model, if the teacher model incorrectly predicted confusing samples, poor-quality knowledge was transferred to the student model. Unfortunately, natural images are often ambiguous for teacher models because of multiple objects, mislabeling, or low quality. In this paper, we propose a novel knowledge distillation framework named ambiguity-aware robust teacher knowledge distillation (ART-KD), which provides refined knowledge that reflects the ambiguity of the samples through network pruning. Since the pruned teacher model is obtained simply by copying and pruning the teacher model, no re-training process is needed in ART-KD. The key insight of ART-KD is that, for ambiguous samples, the predictions of the teacher model and the pruned teacher model yield different distributions with low similarity. From these two distributions, we obtain a joint distribution that reflects the ambiguity of the samples and use it as the teacher's knowledge for distillation. We comprehensively evaluate our method on public classification benchmarks, as well as on more challenging benchmarks for fine-grained visual recognition (FGVR), achieving performance superior to state-of-the-art counterparts. (c) 2023 Elsevier Ltd. All rights reserved.
Keywords
teacher network, ambiguity-aware, self-knowledge
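As a rough illustration of the mechanism the abstract describes, the following PyTorch sketch copies and prunes a teacher without re-training and mixes the teacher and pruned-teacher distributions into a per-sample joint target for distillation. This is only a sketch under stated assumptions: the pruning amount, the cosine-similarity measure, the mixing rule, and the helper names (prune_copy, art_kd_loss, temperature) are illustrative choices, not the paper's actual formulation.

```python
# Hedged sketch of the ART-KD idea from the abstract; not the authors' implementation.
import copy
import torch
import torch.nn.functional as F
import torch.nn.utils.prune as prune


def prune_copy(teacher: torch.nn.Module, amount: float = 0.3) -> torch.nn.Module:
    """Copy the teacher and apply unstructured L1 pruning to its Conv/Linear layers.
    No re-training is performed, mirroring the 'copy and prune' step in the abstract.
    The pruning amount and method are assumptions for illustration."""
    pruned = copy.deepcopy(teacher)
    for module in pruned.modules():
        if isinstance(module, (torch.nn.Linear, torch.nn.Conv2d)):
            prune.l1_unstructured(module, name="weight", amount=amount)
    return pruned


def art_kd_loss(student_logits, teacher_logits, pruned_logits, temperature=4.0):
    """Form a joint target distribution from teacher and pruned-teacher predictions,
    weighted per sample by their agreement (a proxy for ambiguity), then distill."""
    p_t = F.softmax(teacher_logits / temperature, dim=1)
    p_p = F.softmax(pruned_logits / temperature, dim=1)

    # Per-sample similarity of the two predictions: low similarity suggests ambiguity.
    sim = F.cosine_similarity(p_t, p_p, dim=1).clamp(0.0, 1.0).unsqueeze(1)

    # One possible mixing rule (assumption): rely on the teacher when both models
    # agree, and back off to an even mixture when they disagree.
    p_joint = sim * p_t + (1.0 - sim) * 0.5 * (p_t + p_p)
    p_joint = p_joint / p_joint.sum(dim=1, keepdim=True)

    # Standard temperature-scaled KL distillation against the joint target.
    log_q_s = F.log_softmax(student_logits / temperature, dim=1)
    return F.kl_div(log_q_s, p_joint, reduction="batchmean") * temperature ** 2
```

In practice one would combine art_kd_loss with the usual cross-entropy on ground-truth labels; the exact weighting between the two terms is another detail not specified in the abstract.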