Ambiguity-aware robust teacher (ART): Enhanced self-knowledge distillation framework with pruned teacher network

Pattern Recognition (2023)

Abstract
Self-knowledge distillation (self-KD) methods, which use the student model itself as the teacher model instead of a large and complex teacher model, are currently a subject of active study. Since most previous self-KD approaches relied on the knowledge of a single teacher model, if the teacher model incorrectly predicted confusing samples, poor-quality knowledge was transferred to the student model. Unfortunately, natural images are often ambiguous for teacher models because of multiple objects, mislabeling, or low quality. In this paper, we propose a novel knowledge distillation framework named ambiguity-aware robust teacher knowledge distillation (ART-KD), which provides refined knowledge that reflects the ambiguity of the samples through network pruning. Since the pruned teacher model is obtained simply by copying and pruning the teacher model, no re-training process is needed in ART-KD. The key insight of ART-KD is that, for ambiguous samples, the predictions of the teacher model and the pruned teacher model yield different distributions with low similarity. From these two distributions, we obtain a joint distribution that reflects the ambiguity of the samples and use it as the teacher's knowledge for distillation. We comprehensively evaluate our method on public classification benchmarks, as well as on more challenging benchmarks for fine-grained visual recognition (FGVR), achieving performance superior to state-of-the-art counterparts. (c) 2023 Elsevier Ltd. All rights reserved.
Keywords
teacher network, ambiguity-aware, self-knowledge
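As a rough illustration of the mechanism the abstract describes, the following PyTorch sketch copies and prunes a teacher without re-training and mixes the teacher and pruned-teacher distributions into a per-sample joint target for distillation. This is only a sketch under stated assumptions: the pruning amount, the cosine-similarity measure, the mixing rule, and the helper names (prune_copy, art_kd_loss, temperature) are illustrative choices, not the paper's actual formulation.

```python
# Hedged sketch of the ART-KD idea from the abstract; not the authors' implementation.
import copy
import torch
import torch.nn.functional as F
import torch.nn.utils.prune as prune


def prune_copy(teacher: torch.nn.Module, amount: float = 0.3) -> torch.nn.Module:
    """Copy the teacher and apply unstructured L1 pruning to its Conv/Linear layers.
    No re-training is performed, mirroring the 'copy and prune' step in the abstract.
    The pruning amount and method are assumptions for illustration."""
    pruned = copy.deepcopy(teacher)
    for module in pruned.modules():
        if isinstance(module, (torch.nn.Linear, torch.nn.Conv2d)):
            prune.l1_unstructured(module, name="weight", amount=amount)
    return pruned


def art_kd_loss(student_logits, teacher_logits, pruned_logits, temperature=4.0):
    """Form a joint target distribution from teacher and pruned-teacher predictions,
    weighted per sample by their agreement (a proxy for ambiguity), then distill."""
    p_t = F.softmax(teacher_logits / temperature, dim=1)
    p_p = F.softmax(pruned_logits / temperature, dim=1)

    # Per-sample similarity of the two predictions: low similarity suggests ambiguity.
    sim = F.cosine_similarity(p_t, p_p, dim=1).clamp(0.0, 1.0).unsqueeze(1)

    # One possible mixing rule (assumption): rely on the teacher when both models
    # agree, and back off to an even mixture when they disagree.
    p_joint = sim * p_t + (1.0 - sim) * 0.5 * (p_t + p_p)
    p_joint = p_joint / p_joint.sum(dim=1, keepdim=True)

    # Standard temperature-scaled KL distillation against the joint target.
    log_q_s = F.log_softmax(student_logits / temperature, dim=1)
    return F.kl_div(log_q_s, p_joint, reduction="batchmean") * temperature ** 2
```

In practice one would combine art_kd_loss with the usual cross-entropy on ground-truth labels; the exact weighting between the two terms is another detail not specified in the abstract.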