Up to Thousands-fold Storage Saving: Towards Efficient Data-Free Distillation of Large-Scale Visual Classifiers

MM '23: Proceedings of the 31st ACM International Conference on Multimedia (2023)

Abstract
Data-Free Knowledge Distillation (DFKD) has started to make breakthroughs in classification tasks on large-scale datasets such as ImageNet-1k. Despite these encouraging results, modern DFKD methods still waste substantial system storage and I/O resources: they either synthesize and store vast amounts of pseudo data or build thousands of generators. In this work, we introduce a storage-efficient scheme called Class-Expanding DFKD (CE-DFKD). It reduces storage costs by orders of magnitude on large-scale tasks using just one or a few generators, without explicitly storing any data. The key to the success of our approach lies in alleviating the generator's mode collapse by expanding its collapse range. Specifically, we first investigate and address the optimization conflict of previous single-generator-based DFKD methods by introducing conditional constraints. We then propose two class-expanding strategies that enrich the conditional information of the generator from both inter-class and intra-class perspectives. With the diversity of generated samples significantly enhanced, CE-DFKD outperforms existing methods by a large margin while achieving up to thousands-fold storage savings. Beyond ImageNet-1k, CE-DFKD is compatible with widely used small-scale datasets and scales to the more complex ImageNet-21k-P dataset, which no prior DFKD method has reported.
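To make the class-conditional idea concrete, the sketch below shows a generic conditional-generator DFKD loop: the generator receives a sampled class label alongside noise, is pushed by a cross-entropy constraint to keep its samples on that class (the "conditional constraint" in the abstract), and is trained adversarially against the student, which in turn distills from the teacher on the synthesized data. This is a minimal illustration under stated assumptions, not the paper's actual CE-DFKD losses or its inter-/intra-class-expanding strategies; all names (`ConditionalGenerator`, `distill_step`), architectures, and hyperparameters are hypothetical.

```python
# Minimal, hypothetical sketch of class-conditional data-free distillation.
# Not the CE-DFKD implementation; losses, shapes, and names are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionalGenerator(nn.Module):
    """Maps (noise, class embedding) to a 32x32 pseudo image."""
    def __init__(self, num_classes, z_dim=128, img_ch=3):
        super().__init__()
        self.embed = nn.Embedding(num_classes, z_dim)
        self.net = nn.Sequential(
            nn.Linear(2 * z_dim, 256 * 8 * 8), nn.Unflatten(1, (256, 8, 8)),
            nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2), nn.Conv2d(256, 128, 3, padding=1),
            nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2), nn.Conv2d(128, img_ch, 3, padding=1),
            nn.Tanh(),
        )

    def forward(self, z, y):
        return self.net(torch.cat([z, self.embed(y)], dim=1))

def distill_step(generator, teacher, student, g_opt, s_opt,
                 num_classes, batch_size=64, z_dim=128, T=4.0):
    """One alternating update: the generator seeks hard but class-consistent
    samples; the student then mimics the teacher on fresh generated data."""
    device = next(student.parameters()).device
    z = torch.randn(batch_size, z_dim, device=device)
    y = torch.randint(0, num_classes, (batch_size,), device=device)

    # Generator update: class-conditional constraint + teacher/student gap.
    fake = generator(z, y)
    t_logits = teacher(fake)
    s_logits = student(fake)
    cond_loss = F.cross_entropy(t_logits, y)       # keep samples on class y
    adv_loss = -F.l1_loss(s_logits, t_logits)      # maximize disagreement
    g_opt.zero_grad(); (cond_loss + adv_loss).backward(); g_opt.step()

    # Student update: standard temperature-scaled KD on new samples.
    with torch.no_grad():
        fake = generator(z, y)
        t_logits = teacher(fake)
    s_logits = student(fake)
    kd_loss = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                       F.softmax(t_logits / T, dim=1),
                       reduction="batchmean") * T * T
    s_opt.zero_grad(); kd_loss.backward(); s_opt.step()
    return kd_loss.item()
```

In a full pipeline the teacher would be frozen (`teacher.eval()`, `requires_grad_(False)`) and `distill_step` alternated for many iterations; because a single conditional generator is conditioned on labels rather than storing synthesized images, only its weights need to be kept, which is where the storage saving over image-caching DFKD schemes comes from.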