Which Pretrain Samples to Rehearse when Finetuning Pretrained Models?
CoRR(2024)
Abstract
Fine-tuning pretrained foundational models on specific tasks is now the de
facto approach for text and vision tasks. A known pitfall of this approach is
the forgetting of pretraining knowledge that happens during finetuning.
Rehearsing samples randomly from the pretrain dataset is a common approach to
alleviate such forgetting. However, we find that random mixing unintentionally
includes samples which are not (yet) forgotten or unlearnable by the model. We
propose a novel sampling scheme, mix-cd, that identifies and prioritizes
samples that actually face forgetting, which we call collateral damage. Since
directly identifying collateral damage samples is computationally expensive, we
propose a procedure to estimate the distribution of such samples by tracking
the statistics of finetuned samples. Our approach is lightweight, easy to
implement, and can be seamlessly integrated into existing models, offering an
effective means to retain pretrain performance without additional computational
costs.
MoreTranslated text
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined