CLAP4CLIP: Continual Learning with Probabilistic Finetuning for Vision-Language Models
CoRR (2024)
Abstract
Continual learning (CL) aims to help deep neural networks to learn new
knowledge while retaining what has been learned. Recently, pre-trained
vision-language models such as CLIP, with powerful generalization ability, have
been gaining traction as practical CL candidates. However, the domain mismatch
between the pre-training and the downstream CL tasks calls for finetuning of
the CLIP on the latter. Because existing finetuning methods are deterministic, they
overlook the many possible interactions across the modalities and are
unreliable for high-risk CL tasks requiring sound uncertainty estimation.
To address these issues, our work proposes Continual LeArning
with Probabilistic finetuning (CLAP). CLAP develops probabilistic modeling over
task-specific modules with visual-guided text features, providing more reliable
fine-tuning in CL. It further alleviates forgetting by exploiting the rich
pre-trained knowledge of CLIP for weight initialization and distribution
regularization of task-specific modules. Cooperating with the diverse range of
existing prompting methods, CLAP can surpass the predominant deterministic
finetuning approaches for CL with CLIP. Lastly, we study the superior
uncertainty estimation abilities of CLAP for novel data detection and exemplar
selection within CL setups. Our code is available at .
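The abstract describes probabilistic modeling over visual-guided text features as the source of CLAP's uncertainty estimates. A minimal sketch of that general idea (not the authors' actual implementation): a head maps a visual-guided text feature to a Gaussian over class logits, and Monte Carlo samples yield both a prediction and a per-class uncertainty. All names (`probabilistic_head`, the additive guidance, the weight matrices) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def probabilistic_head(text_feat, visual_feat, W_mu, W_logvar, n_samples=16):
    # Hypothetical sketch: visual guidance is simplified to feature addition.
    guided = text_feat + visual_feat
    mu = guided @ W_mu                       # mean class logits
    std = np.exp(0.5 * (guided @ W_logvar))  # per-class std dev from log-variance
    # Monte Carlo samples from the predictive distribution over logits.
    samples = mu + std * rng.standard_normal((n_samples, mu.shape[0]))
    probs = softmax(samples, axis=-1)
    # Mean over samples = prediction; variance = uncertainty estimate.
    return probs.mean(axis=0), probs.var(axis=0)

d, n_classes = 8, 3
text_feat = rng.standard_normal(d)
visual_feat = rng.standard_normal(d)
W_mu = rng.standard_normal((d, n_classes))
W_logvar = 0.1 * rng.standard_normal((d, n_classes))
pred, unc = probabilistic_head(text_feat, visual_feat, W_mu, W_logvar)
```

The sampled variance is what a deterministic head cannot provide; thresholding it is one way such a model could flag novel data or rank exemplars, as the abstract studies.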