Disentangling and Mitigating the Impact of Task Similarity for Continual Learning
CoRR (2024)
Abstract
Continual learning of partially similar tasks poses a challenge for
artificial neural networks, as task similarity presents both an opportunity for
knowledge transfer and a risk of interference and catastrophic forgetting.
However, it remains unclear how task similarity in input features and readout
patterns influences knowledge transfer and forgetting, as well as how they
interact with common algorithms for continual learning. Here, we develop a
linear teacher-student model with latent structure and show analytically that
high input feature similarity coupled with low readout similarity is
catastrophic for both knowledge transfer and retention. Conversely, the
opposite scenario is relatively benign. Our analysis further reveals that
task-dependent activity gating improves knowledge retention at the expense of
transfer, while task-dependent plasticity gating does not affect either
retention or transfer performance in the over-parameterized limit. In contrast,
weight regularization based on the Fisher information metric significantly
improves retention, regardless of task similarity, without compromising
transfer performance. Its diagonal approximation and regularization in
Euclidean space, however, are far less robust to task similarity. We
demonstrate consistent results in a permuted MNIST task with latent variables.
Overall, this work provides insight into when continual learning is difficult
and how its difficulties can be mitigated.
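The Fisher-information weight regularization mentioned above can be illustrated with a minimal sketch. This is an illustrative reimplementation, not the paper's code: a linear student trained sequentially on two linear teacher tasks, where an EWC-style quadratic penalty weighted by the diagonal Fisher information anchors the weights that mattered for the first task. All names (`make_task`, `train`, `diag_fisher`) and hyperparameters are assumptions chosen for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n = 20, 5, 200  # illustrative dimensions, not from the paper

def make_task(rng):
    """Generate a linear teacher task: inputs X and targets Y = X @ T."""
    X = rng.normal(size=(n, d_in))
    T = rng.normal(size=(d_in, d_out))
    return X, X @ T

def diag_fisher(X):
    """For a linear-Gaussian model, the diagonal Fisher information of
    weight W_ij is proportional to the second moment of input feature i."""
    f = (X ** 2).mean(axis=0)               # shape (d_in,)
    return np.tile(f[:, None], (1, d_out))  # shape (d_in, d_out)

def train(X, Y, W, lam=0.0, W_anchor=None, F=None, lr=1e-2, steps=2000):
    """Gradient descent on MSE, optionally with an EWC-style diagonal
    penalty lam * F * (W - W_anchor)**2 pulling W toward W_anchor."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ W - Y) / len(X)
        if W_anchor is not None:
            grad += 2 * lam * F * (W - W_anchor)
        W = W - lr * grad
    return W

def mse(X, Y, W):
    return float(((X @ W - Y) ** 2).mean())

# Train on task A, then on task B with and without the Fisher penalty.
XA, YA = make_task(rng)
XB, YB = make_task(rng)

WA = train(XA, YA, np.zeros((d_in, d_out)))
F = diag_fisher(XA)

W_plain = train(XB, YB, WA)                           # plain fine-tuning
W_ewc = train(XB, YB, WA, lam=1.0, W_anchor=WA, F=F)  # Fisher-regularized
```

After training on task B, retention on task A (measured by `mse(XA, YA, ·)`) is typically much better for `W_ewc` than for `W_plain`, at the cost of a small residual error on task B, which is the transfer/retention trade-off the abstract analyzes.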