Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training
CoRR (2024)
Abstract
We explore the training dynamics of neural networks in a structured non-IID
setting where documents are presented cyclically in a fixed, repeated sequence.
Typically, networks suffer from catastrophic interference when training on a
sequence of documents; however, we discover a curious and remarkable property
of LLMs fine-tuned sequentially in this setting: they exhibit anticipatory
behavior, recovering from forgetting on documents before encountering them
again. This behavior emerges and becomes more robust as the number of model
parameters scales up. Through comprehensive experiments and
visualizations, we uncover new insights into training over-parameterized
networks in structured environments.
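
To make the cyclic setting concrete, the following is a minimal sketch of a fixed, repeated document order. It is illustrative only and is not the authors' code: the names `cyclic_schedule` and `episodes_until_revisit`, and the `steps_per_doc` parameter, are assumptions introduced here.

```python
from typing import Iterator, Tuple


def cyclic_schedule(
    num_docs: int, num_epochs: int, steps_per_doc: int = 1
) -> Iterator[Tuple[int, int, int]]:
    """Yield (epoch, doc_index, step) in a fixed, repeated document order.

    steps_per_doc is a hypothetical knob: how many consecutive gradient
    steps are taken on one document before moving on to the next.
    """
    for epoch in range(num_epochs):
        for doc in range(num_docs):
            for step in range(steps_per_doc):
                yield epoch, doc, step


def episodes_until_revisit(current_doc: int, target_doc: int, num_docs: int) -> int:
    """Number of document episodes after current_doc until target_doc is trained on again."""
    gap = (target_doc - current_doc) % num_docs
    # A document that was just trained on is revisited one full cycle later.
    return gap if gap != 0 else num_docs


if __name__ == "__main__":
    # Three documents, two passes: the order 0, 1, 2 repeats unchanged.
    for epoch, doc, step in cyclic_schedule(num_docs=3, num_epochs=2):
        print(f"epoch={epoch} doc={doc} step={step}")
```

In this framing, anticipatory recovery would show up as the loss on `target_doc` starting to fall while `episodes_until_revisit` is still greater than zero, that is, before the document itself is trained on again. Evaluating the model at each step of the schedule is left to the reader's own training loop.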