The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models
arXiv (2024)
Abstract
Prior work has found that pretrained language models (LMs) fine-tuned with
different random seeds can achieve similar in-domain performance but generalize
differently on tests of syntactic generalization. In this work, we show that,
even within a single model, we can find multiple subnetworks that perform
similarly in-domain, but generalize vastly differently. To better understand
these phenomena, we investigate whether they can be explained in terms of
"competing subnetworks": the model initially represents a variety of distinct
algorithms, corresponding to different subnetworks, and generalization occurs
when it ultimately converges to one. This explanation has been used to account
for generalization in simple algorithmic tasks. Instead of finding competing
subnetworks, we find that all subnetworks – whether they generalize or not –
share a set of attention heads, which we refer to as the heuristic core.
Further analysis suggests that these attention heads emerge early in training
and compute shallow, non-generalizing features. The model learns to generalize
by incorporating additional attention heads, which depend on the outputs of the
"heuristic" heads to compute higher-level features. Overall, our results offer
a more detailed picture of the mechanisms for syntactic generalization in
pretrained LMs.