Learning or Self-aligning? Rethinking Instruction Fine-tuning
CoRR (2024)
Abstract
Instruction Fine-tuning (IFT) is a critical phase in building large language
models (LLMs). Previous works mainly focus on IFT's role in the transfer of
behavioral norms and the learning of additional world knowledge. However, the
understanding of the underlying mechanisms of IFT remains significantly
limited. In this paper, we design a knowledge intervention framework to
decouple the potential underlying factors of IFT, thereby enabling individual
analysis of different factors. Surprisingly, our experiments reveal that
attempting to learn additional world knowledge through IFT often struggles to
yield positive impacts and can even lead to markedly negative effects. Further,
we discover that maintaining internal knowledge consistency before and after
IFT is a critical factor for achieving successful IFT. Our findings reveal the
underlying mechanisms of IFT and provide robust support for some very recent
and potential future works.