BiLoRA: A Bi-level Optimization Framework for Overfitting-Resilient Low-Rank Adaptation of Large Pre-trained Models
arXiv (2024)
Abstract
Low-rank adaptation (LoRA) is a popular method for fine-tuning large-scale
pre-trained models on downstream tasks by learning low-rank incremental
matrices. Though LoRA and its variants effectively reduce the number of
trainable parameters compared to full fine-tuning, they often overfit the
training data, resulting in suboptimal generalization on test data. To address
this problem, we introduce BiLoRA, an overfitting-alleviating fine-tuning
approach based on bi-level optimization (BLO). BiLoRA employs pseudo singular
value decomposition to parameterize low-rank incremental matrices and splits
the training of pseudo singular vectors and values across two different subsets
of training data. This division, embedded within separate levels of the BLO
framework, mitigates the risk of overfitting to a single dataset. Tested on ten
datasets covering natural language understanding and generation tasks and
applied to various well-known large pre-trained models, BiLoRA significantly
outperforms LoRA methods and other fine-tuning approaches while using a
similar number of trainable parameters.
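As a rough illustration of the setup the abstract describes, the sketch below parameterizes the low-rank increment as a pseudo-SVD and alternates updates of the pseudo singular vectors and values on two disjoint training subsets. This is a minimal sketch under our own assumptions: the names (`PseudoSVDLoRA`, `bilevel_step`, `opt_vectors`, `opt_values`) are hypothetical, and the alternating loop is only a crude stand-in for the paper's actual bi-level optimization.

```python
import torch
import torch.nn as nn

class PseudoSVDLoRA(nn.Module):
    """Low-rank increment in pseudo-SVD form: delta_W = P @ diag(lam) @ Q.

    P and Q play the role of pseudo singular vectors, lam the pseudo
    singular values. Names are illustrative, not the authors' code.
    """

    def __init__(self, base: nn.Linear, rank: int = 8, scale: float = 1.0):
        super().__init__()
        self.base = base
        d_out, d_in = base.weight.shape
        self.P = nn.Parameter(torch.randn(d_out, rank) * 0.01)  # pseudo left singular vectors
        self.Q = nn.Parameter(torch.randn(rank, d_in) * 0.01)   # pseudo right singular vectors
        self.lam = nn.Parameter(torch.zeros(rank))               # pseudo singular values
        self.scale = scale

    def forward(self, x):
        # (P * lam) broadcasts lam over the columns of P, i.e. P @ diag(lam).
        delta_w = (self.P * self.lam) @ self.Q
        return self.base(x) + self.scale * (x @ delta_w.T)


def bilevel_step(layer, loss_fn, batch_lower, batch_upper,
                 opt_vectors, opt_values):
    """One round of the two-level split, approximated by alternating updates:
    pseudo singular vectors (P, Q) are fit on one training subset, pseudo
    singular values (lam) on the other, so neither parameter group is
    tuned against a single subset of the training data."""
    # Lower level: update P and Q on the first training subset.
    opt_vectors.zero_grad()
    loss_fn(layer, batch_lower).backward()
    opt_vectors.step()

    # Upper level: update lam on the second training subset.
    opt_values.zero_grad()
    loss_fn(layer, batch_upper).backward()
    opt_values.step()
```

In practice the two optimizers would be built over disjoint parameter groups, e.g. `[layer.P, layer.Q]` and `[layer.lam]`; a faithful bi-level solver would also propagate hypergradients through the lower-level update rather than simply alternating, as done here for brevity.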