It's Morphing Time: Unleashing the Potential of Multiple LLMs via Multi-objective Optimization
arXiv (2024)
Abstract
In this paper, we introduce a novel approach for large language model merging
via black-box multi-objective optimization algorithms. The goal of model
merging is to combine multiple models, each excelling in different tasks, into
a single model that outperforms any of the individual source models. However,
model merging faces two significant challenges. First, existing methods rely
heavily on human intuition and customized strategies. Second, parameter
conflicts often arise during merging, and while methods like DARE [1] can
alleviate this issue, they tend to stochastically drop parameters, risking the
loss of important delta parameters. To address these challenges, we propose the
MM-MO method, which automates the search for optimal merging configurations
using multi-objective optimization algorithms, eliminating the need for human
intuition. During the configuration search, we use estimated performance
across multiple diverse tasks as the optimization objectives, alleviating
parameter conflicts between different source models without
losing crucial delta parameters. We conducted comparative experiments with
other mainstream model merging methods, demonstrating that our method
consistently outperforms them. Moreover, our experiments reveal that even task
types not explicitly targeted as optimization objectives show performance
improvements, indicating that our method enhances the overall potential of the
model rather than merely overfitting to specific task types. This approach
provides a significant advancement in model merging techniques, offering a
robust and plug-and-play solution for integrating diverse models into a
unified, high-performing model.
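The abstract does not spell out the algorithmic details, but the core idea is a black-box search over merging configurations scored on several task objectives at once. The following is a minimal sketch, not the authors' implementation: it assumes two source models whose delta parameters (fine-tuned minus base weights) are merged with per-model scaling weights plus a DARE-style drop-and-rescale rate, uses toy proxy objectives in place of real benchmark estimates, and substitutes plain random search for a proper multi-objective optimizer. All names (merge, evaluate, pareto_front) are hypothetical.

```python
# Sketch of multi-objective search over model-merging configurations.
# Hypothetical setup: merge delta parameters from two source models with
# per-model scaling weights and a DARE-style drop rate, then keep the
# Pareto-optimal configurations under several objectives.
import numpy as np

rng = np.random.default_rng(0)

DIM = 1000  # toy parameter count
base = rng.normal(size=DIM)                                   # base model weights
deltas = [rng.normal(scale=0.1, size=DIM) for _ in range(2)]  # two source models' deltas

def merge(config):
    """Merge delta parameters into the base model.

    config = (w1, w2, drop): per-model scaling weights plus a DARE-style
    drop rate (randomly zero delta entries, rescale survivors).
    """
    w1, w2, drop = config
    merged = base.copy()
    for wi, d in zip((w1, w2), deltas):
        mask = rng.random(DIM) >= drop             # keep each entry w.p. 1 - drop
        merged += wi * (d * mask) / (1.0 - drop)   # rescale to preserve expectation
    return merged

def evaluate(params):
    """Toy proxy objectives; in the paper these would be estimated scores
    on multiple diverse benchmark tasks (higher is better)."""
    score_a = -np.linalg.norm(params - (base + deltas[0]))  # closeness to expert A
    score_b = -np.linalg.norm(params - (base + deltas[1]))  # closeness to expert B
    return np.array([score_a, score_b])

def pareto_front(scores):
    """Indices of non-dominated points, maximizing all objectives."""
    keep = []
    for i, s in enumerate(scores):
        dominated = any(np.all(t >= s) and np.any(t > s)
                        for j, t in enumerate(scores) if j != i)
        if not dominated:
            keep.append(i)
    return keep

# Black-box search: sample random configurations, keep the Pareto set.
configs = [(rng.uniform(0, 1), rng.uniform(0, 1), rng.uniform(0, 0.9))
           for _ in range(64)]
scores = [evaluate(merge(c)) for c in configs]
for i in pareto_front(scores):
    w1, w2, drop = configs[i]
    print(f"w1={w1:.2f} w2={w2:.2f} drop={drop:.2f} scores={scores[i]}")
```

In the actual method, the random sampling loop would be replaced by a multi-objective optimization algorithm and evaluate() by performance estimates on diverse tasks, so that the search itself, rather than human intuition, resolves the trade-offs between source models.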