Distributional Preference Alignment of LLMs via Optimal Transport
arXiv (2024)
Abstract
Current LLM alignment techniques use pairwise human preferences at the sample
level, and as such they do not imply alignment at the distributional level.
In this paper we propose Alignment via Optimal Transport (AOT), a novel method
for distributional preference alignment of LLMs. AOT aligns LLMs on unpaired
preference data by making the reward distribution of the positive samples
first-order stochastically dominant over the distribution of the negative
samples. We introduce a convex relaxation of this first-order stochastic
dominance and cast it as an optimal transport problem with a smooth and convex
cost. Thanks to the one-dimensional nature of the resulting optimal transport
problem and the convexity of the cost, it has a closed-form solution via
sorting on empirical measures. We fine-tune LLMs with this AOT objective, which
enables alignment by penalizing violations of the stochastic dominance of
the reward distribution of the positive samples over the reward distribution of
the negative samples. We analyze the sample complexity of AOT by considering
the dual of the OT problem and show that it converges at the parametric rate.
Empirically, we show on a diverse set of alignment datasets and LLMs that AOT
leads to state-of-the-art models in the 7B family of models when evaluated with
Open LLM Benchmarks and AlpacaEval.
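The closed-form structure the abstract describes can be illustrated in a few lines: sorting two equal-sized empirical reward samples pairs matching quantiles, which is the closed-form solution of one-dimensional optimal transport with a convex cost, and a smooth convex surrogate then penalizes quantiles where dominance is violated. The sketch below is an assumption-laden illustration of this idea (the softplus cost, the `margin` parameter, and the equal-sample-size restriction are choices made here for clarity, not necessarily the paper's exact objective):

```python
import numpy as np

def aot_violation_loss(pos_rewards, neg_rewards, margin=0.0):
    """Sketch of a 1-D OT penalty for first-order stochastic dominance.

    Sorting both reward samples matches the i-th quantile of the
    positive distribution with the i-th quantile of the negative one
    (the closed-form 1-D OT coupling for a convex cost). Dominance is
    violated at a quantile when the negative reward plus a margin
    exceeds the positive reward; a smooth convex softplus cost turns
    those violations into a differentiable loss.
    """
    pos = np.sort(np.asarray(pos_rewards, dtype=float))
    neg = np.sort(np.asarray(neg_rewards, dtype=float))
    assert pos.shape == neg.shape, "sketch assumes equal sample sizes"
    # Positive gap at a matched quantile means dominance is broken there.
    gap = neg + margin - pos
    # Smooth convex surrogate: softplus(gap) = log(1 + exp(gap)).
    return float(np.mean(np.logaddexp(0.0, gap)))

# When the positive rewards dominate quantile-by-quantile, the loss is
# small; when the negative rewards dominate, it grows.
dominant = aot_violation_loss([1.0, 2.0, 3.0], [0.0, 0.5, 1.0])
violated = aot_violation_loss([0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
```

Note that the data need not be paired: each loss term compares quantiles of the two sorted reward samples, not a chosen/rejected response pair, which is what makes the criterion distributional.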