Preference Optimization with Multi-Sample Comparisons. Chaoqi Wang, Zhuokai Zhao, Chen Zhu, Karthik Abinav Sankararaman, Michal Valko, Xuefei Cao, Zhaorun Chen, Madian Khabsa, Yuxin Chen, Hao Ma, Sinong Wang. CoRR (2024).