Consolidating Ranking and Relevance Predictions of Large Language Models through Post-Processing
arXiv (2024)
Abstract
The powerful generative abilities of large language models (LLMs) show
potential in generating relevance labels for search applications. Previous work
has found that directly asking about relevance, such as "How relevant is
document A to query Q?", results in sub-optimal ranking. Instead, the pairwise
ranking prompting (PRP) approach produces promising ranking performance through
asking about pairwise comparisons, e.g., "Is document A more relevant than
document B to query Q?". Thus, while LLMs are effective rankers, this ability
is not reflected in the relevance labels they generate. In this
work, we propose a post-processing method to consolidate the relevance labels
generated by an LLM with its powerful ranking abilities. Our method takes both
LLM-generated relevance labels and pairwise preferences as input. The labels are then
altered to satisfy the pairwise preferences of the LLM, while staying as close
to the original values as possible. Our experimental results indicate that our
approach effectively balances label accuracy and ranking performance. Our work
thereby shows that it is possible to combine both the ranking and labeling
abilities of LLMs through post-processing.
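
The abstract does not spell out the optimization, so the following is only an illustrative sketch of the idea, not the authors' implementation. It assumes that "as close as possible" means least squared distance and that each preference (i, j) is the hard constraint "the label of document i is at least the label of document j". The function name consolidate_labels and the choice of Dykstra's alternating projection algorithm are assumptions made for this example.

```python
def consolidate_labels(labels, prefs, n_iter=1000):
    """Adjust labels so that x[i] >= x[j] for every (i, j) in prefs,
    while staying close to the original labels in squared distance.

    Hypothetical helper: uses Dykstra's projection algorithm over the
    half-spaces {x : x[i] >= x[j]}, one per pairwise preference.
    """
    x = [float(v) for v in labels]
    # One correction term per constraint; only coordinates i and j are nonzero.
    corr = [(0.0, 0.0) for _ in prefs]
    for _ in range(n_iter):
        for k, (i, j) in enumerate(prefs):
            ci, cj = corr[k]
            # Add back the correction from the previous visit to this constraint.
            zi, zj = x[i] + ci, x[j] + cj
            if zi < zj:
                # Constraint violated: the projection sets both to their mean.
                pi = pj = (zi + zj) / 2.0
            else:
                # Constraint already satisfied: the projection is the identity.
                pi, pj = zi, zj
            corr[k] = (zi - pi, zj - pj)
            x[i], x[j] = pi, pj
    return x


# Document 0 is preferred over document 1, but its label is lower.
adjusted = consolidate_labels([1.0, 3.0, 2.0], [(0, 1)])
print(adjusted)
```

In this toy case the two conflicting labels meet at their mean, while the label of the unconstrained document is left untouched. Cyclic preferences (A over B, B over C, C over A) would force all labels in the cycle to be equal under this formulation; how the paper handles such cases is not stated in the abstract.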