RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation
arXiv (2024)
Abstract
The retrieval-augmented generation (RAG) framework has shown state-of-the-art
performance on open-domain question answering tasks by referencing external
knowledge. However, RAG performance degrades when the system is fed contexts
of low relevance or when the relative relevance among the input contexts is
inaccurately assessed. In this work, we propose RE-RAG, a framework that
injects an explicit context relevance estimator
(RE) into the RAG system. RE-RAG re-evaluates the retrieved contexts with the
proposed context RE and passes the more relevant contexts along with their
measured importance to the generator. To train the context RE, we propose an
unsupervised learning method that does not use any labeled document ranking
data. To assess the efficacy of RE-RAG, we evaluate its performance on the
Natural Questions (NQ) and TriviaQA (TQA) datasets. RE-RAG
achieves performance on par with the FiD variants while utilizing fewer
contexts (0.25x). We show that the proposed context RE, which was trained with
the T5 model, is also applicable to RAG with LLMs (ChatGPT), improving
performance on NQ (+6.4 EM) and TQA (+2.8 EM). Lastly, we show that RE can add
interpretability to the RAG framework, as the RE score correlates highly with
RE-RAG accuracy. Consequently, RE can be utilized to filter out
unanswerable scenarios where context does not contain answers with 38.9
accuracy just by examining a set of retrieved contexts.
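
The rerank-then-filter flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: in the paper the relevance scores come from a trained T5-based estimator, whereas here they are passed in as plain numbers. The softmax normalization, the fixed threshold rule, and the names `ScoredContext`, `rerank`, and `looks_answerable` are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ScoredContext:
    text: str
    weight: float  # normalized importance handed to the generator (assumed form)


def softmax(scores):
    """Normalize raw scores into weights that sum to 1 (empty input gives [])."""
    if not scores:
        return []
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rerank(contexts, raw_scores, top_k):
    """Keep the top_k contexts by relevance, each paired with its weight.

    raw_scores stands in for the output of a relevance estimator; how the
    real RE produces and normalizes its scores is not specified here.
    """
    weights = softmax(raw_scores)
    ranked = sorted(zip(contexts, weights), key=lambda pair: pair[1], reverse=True)
    return [ScoredContext(text, weight) for text, weight in ranked[:top_k]]


def looks_answerable(raw_scores, threshold):
    """Hypothetical filter: answerable if the best raw score clears a threshold."""
    return max(raw_scores, default=float("-inf")) >= threshold
```

For instance, `rerank(["a", "b", "c"], [0.1, 2.0, 1.0], top_k=2)` keeps contexts "b" and "c" in that order, and `looks_answerable([0.1, 2.0], threshold=1.5)` flags the question as answerable. The threshold value would have to be tuned on held-out data.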