Self-Retrieval: Building an Information Retrieval System with One Large Language Model
arXiv (2024)
Abstract
The rise of large language models (LLMs) has transformed the role of
information retrieval (IR) systems in how humans access information.
Because of their isolated architecture and limited interaction, existing IR
systems cannot fully accommodate the shift from directly providing
information to humans to indirectly serving large language models. In this
paper, we propose Self-Retrieval, an end-to-end, LLM-driven information
retrieval architecture that can fully internalize the required abilities of IR
systems into a single LLM and deeply leverage the capabilities of LLMs during
the IR process. Specifically, Self-Retrieval internalizes the corpus to be
retrieved into an LLM via a natural language indexing architecture. Then the entire
retrieval process is redefined as a procedure of document generation and
self-assessment, which can be executed end-to-end using a single large language
model. Experimental results demonstrate that Self-Retrieval not only
outperforms previous retrieval approaches by a large margin, but
also significantly boosts the performance of LLM-driven downstream
applications such as retrieval-augmented generation.
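
The generate-then-self-assess procedure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: in Self-Retrieval both steps are carried out by a single LLM that has internalized the corpus, whereas here the hypothetical functions `toy_generate` and `toy_assess` are simple word-overlap stand-ins, and `self_retrieve`, its parameters, and the sample corpus are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple


def toy_generate(query: str, corpus: List[str]) -> List[str]:
    """Stand-in for LLM document generation (hypothetical).

    Returns corpus passages that share at least one word with the query.
    A real system would have the model generate the passage text itself.
    """
    query_words = set(query.lower().split())
    return [p for p in corpus if query_words & set(p.lower().split())]


def toy_assess(query: str, passage: str) -> float:
    """Stand-in for LLM self-assessment (hypothetical).

    Scores a passage by the fraction of query words it contains.
    """
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    return len(query_words & passage_words) / len(query_words)


def self_retrieve(
    query: str,
    corpus: List[str],
    generate: Callable[[str, List[str]], List[str]] = toy_generate,
    assess: Callable[[str, str], float] = toy_assess,
    top_k: int = 2,
) -> List[Tuple[float, str]]:
    """Generate candidate passages, score each one, and return the top_k."""
    candidates = generate(query, corpus)
    scored = [(assess(query, passage), passage) for passage in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


corpus = [
    "paris is the capital of france",
    "berlin is the capital of germany",
    "bananas are yellow",
]
results = self_retrieve("capital of france", corpus)
for score, passage in results:
    print(f"{score:.2f}  {passage}")
```

In this sketch the two stages are passed in as plain functions, so either stand-in could be swapped for a call to an actual model without changing the ranking logic.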