Learn to Refuse: Making Large Language Models More Controllable and Reliable through Knowledge Scope Limitation and Refusal Mechanism
arxiv(2023)
Abstract
Large language models (LLMs) have demonstrated impressive language
understanding and generation capabilities, enabling them to answer a wide range
of questions across various domains. However, these models are not flawless and
often produce responses that contain errors or misinformation. These
inaccuracies, commonly referred to as hallucinations, render LLMs unreliable
and even unusable in many scenarios. In this paper, our focus is on mitigating
the issue of hallucination in LLMs, particularly in the context of
question-answering. Instead of attempting to answer all questions, we explore a
refusal mechanism that instructs LLMs to refuse to answer challenging questions
in order to avoid errors. We then propose a simple yet effective solution
called Learn to Refuse (L2R), which incorporates the refusal mechanism to
enable LLMs to recognize and refuse to answer questions that they find
difficult to address. To achieve this, we utilize a structured knowledge base
to represent all the LLM's understanding of the world, enabling it to provide
traceable gold knowledge. This knowledge base is separate from the LLM and
initially empty. It can be filled with validated knowledge and progressively
expanded. When an LLM encounters questions outside its domain, the system
recognizes its knowledge scope and determines whether it can answer the
question independently. Additionally, we introduce a method for automatically
and efficiently expanding the knowledge base of LLMs. Through qualitative and
quantitative analysis, we demonstrate that our approach enhances the
controllability and reliability of LLMs.
MoreTranslated text
AI Read Science
Must-Reading Tree
Example
![](https://originalfileserver.aminer.cn/sys/aminer/pubs/mrt_preview.jpeg)
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined