SelectLLM: Can LLMs Select Important Instructions to Annotate?
CoRR (2024)
Abstract
Training large language models (LLMs) with a large and diverse instruction
dataset aligns the models to comprehend and follow human instructions. Recent
works have shown that using a small set of high-quality instructions can
outperform using larger but noisier sets. Because instructions are unlabeled
and their responses are natural text, traditional active learning schemes with
the model's confidence cannot be directly applied to the selection of unlabeled
instructions. In this work, we propose a novel method for instruction
selection, called SelectLLM, that leverages LLMs for the selection of
high-quality instructions. Our high-level idea is to use LLMs to estimate the
usefulness and impactfulness of each instruction without the corresponding
labels (i.e., responses), via prompting. SelectLLM involves two steps: dividing
the unlabeled instructions into multiple clusters using a clustering algorithm
(e.g., CoreSet), and then prompting LLMs to choose high-quality instructions
within each cluster. SelectLLM performs comparably to, or slightly better than,
recent state-of-the-art selection methods on popular instruction benchmarks.
All code and data are publicly available
(https://github.com/minnesotanlp/select-llm).
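
The two steps described above (cluster the unlabeled instructions, then ask an
LLM to pick within each cluster) can be illustrated with a minimal sketch. This
is not the authors' implementation; see the repository for that. It assumes a
greedy k-center routine as a stand-in for CoreSet-style clustering, precomputed
instruction embeddings, and a hypothetical `ask_llm` callable that takes a
prompt and returns the 1-based number of the chosen instruction. The prompt
wording and all function names are illustrative assumptions.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def k_center_greedy(embeddings: List[Vector], k: int) -> List[int]:
    """Greedy k-center: repeatedly add the point farthest from all chosen centers."""
    if not embeddings or k <= 0:
        return []
    centers = [0]  # assumption: seed with the first point so the result is deterministic
    dist = [math.dist(e, embeddings[0]) for e in embeddings]
    while len(centers) < min(k, len(embeddings)):
        nxt = max(range(len(embeddings)), key=dist.__getitem__)
        if dist[nxt] == 0.0:  # every remaining point coincides with a center
            break
        centers.append(nxt)
        dist = [min(d, math.dist(e, embeddings[nxt])) for d, e in zip(dist, embeddings)]
    return centers


def assign_clusters(embeddings: List[Vector], centers: List[int]) -> Dict[int, List[int]]:
    """Map each center index to the indices of the points closest to it."""
    clusters: Dict[int, List[int]] = {c: [] for c in centers}
    for i, e in enumerate(embeddings):
        nearest = min(centers, key=lambda c: math.dist(e, embeddings[c]))
        clusters[nearest].append(i)
    return clusters


def build_prompt(candidates: List[str]) -> str:
    """Hypothetical prompt; the wording is an assumption, not taken from the paper."""
    numbered = "\n".join(f"{i + 1}. {text}" for i, text in enumerate(candidates))
    return (
        "The following instructions have no responses yet.\n"
        f"{numbered}\n"
        "Reply with the number of the instruction most useful to annotate."
    )


def select_instructions(
    instructions: List[str],
    embeddings: List[Vector],
    k: int,
    ask_llm: Callable[[str], str],
) -> List[str]:
    """Step 1: cluster the instructions. Step 2: let the LLM pick one per cluster."""
    centers = k_center_greedy(embeddings, k)
    clusters = assign_clusters(embeddings, centers)
    selected = []
    for members in clusters.values():
        candidates = [instructions[i] for i in members]
        reply = ask_llm(build_prompt(candidates))
        try:
            choice = int(reply.strip()) - 1  # assumption: reply is a bare 1-based number
        except ValueError:
            choice = 0  # fall back to the first candidate on an unparsable reply
        if not 0 <= choice < len(candidates):
            choice = 0
        selected.append(candidates[choice])
    return selected
```

In practice `ask_llm` would wrap a call to whichever LLM is used for selection,
and the embeddings would come from a sentence encoder; both are left abstract
here so that the sketch runs with the standard library alone.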