Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning
arXiv (2024)
Abstract
While Large Language Models (LLMs) exhibit remarkable capabilities in
zero-shot and few-shot scenarios, they often require computationally
prohibitive sizes. Conversely, smaller Masked Language Models (MLMs) like BERT
and RoBERTa achieve state-of-the-art results through fine-tuning but struggle
to extend to few-shot and zero-shot settings due to their architectural
constraints. Hence, we propose Statement-Tuning, a technique that models
discriminative tasks as a set of finite statements and trains an Encoder model
to discriminate between the potential statements to determine the label. We
apply Statement-Tuning across multiple tasks to enable cross-task generalization.
Experimental results demonstrate that Statement-Tuning achieves competitive
performance compared to state-of-the-art LLMs with significantly fewer
parameters. Moreover, the study investigates the impact of several design
choices on few-shot and zero-shot generalization, revealing that
Statement-Tuning can achieve sufficient performance with modest training data
and benefits from task and statement diversity when generalizing to unseen tasks.
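
The core idea, turning one labeled example into a finite set of statements and picking the label whose statement scores highest, can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration only: the template wording, the function names, and `toy_score`, which is a hand-written stand-in for a tuned encoder's "this statement is true" probability. None of it is taken from the paper's implementation.

```python
from typing import Callable, List, Tuple

# Hypothetical template (not from the paper): one statement per candidate label.
SENTIMENT_TEMPLATE = 'The review "{text}" expresses a {label} sentiment.'


def build_statements(text: str, labels: List[str],
                     template: str = SENTIMENT_TEMPLATE) -> List[str]:
    """Verbalize an input as one natural-language statement per label."""
    return [template.format(text=text, label=label) for label in labels]


def training_pairs(text: str, labels: List[str], gold: str,
                   template: str = SENTIMENT_TEMPLATE) -> List[Tuple[str, int]]:
    """Binary training data: the gold label's statement is 1 (true), others 0."""
    statements = build_statements(text, labels, template)
    return [(s, int(label == gold)) for s, label in zip(statements, labels)]


def predict(text: str, labels: List[str],
            score_true: Callable[[str], float],
            template: str = SENTIMENT_TEMPLATE) -> str:
    """Return the label whose statement the scorer rates most likely true."""
    statements = build_statements(text, labels, template)
    scores = [score_true(s) for s in statements]
    best = max(range(len(labels)), key=scores.__getitem__)
    return labels[best]


def toy_score(statement: str) -> float:
    """Stand-in for an encoder classifier; a real system would use a tuned model."""
    cues = {"great": "positive", "awful": "negative"}
    for cue, label in cues.items():
        if cue in statement and f"a {label} sentiment" in statement:
            return 0.9
    return 0.1


if __name__ == "__main__":
    labels = ["positive", "negative"]
    for pair in training_pairs("great film", labels, gold="positive"):
        print(pair)
    print(predict("awful film", labels, toy_score))
```

In this framing, any discriminative task with a finite label set reduces to the same binary true/false decision over statements, which is what allows a single encoder to be trained on many tasks at once and then queried on a new one by writing a new template.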