Low-Resource NER by Data Augmentation With Prompting

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence(2022)

引用 0|浏览2
暂无评分
摘要
Named entity recognition (NER) is a fundamental information extraction task that seeks to identify entity mentions of certain types in text. Despite numerous advances, the existing NER methods rely on extensive supervision for model training, which struggle in a low-resource scenario with limited training data. In this paper, we propose a new data augmentation method for low-resource NER, by eliciting knowledge from BERT with prompting strategies. Particularly, we devise a label-conditioned word replacement strategy that can produce more label-consistent examples by capturing the underlying word-label dependencies, and a prompting with question answering method to generate new training data from unlabeled texts. The experimental results have widely confirmed the effectiveness of our approach. Particularly, in a low-resource scenario with only 150 training sentences, our approach outperforms previous methods without data augmentation by over 40% in F1 and prior best data augmentation methods by over 2.0% in F1. Furthermore, our approach also fits with a zero-shot scenario, yielding promising results without using any human-labeled data for the task.
更多
查看译文
关键词
data augmentation,prompting,low-resource
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要