Active Entity Recognition in Low Resource Settings

Proceedings of the 28th ACM International Conference on Information and Knowledge Management (2019)

Abstract
The task of Named Entity Recognition (NER) has been well studied under high-resource conditions (e.g., extracting named mentions of PERSON, ORGANIZATION and LOCATION from news articles). However, there are very few studies of the NER task for open-domain collections and in low-resource settings. We focus on NER for low-resource collections, in which any entity type of practical interest to the users of the system must be supported. We aim to achieve this at a low annotation cost for data from the target domain or collection. We propose an entity recognition framework that combines active learning with conditional random fields (CRF) and provides the flexibility to define new entity types as needed by the users. Our experiments on a help & support corpus show that the system can achieve an F1 measure of 0.77 while relying on only 100 manually annotated sentences.
Keywords
active learning, low resource settings, named entity recognition, open domain
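
The abstract describes combining active learning with a sequence tagger so that only a small number of sentences need manual annotation. The sketch below illustrates one common way such a loop can be organized: pool-based active learning with least-confidence sampling. It is not the authors' implementation. The abstract does not state which query strategy the paper uses, so least-confidence sampling is an assumption, and a toy per-token frequency tagger stands in for the CRF to keep the example dependency-free. The tag set, the function names, and the `oracle` callback (which plays the role of the human annotator) are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical tag set; the paper lets users define their own entity types.
TAGS = ["O", "PRODUCT"]


def train(labeled):
    """Toy stand-in for CRF training: count how often each word carries each tag."""
    counts = defaultdict(Counter)
    for tokens, tags in labeled:
        for token, tag in zip(tokens, tags):
            counts[token.lower()][tag] += 1
    return counts


def token_confidence(counts, token):
    """Probability of the most likely tag for a token, with add-one smoothing."""
    tag_counts = counts.get(token.lower(), Counter())
    total = sum(tag_counts.values()) + len(TAGS)
    return max((tag_counts[tag] + 1) / total for tag in TAGS)


def sentence_confidence(counts, tokens):
    """Confidence of the best tagging, treating tokens as independent."""
    return math.prod(token_confidence(counts, token) for token in tokens)


def select_queries(counts, pool, k):
    """Least-confidence sampling: pick the k sentences the model is least sure about."""
    return sorted(pool, key=lambda sent: sentence_confidence(counts, sent))[:k]


def active_learning(seed, pool, oracle, rounds, k):
    """Alternate between training and asking the annotator to label k sentences."""
    labeled, pool = list(seed), list(pool)
    for _ in range(rounds):
        if not pool:
            break
        counts = train(labeled)
        for sent in select_queries(counts, pool, k):
            labeled.append((sent, oracle(sent)))  # human annotation step
            pool.remove(sent)
    return train(labeled), labeled
```

In a closer approximation of the framework, `train` would fit a CRF and `sentence_confidence` would use the probability the CRF assigns to its best label sequence; the selection loop itself would stay the same.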