CMNEE: A Large-Scale Document-Level Event Extraction Dataset based on Open-Source Chinese Military News
International Conference on Computational Linguistics (2024)
Abstract
Extracting structured event knowledge, including event triggers and
corresponding arguments, from military texts is fundamental to many
applications, such as intelligence analysis and decision assistance. However,
event extraction in the military field suffers from data scarcity, which
impedes research on event extraction models in this domain. To alleviate
this problem, we propose CMNEE, a large-scale, document-level open-source
Chinese Military News Event Extraction dataset. It contains 17,000 documents
and 29,223 events, which are all manually annotated based on a pre-defined
schema for the military domain including 8 event types and 11 argument role
types. We designed a two-stage, multi-turn annotation strategy to ensure the
quality of CMNEE and reproduced several state-of-the-art event extraction
models with a systematic evaluation. The experimental results on CMNEE are
markedly lower than those on datasets from other domains, which demonstrates that
event extraction for the military domain poses unique challenges and requires
further research efforts. Our code and data can be obtained from
https://github.com/Mzzzhu/CMNEE.
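
To make the notion of a document-level event record concrete, the following is a minimal Python sketch of how a document with annotated triggers and arguments might be represented. The class names, field names, and the placeholder values "ExampleType" and "ExampleRole" are assumptions for illustration only; they are not taken from the CMNEE schema or its released data format. The last lines compute the average number of events per document from the figures stated in the abstract.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Argument:
    role: str  # an argument role type (hypothetical name, not from the CMNEE schema)
    text: str  # the argument span as it appears in the document


@dataclass
class Event:
    event_type: str  # an event type (hypothetical name, not from the CMNEE schema)
    trigger: str     # the trigger word or phrase
    arguments: List[Argument] = field(default_factory=list)


@dataclass
class Document:
    doc_id: str
    text: str
    events: List[Event] = field(default_factory=list)  # a document may hold several events


# Placeholder record; all values are illustrative, not real annotations.
doc = Document(
    doc_id="example-0001",
    text="<document text>",
    events=[Event("ExampleType", "<trigger>", [Argument("ExampleRole", "<span>")])],
)

# Average events per document, using the counts given in the abstract.
NUM_DOCUMENTS = 17_000
NUM_EVENTS = 29_223
events_per_doc = NUM_EVENTS / NUM_DOCUMENTS
print(round(events_per_doc, 2))
```

The actual file layout in the repository may differ; the sketch only mirrors the structure the abstract describes (documents containing events, each with a trigger and role-labeled arguments).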