Chinese relation extraction in military field based on multi-grained lattice transformer and imbalanced data classification

International Journal of Modeling, Simulation, and Scientific Computing (2023)

Abstract
Relation extraction (RE), which aims to extract meaningful relations between entity pairs in plain text, is a crucial step in knowledge graph construction. Few studies have addressed Chinese relation extraction (CRE) in the military field. Moreover, recent deep neural network-based methods achieve considerable performance but still suffer from three inherent limitations: overlapping entities, imbalanced data, and ambiguity. This work therefore proposes a novel Multi-Grained Lattice Transformer (MGLT) that leverages external lexicon and word-sense information tailored for CRE. In MGLT, self-matched lexicon words and their related word senses are fused through a cross-transformer mechanism to alleviate ambiguity in text. The resulting enriched sequence representation captures the relatedness between the head entity and the tail entity, which helps alleviate the entity-overlap problem. Experimental results on two benchmark datasets and a self-developed dataset constructed from online military news show that MGLT achieves state-of-the-art (SOTA) performance. Compared with other typical baselines, MGLT improves area under the curve (AUC) and F1-score by up to 10.46% and 6.90%, respectively. We further demonstrate that ensemble learning can exploit complementary information from multiple MGLT-based base learners to improve overall performance for imbalanced data classification on the military dataset. These results indicate that the proposed ensemble learning model is effective and robust enough for practical applications.
Keywords
Chinese relation extraction, military field, classification, multi-grained
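
The abstract mentions "self-matched lexicon words" without showing what they look like. The sketch below is one common way to obtain them for a character-level Chinese sentence: enumerate every substring, keep those found in a lexicon, and attach each matched word to the characters it covers. It is an illustration under stated assumptions, not the paper's implementation. The function names, the toy lexicon, the example sentence, and the `max_len` cutoff are all hypothetical, and the word-sense lookup and cross-transformer fusion described in the abstract are not modelled here.

```python
from typing import Dict, List, Set, Tuple


def match_lexicon_words(
    sentence: str, lexicon: Set[str], max_len: int = 4
) -> List[Tuple[int, int, str]]:
    """Return (start, end, word) for every multi-character lexicon word
    found in the sentence. `end` is exclusive. `max_len` is an assumed
    cutoff on word length, not a value taken from the paper."""
    spans = []
    for start in range(len(sentence)):
        for length in range(2, max_len + 1):
            end = start + length
            if end > len(sentence):
                break
            word = sentence[start:end]
            if word in lexicon:
                spans.append((start, end, word))
    return spans


def self_matched_words(
    sentence: str, spans: List[Tuple[int, int, str]]
) -> Dict[int, List[str]]:
    """Map each character index to the lexicon words that cover it."""
    per_char: Dict[int, List[str]] = {i: [] for i in range(len(sentence))}
    for start, end, word in spans:
        for i in range(start, end):
            per_char[i].append(word)
    return per_char


# Hypothetical toy lexicon and sentence, chosen because the segmentation
# is ambiguous ("南京市 / 长江大桥" versus "南京 / 市长 / 江大桥").
toy_lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}
sentence = "南京市长江大桥"

spans = match_lexicon_words(sentence, toy_lexicon)
per_char = self_matched_words(sentence, spans)

for index, char in enumerate(sentence):
    print(index, char, per_char[index])
```

In this toy example the character "长" (index 3) is covered by "市长", "长江", and "长江大桥" at once, which is the kind of ambiguity that lexicon and word-sense information is meant to help resolve.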