CEntRE: A paragraph-level Chinese dataset for Relation Extraction among Enterprises

arXiv (2023)

Abstract
Enterprise relation extraction aims to detect pairs of enterprise entities and identify the business relations between them in unstructured or semi-structured text. It is crucial for real-world applications such as risk analysis, rating research, and supply chain security. However, previous work has focused mainly on extracting attribute information about enterprises, such as personnel and corporate business, and has paid little attention to enterprise relation extraction. To encourage further research, we introduce CEntRE, a new dataset constructed from publicly available business news data with careful human annotation and intelligent data processing. We also propose a joint entity and relation extraction network that discovers enterprise entities and accurately extracts the business relations between them. The network first encodes input sequences with strong semantic augmentation to learn a contextual representation for each token; a conditional random field (CRF) module then performs entity extraction. Next, entity pairs are built, and a new encoder based on these pairs captures global information for relation extraction. Finally, a biaffine classifier classifies the relations. Extensive experiments on CEntRE demonstrate the effectiveness of the proposed method compared with six other strong models, so our model can serve as a strong baseline. The data and code are available at: https://github.com/LiuPeiP-CStMining_Entity_Relations_Among_Enterprises
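
To make the final classification step concrete, here is a minimal sketch of a generic biaffine scorer applied to every ordered entity pair. It is not the authors' implementation. The function names, the toy weights, and the relation labels `no_relation` and `supplier_of` are assumptions chosen for illustration, and the entity vectors stand in for whatever the pair encoder would produce.

```python
from itertools import permutations


def biaffine_scores(head, tail, U, W, b):
    """Score each relation label for one (head, tail) entity pair.

    score[r] = head^T U[r] tail + W[r] . [head; tail] + b[r]
    """
    concat = head + tail  # list concatenation, i.e. [head; tail]
    scores = []
    for r in range(len(b)):
        # Bilinear term: interaction between head and tail features.
        bilinear = sum(
            head[i] * U[r][i][j] * tail[j]
            for i in range(len(head))
            for j in range(len(tail))
        )
        # Linear term: contribution of each entity on its own.
        linear = sum(w * x for w, x in zip(W[r], concat))
        scores.append(bilinear + linear + b[r])
    return scores


def classify_pairs(entities, U, W, b, labels):
    """Enumerate ordered entity pairs and pick the highest-scoring label."""
    results = {}
    for (h_name, h_vec), (t_name, t_vec) in permutations(entities.items(), 2):
        scores = biaffine_scores(h_vec, t_vec, U, W, b)
        best = max(range(len(scores)), key=scores.__getitem__)
        results[(h_name, t_name)] = labels[best]
    return results


# Toy, hand-picked values (hypothetical, not from the paper):
# 2-dimensional entity vectors and 2 relation labels.
labels = ["no_relation", "supplier_of"]
U = [
    [[0.0, 0.0], [0.0, 0.0]],  # no_relation: no bilinear interaction
    [[0.0, 1.0], [0.0, 0.0]],  # supplier_of: fires on head[0] * tail[1]
]
W = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
b = [0.5, 0.0]
entities = {"A": [1.0, 0.0], "B": [0.0, 1.0]}

predictions = classify_pairs(entities, U, W, b, labels)
print(predictions)
```

Because the bilinear term is not symmetric in head and tail, the pair (A, B) and the pair (B, A) can receive different labels, which is why the sketch scores ordered pairs rather than unordered ones.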
Keywords
corporate business, enterprise entities, enterprise relation extraction, entity extraction, entity pairs, extracting business relations, intelligent data processing, joint entity, paragraph-level Chinese dataset, publicly available business news data, semi-structured text data