Multi-language Information Extraction with Text Pattern Recognition

Natural Language Computing (2021)

Abstract
Information extraction is the task of extracting metadata from text. This article proposes a new information extraction algorithm called GenerateIE, which identifies pairs of entities and relations described in a piece of text. The extracted metadata is useful in many areas, but this research focuses on news-media contexts, where it provides the gist of written articles for analytics and paraphrasing of news information. Compared with existing state-of-the-art algorithms, GenerateIE offers two benefits. First, it reports the co-referenced word as the entity instead of a pronoun such as he, she, or it, which is more useful for knowledge graphs. Second, it can be applied to multiple languages without changing the algorithm itself, apart from the underlying natural-language text parsing. GenerateIE does not significantly outperform state-of-the-art algorithms, but it offers competitive results.
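The abstract does not describe how GenerateIE works internally, so the following is only a minimal sketch of the first benefit: replacing pronoun entities in extracted triples with the words they refer to. The `Triple` type, the `PRONOUNS` set, the `resolve_pronouns` function, and the hand-written antecedent map are all hypothetical names introduced for illustration. They are not part of GenerateIE, and a real system would obtain both the triples and the antecedents from a language-specific parser.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (subject entity, relation, object entity).
Triple = Tuple[str, str, str]

# Small illustrative pronoun list; an assumption, not taken from the paper.
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}


def resolve_pronouns(triples: List[Triple], antecedents: Dict[str, str]) -> List[Triple]:
    """Replace pronoun entities with their antecedents where one is known.

    `antecedents` maps a lowercase pronoun to the entity it refers to.
    Entities that are not pronouns, or pronouns with no known antecedent,
    are returned unchanged.
    """
    def resolve(entity: str) -> str:
        key = entity.lower()
        if key in PRONOUNS and key in antecedents:
            return antecedents[key]
        return entity

    return [(resolve(subj), rel, resolve(obj)) for subj, rel, obj in triples]


# Toy input: triples as a naive extractor might emit them.
raw = [
    ("Marie Curie", "won", "the Nobel Prize"),
    ("She", "discovered", "polonium"),
]
resolved = resolve_pronouns(raw, {"she": "Marie Curie"})
print(resolved)
```

With the pronoun resolved, both triples attach to the same "Marie Curie" node, which is what makes the output easier to merge into a knowledge graph.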
Keywords
text pattern recognition, extraction, information, multi-language