Unsupervised Numerical Information Extraction via Exploiting Syntactic Structures

Electronics (2023)

Abstract
Numerical information plays an important role in various scientific, financial, social, statistical, and news contexts. Most prior studies adopt unsupervised methods that rely on complex handcrafted pattern-matching rules to extract numerical information, which can be difficult to scale to the open domain. Supervised methods, in turn, require extra time, cost, and knowledge to design, understand, and annotate the training data. To address these limitations, we propose QuantityIE, a novel approach to extracting numerical information as structured representations by exploiting syntactic features from both constituency parsing (CP) and dependency parsing (DP). The extraction results may also serve as distant supervision for zero-shot model training. Our approach outperforms existing methods in two respects: (1) the rules are simple yet effective, and (2) the results are more self-contained. We further propose a numerical information retrieval approach based on QuantityIE to answer analytical queries. Experimental results on information extraction and retrieval demonstrate the effectiveness of QuantityIE in extracting numerical information with high fidelity.
Keywords
numerical information, information extraction, syntactic parsing