Hierarchical Clause Annotation: Building a Clause-Level Corpus for Semantic Parsing with Complex Sentences

APPLIED SCIENCES-BASEL(2023)

引用 0|浏览9
暂无评分
摘要
Featured Application Hierarchical clause annotation could be applied in many downstream tasks of natural language processing, including abstract meaning representation parsing, semantic dependency parsing, text summarization, argument mining, information extraction, question answering, machine translation, etc.Most natural-language-processing (NLP) tasks suffer performance degradation when encountering long complex sentences, such as semantic parsing, syntactic parsing, machine translation, and text summarization. Previous works addressed the issue with the intuition of decomposing complex sentences and linking simple ones, such as rhetorical-structure-theory (RST)-style discourse parsing, split-and-rephrase (SPRP), text simplification (TS), simple sentence decomposition (SSD), etc. However, these works are not applicable for semantic parsing such as abstract meaning representation (AMR) parsing and semantic dependency parsing due to misalignments with semantic relations and unavailabilities to preserve the original semantics. Following the same intuition and avoiding the deficiencies of previous works, we propose a novel framework, hierarchical clause annotation (HCA), for capturing clausal structures of complex sentences, based on the linguistic research of clause hierarchy. With the HCA framework, we annotated a large HCA corpus to explore the potentialities of integrating HCA structural features into semantic parsing with complex sentences. Moreover, we decomposed HCA into two subtasks, i.e., clause segmentation and clause parsing, and provide neural baseline models for more-silver annotations. In evaluating the proposed models on our manually annotated HCA dataset, the performances of clause segmentation and parsing resulted in 91.3% F1-scores and 88.5% Parseval scores, respectively. Due to the same model architectures employed, the performance differences of the clause/discourse segmentation and parsing subtasks was reflected in our HCA corpus and compared discourse corpora, where our sentences contained more segment units and fewer interrelations than those in the compared corpora.
更多
查看译文
关键词
clause hierarchy, hierarchical clause annotation, complex sentences, syntactic parsing, semantic parsing, RST corpus
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要