Modeling Unified Semantic Discourse Structure for High-quality Headline Generation
CoRR (2024)
Abstract
Headline generation aims to summarize a long document with a short, catchy
title that reflects the main idea. This requires accurately capturing the core
document semantics, which is challenging because the texts are lengthy and rich
in background information. In this work, we propose using a unified
semantic discourse structure (S3) to represent document semantics, achieved by
combining document-level rhetorical structure theory (RST) trees with
sentence-level abstract meaning representation (AMR) graphs to construct S3
graphs. The hierarchical composition of sentence, clause, and word
intrinsically characterizes the semantic meaning of the overall document. We
then develop a headline generation framework, in which the S3 graphs are
encoded as contextual features. To strengthen the efficacy of S3 graphs, we
further devise a hierarchical structure pruning mechanism that dynamically
screens out redundant and nonessential nodes within the graph. Experimental results on
two headline generation datasets demonstrate that our method outperforms
existing state-of-the-art methods consistently. Our work can be instructive for a
broad range of document modeling tasks beyond headline generation and
summarization.
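
As a rough illustration of the idea described above, the following toy sketch attaches sentence-level concept nodes (standing in for AMR graphs) under a single discourse-level node (standing in for an RST relation) and then prunes low-scoring nodes top-down. It is not the authors' implementation: the `Node` class, the `build_s3`, `prune`, and `names` helpers, the salience scores, and the fixed threshold are all assumptions made for illustration. The abstract describes a dynamic pruning mechanism, and a real RST tree is nested rather than a single relation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    level: str                      # hypothetical labels: "discourse", "sentence", "concept"
    score: float = 1.0              # toy salience value, assumed to be given
    children: list = field(default_factory=list)


def build_s3(relation, sentences):
    """Attach each sentence's concept nodes under one discourse-level root."""
    root = Node(relation, "discourse")
    for sent_id, (sent_score, concepts) in sentences.items():
        sent = Node(sent_id, "sentence", sent_score)
        sent.children = [Node(c, "concept", s) for c, s in concepts]
        root.children.append(sent)
    return root


def prune(node, threshold):
    """Drop children scoring below the threshold, top-down.

    A dropped sentence node takes its whole concept subtree with it.
    """
    node.children = [prune(c, threshold) for c in node.children
                     if c.score >= threshold]
    return node


def names(node):
    """Return node names in depth-first order."""
    return [node.name] + [n for c in node.children for n in names(c)]


# Toy input: sentence id -> (sentence score, [(concept, concept score), ...])
sentences = {
    "s1": (0.9, [("announce-01", 0.8), ("company", 0.7), ("yesterday", 0.2)]),
    "s2": (0.3, [("found-01", 0.6), ("year", 0.1)]),
}
graph = prune(build_s3("Elaboration", sentences), threshold=0.5)
print(names(graph))  # ['Elaboration', 's1', 'announce-01', 'company']
```

In this sketch the low-scoring sentence `s2` is removed together with its concepts, while within the retained sentence `s1` only the low-scoring concept is removed, which is the sense in which the pruning is hierarchical.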