Improving extractive summarization with semantic enhancement through topic-injection based BERT model

Yiming Wang, Jindong Zhang, Zhiyao Yang, Bing Wang, Jingyi Jin, Yitong Liu

Information Processing & Management (2024)

Abstract
In the field of text summarization, extractive techniques aim to extract key sentences from a document to form a summary. However, traditional methods are not sensitive enough to capture the core semantics of the text, resulting in summaries that are difficult to comprehend. Recent topic extraction techniques capture core semantics from text, enabling accurate summaries of a document's main points. In this paper, we introduce the Topic-Injected Bidirectional Encoder Representations from Transformers (TP-BERT), a novel neural auto-encoder model designed explicitly for extractive summarization. TP-BERT integrates document-related topic words into sentences, improving contextual understanding and aligning summaries more accurately with a document's main theme, which addresses a key shortfall of traditional extractive methods. Another major innovation of TP-BERT is the use of contrastive learning during training. This method enhances summarization efficiency by giving prominence to key sentences and minimizing peripheral information. Additionally, we conducted ablation and parameter studies of TP-BERT on the CNN/DailyMail, WikiHow, and XSum datasets. In our two main experiments, the average ROUGE-F1 score improved by 2.69 and 0.45 across the three datasets. Compared with baseline methods, TP-BERT achieves higher ROUGE-F1 scores on the three datasets. Moreover, the semantic differentiation between sentence representations also contributed positively to the performance gains.
Key words
Extractive summarization, Topic model, Transformer, Information fusion
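
The abstract reports its gains as ROUGE-F1 scores. As a rough illustration of what that metric measures, the sketch below computes a unigram (ROUGE-1) F1 between a candidate summary and a reference. It is a simplified stand-in, not the evaluation code used in the paper: the function name `rouge1_f1` is hypothetical, tokenization is plain lowercase whitespace splitting, and no stemming or stopword handling is applied, so its values will differ from those of the official ROUGE toolkit.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference summary.

    Assumption: tokens are lowercase whitespace-separated words. Official
    ROUGE implementations use their own tokenization and optional stemming.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each word counts at most as often as it appears
    # in both texts (multiset intersection).
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "the model injects topic words into sentences"
    candidate = "topic words are injected into sentences"
    print(f"ROUGE-1 F1: {rouge1_f1(candidate, reference):.3f}")
```

The function returns a value in [0, 1]. Summarization papers commonly report ROUGE scaled by 100, so an improvement such as 2.69 would, under that convention, correspond to 0.0269 on this scale; the abstract does not state which scale it uses.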