Extractive Text Summarization Using Formality of Language

Research Square (2023)

Abstract
Automatic text summarization is a research problem more than a decade old: producing a summary from a larger descriptive text. The search for the best approach is still going on. Much research has been done, and new techniques continue to emerge for summarizing text documents. Many unsupervised methods do not do justice to highly inflected languages. We aim to provide a technique that does justice to morphologically complex languages. We present a novel method, based on the formality of language, to generate a summary. We use a previously published formality formula to measure the formality of language, customized to suit the Indo-Iranian language family. We show that our redefined formula yields a formality score 7.28% higher for formal language than for informal language, and we ran a statistical significance test to confirm the result. We apply this formula to measure whether a sentence is formal or informal. Combining our customized formula with other measures, such as the Shannon entropy score and the presence of numeric values, we generate a score for each sentence. Based on these scores, we generate a text summary. We evaluated the formula on a Gujarati corpus that we generated and collected the results. We compared our technique with similar methods at ideal compression ratios such as 20% and 30%. Compared to the other methods, we obtained the highest accuracy, with an increase of 14.63% in ROUGE-1 at a 30% compression ratio and an increase of 28.60% in ROUGE-1 at a 20% compression ratio. Detailed experimental values and all results are provided for further study and research.
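The sentence-scoring idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the customized formality formula, the weighting of the measures, or the tokenization, so all of these are assumptions: `formality` is a hypothetical caller-supplied function, the three measures are combined as a weighted sum with hypothetical weights, tokens are split on whitespace, and the compression ratio is read as the fraction of sentences kept.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def shannon_entropy(tokens: List[str]) -> float:
    """Shannon entropy (in bits) of the token distribution in one sentence."""
    if not tokens:
        return 0.0
    total = len(tokens)
    counts = Counter(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def has_number(sentence: str) -> bool:
    """True if the sentence contains any digit (Unicode digits included)."""
    return any(ch.isdigit() for ch in sentence)


def summarize(
    sentences: List[str],
    formality: Callable[[str], float],  # hypothetical: stands in for the paper's formula
    ratio: float = 0.3,                 # assumed to mean "fraction of sentences kept"
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),  # assumed weights
) -> List[str]:
    """Score each sentence and keep the top-scoring ones in original order."""
    w_formal, w_entropy, w_numeric = weights
    scored = []
    for idx, sentence in enumerate(sentences):
        tokens = sentence.split()  # assumption: whitespace tokenization
        score = (
            w_formal * formality(sentence)
            + w_entropy * shannon_entropy(tokens)
            + w_numeric * (1.0 if has_number(sentence) else 0.0)
        )
        scored.append((score, idx))
    if not scored:
        return []
    keep = max(1, round(len(sentences) * ratio))
    # Highest score first; ties broken by earlier position in the document.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:keep]
    return [sentences[i] for i in sorted(i for _, i in top)]
```

Calling `summarize(sentences, formality=my_formality, ratio=0.2)` would keep roughly one fifth of the sentences. Returning the selected sentences in their original order is a design choice that keeps the extract readable. It is not something the abstract specifies.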