Automatic Text Summarization for Malay News Documents Using Latent Dirichlet Allocation and Sentence Selection Algorithm

Nurazzah Abd Rahman, Siti Nur Afiqah Ramlam, Natasha Aleza Azhar, Haslizatul Mohamed Hanum, Noor Ida Ramli, Najahudin Lateh

2021 Fifth International Conference on Information Retrieval and Knowledge Management (CAMP) (2021)

Abstract
The proliferation of online newspapers has created a need for automatic text summarization that produces a summary containing most of the important information in the original document. This study focuses on keyword extraction using Latent Dirichlet Allocation (LDA) and on a Sentence Selection algorithm that uses a rule-based approach to produce an extractive summary. To evaluate the effectiveness of the system, 100 Malay news documents covering general, sports, health and technology topics were collected from Utusan Online. The study used a single LDA topic and took the top 10 words of the selected topic as the keywords. For evaluation, the summary generated by the system was compared with a summary produced by a human expert using precision and recall. The best score achieved was 62.7%, which suggests that the generated summaries can help readers grasp the important parts of a Malay news document in a short time without reading the whole document.
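The pipeline described in the abstract (topic keywords, then sentence selection, then precision and recall against a human summary) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not state the actual selection rules, so a simple keyword-overlap rule stands in for them; the keyword list is hand-written rather than produced by LDA; the Malay sample text and the "human" reference are invented; and the function names `split_sentences`, `select_sentences` and `precision_recall` are hypothetical.

```python
import re

def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def select_sentences(text, keywords, max_sentences=2):
    """Assumed stand-in rule: keep the sentences sharing the most words with
    the keyword list, then return them in their original document order."""
    keyword_set = {k.lower() for k in keywords}
    sentences = split_sentences(text)
    scored = []
    for index, sentence in enumerate(sentences):
        tokens = set(re.findall(r"\w+", sentence.lower()))
        score = len(tokens & keyword_set)
        if score > 0:  # sentences with no keyword are never selected
            scored.append((score, index))
    # Highest score first; ties go to the earlier sentence.
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    chosen = sorted(index for _, index in ranked[:max_sentences])
    return [sentences[i] for i in chosen]

def precision_recall(system_summary, reference_summary):
    """Sentence-level precision and recall of a system summary against a reference."""
    system_set = set(system_summary)
    reference_set = set(reference_summary)
    overlap = len(system_set & reference_set)
    precision = overlap / len(system_set) if system_set else 0.0
    recall = overlap / len(reference_set) if reference_set else 0.0
    return precision, recall

# Invented sample document (not from the study's Utusan Online corpus).
text = (
    "Pasukan bola sepak negara menang perlawanan akhir malam tadi. "
    "Cuaca di stadium agak panas. "
    "Jurulatih memuji pemain selepas perlawanan itu."
)
# Hand-written keywords standing in for the top words of one LDA topic.
keywords = ["pasukan", "perlawanan", "pemain", "jurulatih"]

summary = select_sentences(text, keywords, max_sentences=2)

# Invented "human expert" reference: the first two sentences of the document.
all_sentences = split_sentences(text)
reference = [all_sentences[0], all_sentences[1]]
precision, recall = precision_recall(summary, reference)
```

In this toy case the first and third sentences are selected, and one of the two overlaps with the invented reference, so precision and recall are both 0.5. In a fuller version the `keywords` list would come from a trained LDA model instead of being written by hand.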
Key words
Information Retrieval, Text Summarization, Topic Modelling, LDA, Sentence Selection Algorithm, Malay Document Retrieval