An Approach to Cluster Tamil Literatures Using Discourse Connectives

2019 IEEE 1st International Conference on Energy, Systems and Information Processing (ICESIP) (2019)

Abstract
Tamil literary works contain much valuable information that can guide even modern society. They are a treasure that preserves historical values and offers guidance on almost every aspect of life, such as traditions, love, war, governance, and trade. Nowadays, many Tamil literary works have been translated into various world languages. Making this information available to the global society through the World Wide Web (WWW) requires computational analysis of Tamil literary works. Clustering Tamil literary works is one such analysis, and it makes the works accessible to search engines. This paper proposes an approach to cluster Tamil literature using K-Means clustering with discourse connectives as the predominant features. Discourse connectives are words that connect two coherent clauses, phrases, or sentences. Identifying discourse connectives signals the semantic relation between text fragments. This is explored in clustering semantically similar couplets of the Thirukkural, a famous Tamil literary work. These clusters of Thirukkural couplets can form a basis for identifying efficient indices for many Natural Language Processing (NLP) applications, such as information retrieval systems, summary generation systems, and question answering systems. The proposed work is tested on all 1330 Thirukkural couplets, and 1050 couplets are found to be correctly clustered into five clusters. The approach is evaluated using the cluster purity metric and achieves a purity of 0.79, an overall Rand index of 0.92, precision of 0.79, recall of 0.8, and F-score of 0.79.
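As an illustration of the two cluster-level metrics named in the abstract, the following is a minimal sketch of purity and the Rand index in plain Python. The function names, the toy labels, and the way the data is laid out are assumptions made for this example and are not taken from the paper; the only figures reused from the abstract are the counts 1050 and 1330, whose ratio is consistent with the reported purity of 0.79.

```python
from collections import Counter
from itertools import combinations


def purity(labels_true, labels_pred):
    """Fraction of items that belong to the majority true class of their cluster."""
    per_cluster = {}
    for true_label, cluster_id in zip(labels_true, labels_pred):
        per_cluster.setdefault(cluster_id, Counter())[true_label] += 1
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(labels_true)


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs on which the clustering and the reference agree."""
    agreements = 0
    pairs = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        agreements += int(same_true == same_pred)
        pairs += 1
    return agreements / pairs


# Hypothetical toy labels (not data from the paper).
truth = ["a", "a", "a", "b", "b", "c"]
pred = [0, 0, 1, 1, 1, 2]

toy_purity = purity(truth, pred)
toy_rand = rand_index(truth, pred)

# Consistency check on the abstract's counts: 1050 of 1330 couplets.
reported_ratio = round(1050 / 1330, 2)
```

In the toy case, the three clusters contribute 2, 2, and 1 majority items, so purity is 5/6, and 11 of the 15 item pairs agree, so the Rand index is 11/15.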
Key words
Discourse connectives, K-Means Clustering, Tamil literature