Genetic algorithm rule based categorization method for textual data mining

Decision Science Letters(2020)

Abstract
Rule-based categorization approaches such as associative classification can produce classifiers that rival those learned by traditional categorization approaches such as Naive Bayes and K-nearest Neighbor. However, discovering useful categorization rules and applying them effectively remain the major challenges of rule-based approaches, and their performance declines with large rule sets. A Genetic Algorithm (GA) is effective at reducing high dimensionality and improving categorization performance. However, in most research the use of GA is limited to the categorization preprocessing stage, and its results are used to simplify the categorization process performed by other categorization algorithms. This paper proposes a hybrid GA rule-based categorization method, named genetic algorithm rule based categorization (GARC), to enhance the accuracy of categorization rules and to produce an accurate classifier for text mining. GARC consists of three main stages: search space determination, rule discovery with validation (rule generation), and categorization. Experiments are carried out on three Arabic text datasets with multiple categories to evaluate the efficiency of GARC. The results show that GARC achieves promising performance for Arabic text categorization, and that in most situations it achieves its best performance with a small feature space. © 2020 by the authors; licensee Growing Science, Canada.
Keywords
Rule based categorization, Text categorization, Genetic Algorithm, Rule discovery, Categorization rule, Associative classification