ISCA: An Improved Sine Cosine Algorithm to select features for text categorization

Journal of King Saud University - Computer and Information Sciences (2020)

Abstract
The bag-of-words model is commonly used for text categorization. Its main drawback is the large number of features involved, which degrades categorization performance. Feature selection is therefore necessary: by reducing the dimensionality of the problem, it lowers computational time and improves the performance of the categorization task. In this paper, we propose an improved version of the original Sine Cosine Algorithm (SCA) for feature selection that explores the search space more effectively. Unlike SCA, which relies only on the best solution to generate a new solution, the proposed algorithm (ISCA) takes two positions into account: (i) the position of the best solution found so far, and (ii) a random position from the search space. This combination yields a simple algorithm that avoids premature convergence and achieves very satisfactory performance. To validate ISCA, we carried out a series of experiments on nine text collections and compared the results with several search algorithms, including the original SCA, some of its improved versions, and the Moth-Flame Optimization (MFO) algorithm. From the state of the art, the Genetic Algorithm (GA) and Ant Colony Optimization (ACO) are also included in the comparative study. The evaluation results demonstrate the high performance of the proposed ISCA, which makes it well suited to text categorization problems.
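To make the contrast between the two update rules concrete, here is a minimal Python sketch. The function `sca_step` follows the commonly published SCA position update, which moves a candidate toward the best solution along a sine or cosine wave. The abstract does not give the ISCA equation, so `isca_like_step` is only an assumed illustration of "best position plus a random position"; its additive form, the bounds `lower` and `upper`, and the 0.5 threshold in `to_feature_mask` are hypothetical choices, not the authors' method.

```python
import math
import random


def sca_step(x, best, t, max_iter, a=2.0, rng=random):
    """One standard SCA update of candidate x toward the best solution."""
    r1 = a * (1.0 - t / max_iter)  # step amplitude, decays linearly from a to 0
    new_x = []
    for xj, pj in zip(x, best):
        r2 = rng.uniform(0.0, 2.0 * math.pi)
        r3 = rng.uniform(0.0, 2.0)
        wave = math.sin(r2) if rng.random() < 0.5 else math.cos(r2)
        new_x.append(xj + r1 * wave * abs(r3 * pj - xj))
    return new_x


def isca_like_step(x, best, t, max_iter, lower=0.0, upper=1.0, a=2.0, rng=random):
    """ASSUMED variant: add a second term driven by a random search-space point."""
    r1 = a * (1.0 - t / max_iter)
    new_x = []
    for xj, pj in zip(x, best):
        qj = rng.uniform(lower, upper)  # random position (hypothetical choice)
        r2 = rng.uniform(0.0, 2.0 * math.pi)
        r3 = rng.uniform(0.0, 2.0)
        wave = math.sin(r2) if rng.random() < 0.5 else math.cos(r2)
        toward_best = r1 * wave * abs(r3 * pj - xj)
        toward_rand = r1 * wave * abs(r3 * qj - xj)
        moved = xj + toward_best + toward_rand
        new_x.append(min(max(moved, lower), upper))  # keep inside the bounds
    return new_x


def to_feature_mask(x, threshold=0.5):
    """ASSUMED binarization: a feature is selected when its position > threshold."""
    return [xj > threshold for xj in x]


if __name__ == "__main__":
    random.seed(0)
    candidate = [0.2, 0.8, 0.5, 0.9]
    best_so_far = [1.0, 0.0, 1.0, 1.0]
    moved = isca_like_step(candidate, best_so_far, t=1, max_iter=10)
    print(to_feature_mask(moved))
```

In a wrapper setting such as the one named in the keywords, the resulting mask would be scored by training a classifier on the selected features only; that evaluation loop is omitted here.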
Keywords
Text categorization, Information gain, Feature subset selection, Wrapper methods, Improved sine cosine algorithm