Comparing the Wrapper Feature Selection Evaluators on Twitter Sentiment Classification

N K Suchetha, Anupama Nikhil, P Hrudya

2019 International Conference on Computational Intelligence in Data Science (ICCIDS) (2019)

Abstract
Applying machine learning algorithms to text data is challenging in several ways, the greatest being the sparse, high-dimensional feature set. Feature selection methods reduce the dimensionality of the data and help improve both the computational efficiency and the performance of the learned model. Recently, evolutionary computation (EC) methods have shown success in solving the feature selection problem. However, because they require a large number of evaluations, EC-based feature selection methods are computationally expensive on text data. This paper examines the different evaluation classifiers used in EC-based wrapper feature selection methods. A two-stage feature selection method is applied to Twitter data for sentiment classification. In the first stage, a filter feature selection method based on Information Gain (IG) is applied. In the second stage, four EC feature selection methods, Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO), Cuckoo Search (CS) and Firefly Search, are compared, with different classifiers as subset evaluators. LibLinear, K-Nearest Neighbours (KNN) and Naive Bayes (NB) are the classifiers used for wrapper feature subset evaluation. The time each chosen classifier requires to evaluate a feature subset is also computed. Finally, the effect of this combined feature selection approach is evaluated using six different learners. Results demonstrate that LibLinear is computationally efficient and achieves the best performance.
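The two-stage idea in the abstract (an Information Gain filter followed by a wrapper that scores candidate subsets with a classifier) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the paper's implementation: the toy tweets in `DOCS` are hypothetical, the wrapper evaluator is a leave-one-out 1-nearest-neighbour classifier rather than LibLinear, KNN or NB as configured by the authors, and an exhaustive search over the filtered terms stands in for the PSO, ACO, Cuckoo Search and Firefly searches. The helper names (`information_gain`, `filter_stage`, `wrapper_stage`) are likewise hypothetical.

```python
import math
import time
from itertools import combinations

# Hypothetical toy tweets (not from the paper): 1 = positive, 0 = negative.
DOCS = [
    ("love this phone great battery", 1),
    ("great day love it", 1),
    ("happy with the great service", 1),
    ("hate this phone bad battery", 0),
    ("bad day hate it", 0),
    ("sad with the bad service", 0),
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs)

def information_gain(term, docs):
    """IG of a binary term-presence feature: H(Y) - H(Y | term present or absent)."""
    labels = [y for _, y in docs]
    present = [y for text, y in docs if term in text.split()]
    absent = [y for text, y in docs if term not in text.split()]
    n = len(docs)
    cond = (len(present) / n) * entropy(present) + (len(absent) / n) * entropy(absent)
    return entropy(labels) - cond

def filter_stage(docs, k):
    """Stage 1: keep the k terms with the highest information gain."""
    vocab = sorted({t for text, _ in docs for t in text.split()})
    return sorted(vocab, key=lambda t: -information_gain(t, docs))[:k]

def vectorize(text, features):
    """Binary presence vector of the selected features for one document."""
    tokens = set(text.split())
    return [1 if f in tokens else 0 for f in features]

def loo_1nn_accuracy(docs, features):
    """Wrapper evaluator: leave-one-out accuracy of 1-NN with Hamming distance."""
    X = [vectorize(text, features) for text, _ in docs]
    y = [label for _, label in docs]
    correct = 0
    for i in range(len(docs)):
        others = [j for j in range(len(docs)) if j != i]
        nearest = min(others, key=lambda j: sum(a != b for a, b in zip(X[i], X[j])))
        correct += int(y[nearest] == y[i])
    return correct / len(docs)

def wrapper_stage(docs, candidates):
    """Stage 2 stand-in: exhaustive subset search (NOT an EC method), timed."""
    best_subset, best_score, evaluations = (), -1.0, 0
    start = time.perf_counter()
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            score = loo_1nn_accuracy(docs, subset)
            evaluations += 1
            if score > best_score:  # strict '>' keeps the smallest best subset
                best_subset, best_score = subset, score
    elapsed = time.perf_counter() - start
    return best_subset, best_score, evaluations, elapsed

if __name__ == "__main__":
    top_terms = filter_stage(DOCS, k=4)
    subset, score, evaluations, elapsed = wrapper_stage(DOCS, top_terms)
    print("filtered terms:", top_terms)
    print("selected subset:", subset, "accuracy:", score)
    print(f"{evaluations} subset evaluations in {elapsed:.6f} s")
```

The sketch also shows why the choice of evaluator matters for cost: every candidate subset triggers a full train-and-test cycle of the classifier, so the per-evaluation time multiplies across all subsets the search visits. An EC search would visit far fewer subsets than the exhaustive loop used here, but it still pays that cost on each visit.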
Key words
Feature Selection, Information Gain, Evolutionary Computation, Sentiment Analysis