Late acceptance hill climbing aided chaotic harmony search for feature selection: An empirical analysis on medical data

Expert Syst. Appl. (2023)

Abstract
In today's data-driven digital society, there is a huge demand for optimized solutions that reduce the cost of operation and thereby increase productivity. Processing huge amounts of data, such as microarray-based gene expression data, with machine learning and data mining algorithms is limited by memory and time requirements. This is more concerning when a dataset contains redundant and unimportant information. For example, many report-based medical datasets have several non-informative attributes that mislead classification algorithms. To this end, researchers have developed several feature selection algorithms that try to discard redundant information from raw datasets before feeding them to machine learning algorithms. Metaheuristic-based optimization algorithms provide an excellent option for solving feature selection problems. In this paper, we propose a wrapper feature selection method based on the music-inspired harmony search (HS) algorithm. First, we use a chaotic map to initialize the population of the HS algorithm in order to achieve better coverage of the search space. Further, to complement the inferior exploitation of the HS algorithm, we integrate it with the Late Acceptance Hill Climbing (LAHC) method. The combination of these two algorithms thus provides a good balance between exploration and exploitation. We evaluate the proposed feature selection method on 15 UCI datasets, and the results are better than those of many state-of-the-art methods in terms of both classification accuracy and the number of features selected. To evaluate the effectiveness of our algorithm, we use precision, recall, F1 score, fitness value, and execution time as performance indicators. These metrics enable a comprehensive assessment of the algorithm's abilities and limitations.
We also apply our method to 3 microarray-based gene expression datasets used for cancer prediction to verify its scalability and robustness as a feature selection method in real-life scenarios. In addition, we test our approach on a COVID-19 dataset, where it performs better than several metaheuristic-based optimization techniques.
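The two components named in the abstract, chaotic initialization and LAHC refinement, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which chaotic map is used, so the logistic map is an assumption, and the harmony search improvisation step is omitted entirely. The function names, the 0.5 threshold, the history length, and `toy_fitness` (a stand-in for a real wrapper fitness that would train a classifier) are all hypothetical.

```python
import random


def logistic_map_population(pop_size, n_features, x0=0.7, r=4.0):
    """Build binary feature masks from a logistic-map sequence.

    Assumption: the logistic map x <- r*x*(1-x) stands in for the
    paper's unspecified chaotic map. Avoid x0 in {0, 0.25, 0.5, 0.75, 1},
    which collapse to a fixed point when r = 4.
    """
    x = x0
    population = []
    for _ in range(pop_size):
        mask = []
        for _ in range(n_features):
            x = r * x * (1.0 - x)             # one chaotic iteration
            mask.append(1 if x > 0.5 else 0)  # threshold to a 0/1 gene
        population.append(mask)
    return population


def toy_fitness(mask, relevant, alpha=0.99):
    """Hypothetical cost (lower is better): weighted sum of a fake
    'error' (share of relevant features left out) and the selected ratio."""
    error = sum(1 for i in relevant if mask[i] == 0) / len(relevant)
    return alpha * error + (1.0 - alpha) * sum(mask) / len(mask)


def lahc(start, fitness, history_len=20, max_iters=500, seed=0):
    """Late Acceptance Hill Climbing over binary masks (minimization)."""
    rng = random.Random(seed)
    current, cur_cost = list(start), fitness(start)
    best, best_cost = list(current), cur_cost
    history = [cur_cost] * history_len        # costs from earlier iterations
    for it in range(max_iters):
        cand = list(current)
        j = rng.randrange(len(cand))
        cand[j] = 1 - cand[j]                 # flip one feature bit
        cand_cost = fitness(cand)
        v = it % history_len
        # Accept if no worse than the current cost or the cost
        # recorded history_len iterations ago.
        if cand_cost <= cur_cost or cand_cost <= history[v]:
            current, cur_cost = cand, cand_cost
        if cur_cost < best_cost:
            best, best_cost = list(current), cur_cost
        history[v] = cur_cost
    return best, best_cost


if __name__ == "__main__":
    population = logistic_map_population(pop_size=5, n_features=12)
    relevant = [0, 3, 7]                      # hypothetical informative features
    best, cost = lahc(population[0], lambda m: toy_fitness(m, relevant))
    print(best, cost)
```

In a full method along the lines the abstract describes, the chaotic population would presumably seed the harmony memory and LAHC would refine the harmonies it produces; how the authors actually couple the two is not stated here.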
Key words
Feature selection, Microarray data, Harmony search, Late Acceptance Hill Climbing, Metaheuristics, COVID-19 data, Optimization, Algorithm