A filter-based feature selection approach in multilabel classification

Machine Learning: Science and Technology(2023)

引用 0|浏览5
暂无评分
摘要
Multi-label classification is a fast-growing field of machine learning. Recent developments have shown several applications, including social media, healthcare, bio-molecular analysis, scene, and music classification associated with the multilabel classification. In classification problems, multiple labels (multilabel or more than one class label) are assigned to an unseen record instead of a single-label class assignment. Feature selection is a preprocessing phase used to identify the most relevant features that could improve the accuracy of the multilabel classifiers. The focus of this study is the feature selection method in multilabel classification. The study used a feature selection filter method involving the Fisher score, analysis of variance test, mutual information, Chi-Square, and ensembles of these statistical methods. An extensive range of machine learning algorithms is applied in the modelling phase of a multilabel classification model that includes binary relevance, classifier chain, label powerset, binary relevance KNN, multi-label twin support vector machine, multi-label KNN. Besides, label space partitioning and majority voting of ensemble methods are used and Random Forest is the base learner. The experiments are carried out over five different multilabel benchmarking datasets. The evaluation of the classification model is measured using accuracy, precision, recall, F1 score, and hamming loss. The study demonstrated that the filter methods (i.e. mutual information) having top weighted $80\%$ to $20\%$ features provided significant outcomes.
更多
查看译文
关键词
feature selection approach,classification,filter–based
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要