Multi-objective genetic algorithm for missing data imputation

Pattern Recognition Letters (2015)

Abstract
The paper proposes a novel multi-objective genetic algorithm for data imputation, called MOGAImp. This is the first method that applies a multi-objective approach to data imputation. MOGAImp presents a good tradeoff between the evaluation measures studied. The results confirm that MOGAImp is the preferable choice when the evaluation measures conflict. The MOGAImp codification scheme makes it possible to adapt the method to different application domains.

A large number of data analysis techniques have been developed in recent years; however, most of them do not deal satisfactorily with a ubiquitous problem in the area: missing data. To mitigate the bias imposed by this problem, several treatment methods have been proposed, most notably data imputation methods, which can be viewed as an optimization problem whose goal is to reduce the bias caused by the absence of information. However, most imputation methods are restricted to one type of variable, either categorical or continuous. To fill these gaps, this paper presents a multi-objective genetic algorithm for data imputation called MOGAImp, based on NSGA-II, which is suitable for mixed-attribute datasets and takes into account information from incomplete instances and the modeling task. The algorithm was evaluated on 30 datasets with induced missing values, using five classifiers divided into three classes (rule induction learning, lazy learning, and approximate models), and was compared with three techniques from the literature. The results confirm that MOGAImp outperforms some well-established missing data treatment methods. Furthermore, the proposed method proved to be flexible, since it can be adapted to different application domains.
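To make the multi-objective idea concrete, the following minimal sketch shows Pareto dominance and the extraction of the first non-dominated front, the ranking notion that NSGA-II builds on. It is an illustration only and not the MOGAImp implementation: the function names, the two objectives, and the score values are all hypothetical, and both objectives are assumed to be minimized.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b, assuming every objective is minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def first_front(scores: List[Tuple[float, ...]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (objective_1, objective_2) scores for four candidate imputations.
# These numbers are made up for illustration; they are not results from the paper.
candidate_scores = [(0.20, 0.35), (0.25, 0.30), (0.30, 0.40), (0.18, 0.50)]
pareto_indices = first_front(candidate_scores)
```

Under these assumed scores, the third candidate is dominated by the first (it is worse on both objectives), while the remaining three each represent a different tradeoff and stay on the front. A full NSGA-II additionally ranks the later fronts and uses crowding distance to preserve diversity, which this sketch omits.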
Keywords
Missing data, Data imputation, Multi-objective evolutionary algorithm, Genetic algorithm