Cross-validation: the illusion of reliable performance estimation

Semantic Scholar (2010)

Abstract
In data mining, we often face the task of estimating model performance from training data. This estimate is supposed to express the expected performance on future, previously unseen data; it is needed both for business decisions and for the analyst to compare different models. One of the most widely used performance estimation techniques is cross-validation, which is increasingly misused. This paper describes common mistakes in using cross-validation that significantly distort the estimates, presents several numerical examples of how misleading the estimates can be, and proposes a data mining process for ensuring valid performance esti-