“Great in, great out” is the new “garbage in, garbage out”: subsampling from data with no response variable using various approaches, including unsupervised learning

2021 International Conference on Computing, Computational Modelling and Applications (ICCMA) (2021)

Abstract
When more initial data are available than needed, an appropriate subpopulation is selected from the given dataset, i.e. subsampling, usually with the aim of ensuring that the levels of all categorical variables are nearly equally covered and that all numerical variables are well balanced. If the original data have no response variable, popular propensity scoring cannot be performed. This study address...
Keywords
subsampling, exhaustive subsampling, random subsampling, unsupervised learning, clustering
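
The balancing goal stated in the abstract can be illustrated with a minimal sketch of random subsampling: draw many random subsamples and keep the one whose categorical levels are most evenly covered. This is not the authors' procedure, which the truncated abstract does not spell out. The function names, the imbalance score (largest minus smallest level count, summed over columns), and the `n_trials` and `seed` parameters are assumptions made for illustration, and the balance of numerical variables is not handled here.

```python
import random
from collections import Counter


def level_imbalance(rows, levels_by_column):
    """Hypothetical score: for each categorical column, the gap between the
    most and least frequent level in `rows`, summed over columns.
    Levels absent from `rows` count as zero. Lower means better balanced."""
    total = 0
    for col, levels in levels_by_column.items():
        counts = Counter(row[col] for row in rows)
        per_level = [counts.get(level, 0) for level in levels]
        total += max(per_level) - min(per_level)
    return total


def random_balanced_subsample(data, size, cat_columns, n_trials=1000, seed=0):
    """Draw `n_trials` random subsamples of `size` rows (without replacement)
    and return the best one found together with its imbalance score."""
    rng = random.Random(seed)
    # Levels are taken from the full dataset so that a level missing
    # from a candidate subsample is penalised rather than ignored.
    levels_by_column = {
        col: sorted({row[col] for row in data}) for col in cat_columns
    }
    best, best_score = None, None
    for _ in range(n_trials):
        candidate = rng.sample(data, size)
        score = level_imbalance(candidate, levels_by_column)
        if best_score is None or score < best_score:
            best, best_score = candidate, score
    return best, best_score
```

For example, given a list of row dictionaries `data` with a categorical key `"group"`, calling `random_balanced_subsample(data, 4, ["group"])` returns four rows and their score. An exhaustive variant would score every possible subsample instead of a random selection of them, which is only feasible for small datasets.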