A Model-Averaging Method For High-Dimensional Regression With Missing Responses At Random

STATISTICA SINICA(2021)

引用 4|浏览9
暂无评分
摘要
This study considers the ultrahigh-dimensional prediction problem in the presence of responses missing at random. A two-step model-averaging procedure is proposed to improve the prediction accuracy of the conditional mean of the response variable. The first step specifies several candidate models, each with low-dimensional predictors. To implement this step, a new feature-screening method is developed to distinguish between the active and inactive predictors. The method uses the multiple-imputation sure independence screening (MI-SIS) procedure, and candidate models are formed by grouping covariates with similar size MI-SIS values. The second step develops a new criterion to find the optimal weights for averaging a set of candidate models using weighted delete-one cross-validation (WDCV). Under some regularity conditions, we show that the proposed screening statistic enjoys the ranking consistency property, and that the WDCV criterion asymptotically achieves the lowest possible prediction loss. Simulation studies and an example demonstrate the proposed methodology.
更多
查看译文
关键词
High-dimensional data, missing at random, model averaging, multiple imputation, screening, weighted delete-one cross-validation
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要