Regularized regression for two phase failure time studies

Computational Statistics & Data Analysis (2023)

Abstract
Two-phase study designs are ideal for focused sub-studies based on large prospective cohorts when the outcome of interest is an event that is rare in the full cohort, and additional covariates are expensive or difficult to measure. Researchers often wish to examine large numbers of covariates for association with outcomes of interest. In the context of cancer, hundreds to millions of genetic markers may be considered, along with environmental exposures. A computationally efficient variable selection method is proposed for two-phase failure time studies with stratified sampling under the Cox proportional hazards model. The penalized estimator is obtained from a penalized (weighted) Cox log partial likelihood using a pathwise cyclical coordinate descent algorithm, which is scalable for high-dimensional datasets where the number of features is much larger than the sample size (p >> n). A detailed simulation study to examine the performance of the proposed methodology is described. The variable selection and estimation procedure is then used to obtain a model for predicting acute myeloid leukaemia using somatic stem cell mutation profiles derived from blood samples, based on a two-phase sample from the European Prospective Investigation into Cancer and Nutrition (EPIC) study. © 2023 Elsevier B.V. All rights reserved.
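
To make the phrase "penalized (weighted) Cox log partial likelihood" concrete, the following is a minimal sketch of one plausible form of such an objective, together with the soft-thresholding step that cyclical coordinate descent typically applies to each coefficient under a lasso penalty. It is not taken from the paper: the function names, the Breslow-style risk sets, the use of inverse sampling probabilities as weights `w`, the lasso penalty, and the normalization by the total weight are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator: the closed-form lasso update for one coordinate."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def weighted_cox_objective(beta, X, time, event, w, lam):
    """Lasso-penalized, weighted negative log partial likelihood (Breslow form).

    Assumptions of this sketch (not confirmed by the source):
    - w holds sampling weights, e.g. inverse inclusion probabilities
      of the phase-two subjects within their sampling stratum;
    - the penalty is an L1 (lasso) penalty with strength lam;
    - the log partial likelihood is scaled by the total weight.
    """
    eta = X @ beta  # linear predictor for every subject
    loglik = 0.0
    for i in np.flatnonzero(event):
        at_risk = time >= time[i]  # subjects still at risk at the i-th event time
        loglik += w[i] * (eta[i] - np.log(np.sum(w[at_risk] * np.exp(eta[at_risk]))))
    return -loglik / np.sum(w) + lam * np.sum(np.abs(beta))
```

A pathwise solver would minimize this objective over a decreasing grid of `lam` values, warm-starting each fit from the previous solution and cycling through the coordinates of `beta` until convergence. The explicit loop over events is quadratic in the sample size and is written for readability rather than speed.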
Keywords
Two-phase studies, Case-cohort, Penalized regression, Cox models