Computational modeling for censored time to event data using data integration in biomedical research (2011)

Abstract
Medical prognostic models are designed by clinicians to predict the future course or outcome of a disease after diagnosis or treatment. The data used to develop these clinical models must contain a high number of events per variable (EPV) for the resulting model to be reliable. If the objective is to optimize predictive performance by some criterion, we can often obtain a reduced model that has a little bias but low variance, and whose overall performance is improved. To accomplish this goal, we propose a new two-stage variable selection approach that combines Stepwise Tuning in the Maximum Concordance Index (STMC) and Forward Nested Subset Selection (FNSS). In the first stage, the proposed variable selection identifies the best subset of risk factors, optimized with the concordance index, using inner cross-validation for optimism correction within the outer loop of cross-validation; this yields potentially different final models for each of the folds. In the second stage, we feed these intermediate results into another selection method to resolve the overfitting problem and to select a final model from the variation of predictors across the selected models. Two case studies on survival data sets of relatively different sizes, as well as a simulation study, demonstrate that, given a sufficient sample size and number of events, the proposed approach selects an improved and reduced average model compared with other selection methods such as stepwise selection using the likelihood ratio test, the Akaike Information Criterion (AIC), and the least absolute shrinkage and selection operator (lasso). Finally, we achieve final models that improve on the full models in each data set according to most criteria. The results of the model selection methods and the final models were analyzed systematically, with validation used for independent performance evaluation.
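The concordance index used as the selection criterion above can be illustrated with a minimal sketch. This is the standard Harrell-style definition written for illustration, not the dissertation's implementation; the function name `concordance_index`, its arguments, and the choice to skip pairs with tied observed times are assumptions.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell-style C: share of comparable pairs ranked correctly by risk.

    times: observed follow-up times
    events: 1 if the event was observed, 0 if censored
    risk_scores: higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Assumption: a pair is comparable only if the two times differ
        # and the shorter time is an observed event, not a censoring time.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk failed first: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly, and 0.5 corresponds to uninformative predictions. In a nested cross-validation scheme of the kind described, a function like this would be evaluated on held-out folds rather than on the data used for fitting.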
For the second part of this dissertation, we build prognostic models that use clinicopathologic features to predict prognosis after a certain treatment. Most recent research efforts have focused on high-dimensional genomic data with small samples. Since clinically similar but molecularly heterogeneous tumors may produce different clinical outcomes, combining clinical and genomic information is crucial to improving the quality of prognostic prediction. However, there is a lack of schemes for integrating the two into a clinico-genomic model, in particular a parsimonious one, because of the large number of variables and the small sample size. We propose a methodology to build a reduced yet accurate integrative model using a hybrid approach based on the Cox regression model, which uses several dimension reduction techniques, L2 penalized maximum likelihood estimation (PMLE), and resampling methods to tackle these problems. The predictive accuracy of the modeling approach is assessed by several metrics via an independent and thorough scheme for comparing competing methods. In breast cancer data studies for metastasis and mortality outcomes, in a DLBCL data study, and in simulation studies, we demonstrate that the proposed methodology can improve prediction accuracy and build a final model with a hybrid signature that is parsimonious when integrating both types of variables. The selected clinical factors and genomic biomarkers are found to be highly relevant to the biological processes and can be considered potential biomarkers for cancer prognosis and therapy. Furthermore, selected but unidentified genes are open to thorough investigation.
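To make the L2 penalized estimation concrete, the sketch below evaluates the objective that a ridge-penalized Cox fit would minimize. It is an illustration under stated assumptions, not the dissertation's code: the function name, the argument `lam`, the `0.5 * lam` scaling of the penalty, and the absence of any special handling for tied event times are all choices made here.

```python
import math


def penalized_neg_log_partial_likelihood(beta, X, times, events, lam):
    """Negative Cox partial log-likelihood plus a ridge (L2) penalty.

    beta: coefficients, one per covariate
    X: list of covariate rows, one row per subject
    times, events: observed times and event indicators (1 = event)
    lam: penalty weight (hypothetical name); lam = 0 gives the plain fit
    """
    # Linear predictor for each subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects enter only through risk sets
        # Risk set: subjects still under observation at time t_i.
        risk = [math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i]
        loglik += eta[i] - math.log(sum(risk))
    # Assumed convention: penalty = (lam / 2) * sum of squared coefficients.
    penalty = 0.5 * lam * sum(b * b for b in beta)
    return -loglik + penalty
```

Minimizing this function over `beta` with a general-purpose optimizer shrinks the coefficients toward zero as `lam` grows, which is what makes estimation feasible when the number of variables is large relative to the number of events.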
Key words
final model, clinical model, computational modeling, medical prognostic model, biomedical research, average model, selection method, full model, Cox regression model, accurate integrative model, clinico-genomic model, data integration, event data, different final model