Comparison of static and dynamic random forests models for EHR data in the presence of competing risks: predicting central line-associated bloodstream infection
arXiv (2024)
Abstract
Prognostic outcomes related to hospital admissions typically do not suffer
from censoring, and can be modeled either categorically or as time-to-event.
Competing events are common but often ignored. We compared the performance of
random forest (RF) models to predict the risk of central line-associated
bloodstream infections (CLABSI) using different outcome operationalizations. We
included data from 27478 admissions to the University Hospitals Leuven,
covering 30862 catheter episodes (970 CLABSI, 1466 deaths and 28426 discharges)
to build static and dynamic RF models for binary (CLABSI vs no CLABSI),
multinomial (CLABSI, discharge, death or no event), survival (time to CLABSI)
and competing risks (time to CLABSI, discharge or death) outcomes to predict
the 7-day CLABSI risk. We evaluated model performance across 100 train/test
splits. Performance of binary, multinomial and competing risks models was
similar: AUROC was 0.74 for baseline predictions, rose to 0.78 for predictions
at day 5 in the catheter episode, and decreased thereafter. Survival models
overestimated the risk of CLABSI (E:O ratios between 1.2 and 1.6), and had
AUROCs about 0.01 lower than other models. Binary and multinomial models had
lowest computation times. Models including multiple outcome events (multinomial
and competing risks) display a different internal structure compared to binary
and survival models. In the absence of censoring, complex modelling choices do
not considerably improve the predictive performance compared to a binary model
for CLABSI prediction in our study setting. Survival models that censor the
competing events at their time of occurrence should be avoided.
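
As an illustration only, the four outcome operationalizations and the E:O ratio mentioned above could be sketched as follows. This is a minimal sketch, not the authors' code: the `Episode` type, the function names, and the choice to count an event on exactly day 7 as inside the horizon are all assumptions made for the example.

```python
from dataclasses import dataclass

HORIZON_DAYS = 7  # 7-day prediction horizon, as stated in the abstract


@dataclass
class Episode:
    """Hypothetical record for one catheter episode (no censoring assumed)."""
    event: str        # first event to occur: "CLABSI", "discharge" or "death"
    event_day: float  # days from the prediction time to that event


def binary_label(ep: Episode) -> int:
    # 1 if CLABSI occurs within the horizon, otherwise 0
    return int(ep.event == "CLABSI" and ep.event_day <= HORIZON_DAYS)


def multinomial_label(ep: Episode) -> str:
    # One of four classes: CLABSI, discharge, death, or no event in the horizon
    return ep.event if ep.event_day <= HORIZON_DAYS else "no event"


def survival_label(ep: Episode) -> tuple[float, int]:
    # (time, status): competing events are treated as censored at their
    # time of occurrence, the approach the abstract advises against
    return ep.event_day, int(ep.event == "CLABSI")


def competing_risks_label(ep: Episode) -> tuple[float, str]:
    # (time, event type): all three event types are kept as distinct outcomes
    return ep.event_day, ep.event


def eo_ratio(predicted_risks: list[float], observed: list[int]) -> float:
    # Expected-to-observed ratio: sum of predicted risks over observed events.
    # Values above 1 indicate that risk is overestimated on average.
    return sum(predicted_risks) / sum(observed)
```

Under these assumptions, an episode ending in death on day 2 gets binary label 0 and multinomial label "death", while the survival encoding records it only as censored at day 2, which discards the information that CLABSI can no longer occur.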