A pancreatic cancer risk prediction model (Prism) developed and validated on large-scale US clinical data

Kai Jia, Steven Kundrot, Matvey B. Palchuk, Jeff Warnick, Kathryn Haapala, Irving D. Kaplan, Martin Rinard, Limor Appelbaum

EBioMedicine (2023)

Abstract
Background: Pancreatic Ductal Adenocarcinoma (PDAC) screening can enable early-stage disease detection and long-term survival. Current guidelines use inherited predisposition, with about 10% of PDAC cases eligible for screening. Using Electronic Health Record (EHR) data from a multi-institutional federated network, we developed and validated a PDAC RISk Model (Prism) for the general US population to extend early PDAC detection.

Methods: Neural Network (PrismNN) and Logistic Regression (PrismLR) models were developed using EHR data from 55 US Health Care Organisations (HCOs) to predict PDAC risk 6–18 months before diagnosis for patients 40 years or older. Model performance was assessed using Area Under the Curve (AUC) and calibration plots. Models were internal-externally validated by geographic location, race, and time. Simulated model deployment evaluated Standardised Incidence Ratio (SIR) and other metrics.

Findings: With 35,387 PDAC cases, 1,500,081 controls, and 87 features per patient, PrismNN obtained a test AUC of 0.826 (95% CI: 0.824–0.828); PrismLR obtained 0.800 (95% CI: 0.798–0.802). PrismNN's average internal-external validation AUCs were 0.740 for locations, 0.828 for races, and 0.789 (95% CI: 0.762–0.816) for time. At SIR = 5.10 (exceeding the current screening inclusion threshold) in simulated model deployment, PrismNN sensitivity was 35.9% (specificity 95.3%).

Interpretation: Prism models demonstrated good accuracy and generalizability across diverse populations. PrismNN could find 3.5 times more cases at comparable risk than current screening guidelines. The small number of features provided a basis for model interpretation. Integration with the federated network provided data from a large, heterogeneous patient population and a pathway to future clinical deployment.

Funding: Prevent Cancer Foundation, TriNetX, Boeing, DARPA, NSF, and Aarno Labs.
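The metrics named in the abstract (AUC, sensitivity, specificity, SIR) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy labels and scores, and the reading of SIR as observed cases divided by expected cases in a flagged group are all assumptions made for illustration only.

```python
# Illustrative sketch only: toy data and hypothetical helper names,
# not the Prism implementation or its data.

def sensitivity_specificity(labels, scores, threshold):
    """labels: 1 = case, 0 = control. A patient is flagged when score >= threshold."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def auc(labels, scores):
    """Probability that a random case outscores a random control (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def standardised_incidence_ratio(observed_cases, expected_cases):
    """Assumed definition: cases observed in the flagged group divided by the
    number expected if that group had the reference population's incidence."""
    return observed_cases / expected_cases

# Toy example (made-up numbers, unrelated to the study's results).
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.2, 0.8, 0.3, 0.1, 0.1, 0.05]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
toy_auc = auc(labels, scores)
toy_sir = standardised_incidence_ratio(observed_cases=10, expected_cases=2)
```

Raising the threshold flags fewer patients, which generally raises specificity and the SIR of the flagged group while lowering sensitivity; the abstract reports one such operating point.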
Keywords
Pancreatic cancer, Risk prediction, Machine learning, Electronic health records, Federated data