Curating retrospective multimodal and longitudinal data for community cohorts at risk for lung cancer.

Cancer biomarkers : section A of Disease markers(2024)

Cited 0|Views18
No score
Abstract
BACKGROUND:Large community cohorts are useful for lung cancer research, allowing for the analysis of risk factors and development of predictive models. OBJECTIVE:A robust methodology for (1) identifying lung cancer and pulmonary nodules diagnoses as well as (2) associating multimodal longitudinal data with these events from electronic health record (EHRs) is needed to optimally curate cohorts at scale. METHODS:In this study, we leveraged (1) SNOMED concepts to develop ICD-based decision rules for building a cohort that captured lung cancer and pulmonary nodules and (2) clinical knowledge to define time windows for collecting longitudinal imaging and clinical concepts. We curated three cohorts with clinical data and repeated imaging for subjects with pulmonary nodules from our Vanderbilt University Medical Center. RESULTS:Our approach achieved an estimated sensitivity 0.930 (95% CI: [0.879, 0.969]), specificity of 0.996 (95% CI: [0.989, 1.00]), positive predictive value of 0.979 (95% CI: [0.959, 1.000]), and negative predictive value of 0.987 (95% CI: [0.976, 0.994]) for distinguishing lung cancer from subjects with SPNs. CONCLUSION:This work represents a general strategy for high-throughput curation of multi-modal longitudinal cohorts at risk for lung cancer from routinely collected EHRs.
More
Translated text
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined