Improving Methods of Identifying Anaphylaxis for Medical Product Safety Surveillance Using Natural Language Processing and Machine Learning

David S. Carrell, Susan Gruber, James S. Floyd, Maralyssa A. Bann, Kara L. Cushing-Haugen, Ron L. Johnson, Vina Graham, David J. Cronkite, Brian L. Hazlehurst, Andrew H. Felcher, Cosmin A. Bejan, Adee Kennedy, Mayura U. Shinde, Sara Karami, Yong Ma, Danijela Stojanovic, Yueqin Zhao, Robert Ball, Jennifer C. Nelson

American Journal of Epidemiology (2023)

Abstract

We sought to determine whether machine learning and natural language processing (NLP) applied to electronic medical records could improve the performance of automated health-care claims-based algorithms for identifying anaphylaxis events. We used data on 516 patients with outpatient, emergency department, or inpatient anaphylaxis diagnosis codes during 2015-2019 in 2 integrated health-care institutions in the Northwest United States. One site's manually reviewed gold-standard outcomes data were used for model development and the other's for external validation, with performance assessed by cross-validated area under the receiver operating characteristic curve (AUC), positive predictive value (PPV), and sensitivity. In the development site, 154 (64%) of 239 potential events met adjudication criteria for anaphylaxis, compared with 180 (65%) of 277 in the validation site. Logistic regression models using only structured claims data achieved a cross-validated AUC of 0.58 (95% CI: 0.54, 0.63). Machine learning improved cross-validated AUC to 0.62 (0.58, 0.66); incorporating NLP-derived covariates further increased cross-validated AUCs to 0.70 (0.66, 0.75) in development and 0.67 (0.63, 0.71) in external validation data. A classification threshold with cross-validated PPV of 79% and cross-validated sensitivity of 66% in development data had cross-validated PPV of 78% and cross-validated sensitivity of 56% in external data. Machine learning and NLP-derived data improved identification of validated anaphylaxis events.
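To make the reported metrics concrete, the following is a minimal sketch of how PPV and sensitivity follow from a classification threshold applied to predicted probabilities. It is not the authors' code: the function name `ppv_and_sensitivity`, the toy labels, the toy scores, and the threshold of 0.6 are all hypothetical and chosen only for illustration. The last lines recompute the confirmed-event percentages from the counts given in the abstract.

```python
from typing import Sequence, Tuple


def ppv_and_sensitivity(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (PPV, sensitivity) when scores >= threshold are flagged as events.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if s >= threshold and y == 1)  # true positives
    fp = sum(1 for y, s in pairs if s >= threshold and y == 0)  # false positives
    fn = sum(1 for y, s in pairs if s < threshold and y == 1)   # missed events
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity


# Hypothetical toy data: 1 = adjudicated anaphylaxis, 0 = not confirmed.
y_true = [1, 1, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.65, 0.4, 0.3, 0.2, 0.1]
ppv, sens = ppv_and_sensitivity(y_true, scores, threshold=0.6)
print(f"PPV={ppv:.2f}, sensitivity={sens:.2f}")

# Confirmed-event percentages recomputed from the counts in the abstract.
dev_pct = round(100 * 154 / 239)  # development site
val_pct = round(100 * 180 / 277)  # validation site
print(dev_pct, val_pct)
```

Raising the threshold generally trades sensitivity for PPV, which is the trade-off behind the 79%/66% (development) and 78%/56% (external) figures quoted above.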

Keywords

anaphylaxis, electronic health records, health outcome identification, machine learning, supervised, postmarketing product surveillance, predictive modeling
