Testing calibration of phenotyping models using positive-only electronic health record data

Lingjiao Zhang, Yanyuan Ma, Daniel Herman, Jinbo Chen

BIOSTATISTICS (2022)

Abstract

Validation of phenotyping models using electronic health record (EHR) data conventionally requires gold-standard case and control labels. The labeling process requires clinical experts to retrospectively review patients' medical charts and is therefore labor intensive and time consuming. For some disease conditions, it is prohibitively difficult to identify gold-standard controls because routine clinical assessments are performed only for selected patients who are deemed to possibly have the condition. To build a model for phenotyping patients in EHRs, the most readily accessible data are often from a cohort consisting of a set of gold-standard cases and a large number of unlabeled patients. Here, we propose methods for assessing model calibration and discrimination using such "positive-only" EHR data. These methods do not require gold-standard controls, provided that the labeled cases are representative of all cases. For model calibration, we propose a novel statistic that aggregates differences between model-free and model-based estimates of the number of cases across risk subgroups and that asymptotically follows a chi-squared distribution. We additionally demonstrate that the calibration slope can be estimated using such "positive-only" data. We propose consistent estimators for discrimination measures and derive their large-sample properties. We demonstrate the performance of the proposed methods through extensive simulation studies and apply them to Penn Medicine EHRs to validate two preliminary models for predicting the risk of primary aldosteronism.
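
The idea of comparing model-free and model-based case counts across risk subgroups can be illustrated with a rough sketch. This is not the statistic proposed in the paper, whose exact form is not given in the abstract. It is a simple Pearson-type stand-in built on one assumption: if labeled cases are a random sample of all cases, the share of labeled cases falling in each risk group should match the share of total predicted risk in that group. The function name, the rank-based grouping, and the default of five groups are all hypothetical choices made for illustration.

```python
import random


def positive_only_calibration_stat(pred, labeled, n_groups=5):
    """Pearson-type comparison of labeled-case counts with model-implied counts.

    pred    : predicted risks for every patient in the cohort (labeled or not)
    labeled : booleans, True for patients confirmed as gold-standard cases
    Returns (statistic, degrees_of_freedom).

    Assumption (illustrative only): labeled cases are a random sample of all
    cases, and the cohort is large enough that the model-implied shares can be
    treated as fixed.
    """
    n = len(pred)
    # Rank patients by predicted risk and cut into equal-size risk groups.
    order = sorted(range(n), key=lambda i: pred[i])
    observed = [0.0] * n_groups    # model-free: labeled cases per group
    model_mass = [0.0] * n_groups  # model-based: summed predicted risk per group
    for rank, i in enumerate(order):
        g = rank * n_groups // n
        model_mass[g] += pred[i]
        if labeled[i]:
            observed[g] += 1.0
    n_labeled = sum(observed)
    total_mass = sum(model_mass)
    # Expected labeled cases per group if the model-implied shares are right.
    expected = [n_labeled * m / total_mass for m in model_mass]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, n_groups - 1


if __name__ == "__main__":
    # Simulated cohort: the model is calibrated and 30% of cases are labeled.
    rng = random.Random(0)
    pred = [rng.uniform(0.05, 0.6) for _ in range(20000)]
    is_case = [rng.random() < p for p in pred]
    labeled = [c and rng.random() < 0.3 for c in is_case]
    print(positive_only_calibration_stat(pred, labeled))
```

Under the stated assumptions, the returned statistic would be compared with a chi-squared reference distribution on the returned degrees of freedom. Note that this sketch is unchanged when all predicted risks are multiplied by a constant, so it can only detect miscalibration in how risk is distributed across groups, not in the overall level of risk.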

Key words

Calibration, Discrimination, Electronic health records, Positive-only data, Phenotyping
