Phenotyping of clinical trial eligibility text from cancer studies into computable criteria in electronic health records.

JOURNAL OF CLINICAL ONCOLOGY(2021)

Abstract
6592

Background: Clinical trial phenotyping is the process of extracting clinical features and patient characteristics from eligibility criteria. It is a crucial step that precedes automated cohort identification, in which patient electronic health records (EHRs) are matched against trial criteria. We establish a clinical trial phenotyping pipeline that transforms clinical trial eligibility criteria into computable criteria and enables high-throughput cohort selection in EHRs.

Methods: Formalized clinical trial criteria attributes were acquired using a natural language processing (NLP)-assisted approach. We implemented a clinical trial phenotyping pipeline with three components. First, a rule-based knowledge engineering component annotated the trial attributes at a computable and customizable granularity with respect to EHRs. Second, the annotated attributes were normalized using standard terminologies and pre-defined reference tables. Third, a knowledge base of computable criteria attributes was built to match patients to clinical trials. We evaluated pipeline performance by independent manual review. The inter-rater agreement of the annotation was measured on a random sample of the knowledge base. The accuracy of the pipeline was evaluated on a subset of randomly selected matched patients for a subset of randomly selected attributes.

Results: Our pipeline phenotyped 2,954 clinical trials from five cancer types: Non-Small Cell Lung Cancer, Small Cell Lung Cancer, Prostate Cancer, Breast Cancer, and Multiple Myeloma. We built a knowledge base of 256 computable attributes that included comorbidities, comorbidity-related treatment, previous lines of therapy, laboratory tests, and performance status measures such as ECOG and Karnofsky scores. Of the 256 attributes, 132 were encoded using standard terminologies and 124 were normalized to customized concepts. The inter-rater agreement of the annotation, measured by Cohen's Kappa coefficient, was 0.83. We applied the knowledge base to our EHRs and identified 33,258 potential subjects for cancer clinical trials. Our evaluation of the patient matching yielded an F1 score of 0.94.

Conclusions: We established a clinical trial phenotyping pipeline and built a knowledge base of computable criteria attributes that enabled efficient screening of EHRs for patients meeting clinical trial eligibility criteria, providing an automated way to identify clinical trial cohorts efficiently and accurately. The application of this knowledge base to patient matching from EHR data across different institutes demonstrates its generalization capability. Taken together, this knowledge base will be particularly valuable in computer-assisted clinical trial subject selection and in clinical trial protocol design in cancer studies based on real-world evidence.
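
For readers unfamiliar with the two evaluation metrics reported in the abstract, the following is a minimal Python sketch of how Cohen's Kappa and the F1 score are conventionally computed. It is not the authors' evaluation code: the function names `cohens_kappa` and `f1_score`, the label values, and all counts are invented for illustration and do not come from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each rater's label
    frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is
        # undefined here, and returning 1.0 is a convention of this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


def f1_score(true_positives, false_positives, false_negatives):
    """F1 as the harmonic mean of precision and recall, from raw counts."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        return 0.0
    return 2 * true_positives / denominator


if __name__ == "__main__":
    # Hypothetical annotations of four attributes by two reviewers.
    reviewer_1 = ["match", "match", "no_match", "no_match"]
    reviewer_2 = ["match", "no_match", "no_match", "no_match"]
    print(cohens_kappa(reviewer_1, reviewer_2))

    # Hypothetical patient-matching counts from a manual review.
    print(f1_score(true_positives=8, false_positives=2, false_negatives=2))
```

Kappa corrects raw agreement for the agreement two reviewers would reach by chance, which is why it is often preferred over simple percent agreement when annotation labels are imbalanced. F1 balances precision (how many matched patients were truly eligible) against recall (how many eligible patients were matched).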
Keywords
clinical trial eligibility text, electronic health records, clinical trial, cancer studies