AutoCriteria: a generalizable clinical trial eligibility criteria extraction system powered by large language models

JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION(2023)

引用 0|浏览9
暂无评分
摘要
Objectives We aim to build a generalizable information extraction system leveraging large language models to extract granular eligibility criteria information for diverse diseases from free text clinical trial protocol documents. We investigate the model's capability to extract criteria entities along with contextual attributes including values, temporality, and modifiers and present the strengths and limitations of this system.Materials and Methods The clinical trial data were acquired from https://ClinicalTrials.gov/. We developed a system, AutoCriteria, which comprises the following modules: preprocessing, knowledge ingestion, prompt modeling based on GPT, postprocessing, and interim evaluation. The final system evaluation was performed, both quantitatively and qualitatively, on 180 manually annotated trials encompassing 9 diseases.Results AutoCriteria achieves an overall F1 score of 89.42 across all 9 diseases in extracting the criteria entities, with the highest being 95.44 for nonalcoholic steatohepatitis and the lowest of 84.10 for breast cancer. Its overall accuracy is 78.95% in identifying all contextual information across all diseases. Our thematic analysis indicated accurate logic interpretation of criteria as one of the strengths and overlooking/neglecting the main criteria as one of the weaknesses of AutoCriteria.Discussion AutoCriteria demonstrates strong potential to extract granular eligibility criteria information from trial documents without requiring manual annotations. The prompts developed for AutoCriteria generalize well across different disease areas. Our evaluation suggests that the system handles complex scenarios including multiple arm conditions and logics.Conclusion AutoCriteria currently encompasses a diverse range of diseases and holds potential to extend to more in the future. This signifies a generalizable and scalable solution, poised to address the complexities of clinical trial application in real-world settings.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要