CEHR-GPT: Generating Electronic Health Records with Chronological Patient Timelines
arXiv (2024)
Abstract
Synthetic Electronic Health Records (EHR) have emerged as a pivotal tool in
advancing healthcare applications and machine learning models, particularly for
researchers without direct access to healthcare data. Although existing
methods, like rule-based approaches and generative adversarial networks (GANs),
generate synthetic data that resembles real-world EHR data, these methods often
use a tabular format, disregarding temporal dependencies in patient histories
and limiting data replication. Interest has recently grown in applying
Generative Pre-trained Transformers (GPT) to EHR data, which enables
applications such as disease progression analysis, population estimation,
counterfactual reasoning, and synthetic data generation. In this work, we focus
on synthetic data generation and demonstrate the capability of training a GPT
model using a particular patient representation derived from CEHR-BERT,
enabling us to generate patient sequences that can be seamlessly converted to
the Observational Medical Outcomes Partnership (OMOP) data format.
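
To make the idea of a chronological patient timeline concrete, the following is a minimal sketch of how a flat token sequence carrying time-interval tokens could be decoded back into dated rows, which is the kind of step needed before loading records into a dated tabular format such as OMOP. The token conventions used here (`VS`, `VE`, `D<n>`), the concept names, and the function `decode_timeline` are assumptions made for illustration only; the abstract does not specify the actual CEHR-GPT representation, and this is not the authors' implementation.

```python
from datetime import date, timedelta

# Hypothetical token conventions, chosen only for this illustration:
#   "VS" / "VE"  mark the start and end of a visit
#   "D<n>"       means n days elapsed since the previous visit
#   any other token inside a visit is treated as a clinical concept identifier
def decode_timeline(tokens, start_date):
    """Turn a flat token sequence into dated (visit_number, date, concept) rows."""
    rows = []
    current = start_date
    visit_number = 0
    in_visit = False
    for tok in tokens:
        if tok == "VS":
            visit_number += 1
            in_visit = True
        elif tok == "VE":
            in_visit = False
        elif tok.startswith("D") and tok[1:].isdigit():
            # Advance the running date by the encoded gap between visits.
            current += timedelta(days=int(tok[1:]))
        elif in_visit:
            rows.append((visit_number, current, tok))
    return rows

# Placeholder sequence: two visits separated by a 30-day gap.
tokens = ["VS", "concept_A", "concept_B", "VE", "D30", "VS", "concept_C", "VE"]
rows = decode_timeline(tokens, date(2020, 1, 1))
for row in rows:
    print(row)
```

Because the gaps between visits are carried in the sequence itself, the temporal ordering that a purely tabular representation discards can be recovered when the sequence is decoded.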