Deep learning facilitates rapid cohort identification using human and veterinary clinical narratives

bioRxiv (2019)

Abstract
Background: In public health research, there is a need to close the gap between care delivery and cohort identification. Dedicated tagging staff currently spend considerable effort assigning clinical codes after reading patient summaries. Machine learning automation can facilitate the classification of these clinical narratives, but the limited availability of electronic medical records remains a bottleneck. Veterinary medical records represent a largely untapped data source that could be used to benefit both human and non-human patients. Very few approaches utilizing veterinary data sources currently exist.

Methods: In this retrospective cross-sectional and chart review study, we trained separate long short-term memory (LSTM) recurrent neural networks (RNNs) on 52,722 human records and 89,591 veterinary records, tested the models' efficacy in a standard train-test split setup, and probed the portability of these models across species domains. We trained versions of our models first using the free-text clinical narratives, and then using only clinically relevant terms extracted by MetaMap Lite, a natural language processing tool intended for this purpose.

Findings: We show that our LSTM approach classifies veterinary records across top-level codes (F1 score = 0.83) and identifies top-level neoplasia records in the veterinary records (F1 score = 0.93). The model trained with veterinary data can be ported over to identify neoplasia records in the human records (F1 score = 0.70).

Interpretation: Our findings suggest that free-text clinical narratives can be used to learn classification models that allow the rapid identification of patient cohorts. Ultimately, this effort can lead to new insights that can address emerging public health concerns. Digitization of health information will continue to be a reality in both human and veterinary data; our approach serves as a first proof of concept regarding how these two domains can learn from, and inform, one another.
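For readers unfamiliar with the metric reported in the Findings, the following is a minimal sketch of how an F1 score is computed for a single positive class such as neoplasia. The function name `f1_score_binary` and the toy labels are illustrative assumptions, not the authors' code or data, and the abstract does not state which averaging scheme was used for the multi-class result.

```python
from typing import Sequence


def f1_score_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the F1 score for the positive class (label 1).

    F1 is the harmonic mean of precision and recall:
    F1 = 2 * precision * recall / (precision + recall).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # With no true positives, precision and recall are zero or undefined;
    # report 0.0 by convention.
    if tp == 0:
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, made up for illustration: 1 = neoplasia record, 0 = any other code.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]
toy_f1 = f1_score_binary(y_true, y_pred)
print(f"F1 on toy data: {toy_f1:.2f}")
```

Because F1 balances precision against recall, it is less sensitive than raw accuracy to how rare the positive class is, which matters when one diagnostic category makes up a small share of the records.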
Keywords
Clinical Coding, Electronic Health Records, Machine Learning, Neural Networks (Computer), One Health, Public Health Informatics