Numerical Age Variations within Clinical Notes: The Potential Impact on De-Identification and Information Extraction

2018 IEEE International Conference on Healthcare Informatics Workshop (ICHI-W), 2018

Abstract
Many kinds of numbers and numerical concepts, including patient ages, appear frequently in free-text clinical notes from electronic health records. Variability in how ages are described may affect both the success of information extraction strategies and the accuracy of de-identification systems. This brief paper analyzes how numbers and numerical concepts are represented in clinical notes, focusing on ages. We used an inverted index of approximately 100 million notes to obtain the frequency of various permutations of ages, including biologically implausible ages and age descriptions that many de-identification systems might not detect. Missing such rare, but nevertheless present, variations could result in missed information or even privacy violations.
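As a rough illustration of the kind of lookup the abstract describes, the sketch below builds a tiny positional inverted index over a handful of invented notes and counts how many notes contain each age phrasing. It is not the authors' system: the sample notes, the tokenizer, and the helper names (`tokenize`, `build_index`, `phrase_doc_freq`) are all assumptions made for this example.

```python
import re
from collections import defaultdict

def tokenize(text):
    # Lowercase and keep alphanumeric runs; hyphens and slashes act as separators,
    # so "45-year-old" and "45 year old" yield the same tokens.
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(notes):
    # Positional inverted index: token -> {note_id: [positions]}.
    index = defaultdict(lambda: defaultdict(list))
    for note_id, text in enumerate(notes):
        for pos, tok in enumerate(tokenize(text)):
            index[tok][note_id].append(pos)
    return index

def phrase_doc_freq(index, phrase):
    # Number of notes in which the phrase tokens occur at consecutive positions.
    toks = tokenize(phrase)
    if not toks or any(t not in index for t in toks):
        return 0
    candidates = set(index[toks[0]])
    for t in toks[1:]:
        candidates &= set(index[t])
    count = 0
    for note_id in candidates:
        starts = index[toks[0]][note_id]
        if any(all((s + i) in index[t][note_id] for i, t in enumerate(toks))
               for s in starts):
            count += 1
    return count

# Invented sample notes (hypothetical data, for illustration only).
notes = [
    "Patient is a 45 year old male with chest pain.",
    "45-year-old female presents with cough.",
    "Pt is forty-five years old.",
    "A 45 y/o man, seen in clinic.",
    "Patient is 145 years old per intake form.",
]
index = build_index(notes)

variants = ["45 year old", "45 y/o", "forty-five years old", "145 years old"]
freqs = {v: phrase_doc_freq(index, v) for v in variants}
for variant, n in freqs.items():
    print(f"{variant!r}: {n} note(s)")
```

Even in this toy setting, a pattern that only expects the form "45 year old" would miss the spelled-out and abbreviated variants, and would not flag the implausible age of 145.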
Keywords
lexical variation, natural language processing, information retrieval, de-identification