Identifying adverse drug event information in clinical notes with distributional semantic representations of context

Journal of Biomedical Informatics (2015)

Abstract
A corpus of Swedish clinical notes was annotated for adverse drug event information. Detecting adverse drug events in clinical notes can support pharmacovigilance. Modeling context with distributional semantics yielded better predictive models. Distributed word representations allowed more context information to be incorporated. Inter-sentential relations between drugs and disorders/findings are hard to detect.

For the purpose of post-marketing drug safety surveillance, which has traditionally relied on the voluntary reporting of individual cases of adverse drug events (ADEs), other sources of information are now being explored, including electronic health records (EHRs), which give us access to enormous amounts of longitudinal observations of the treatment of patients and their drug use. Adverse drug events, which can be encoded in EHRs with certain diagnosis codes, are, however, heavily underreported. It is therefore important to develop capabilities to process, by means of computational methods, the more unstructured EHR data in the form of clinical notes, where clinicians may describe and reason around suspected ADEs. In this study, we report on the creation of an annotated corpus of Swedish health records for the purpose of learning to identify information pertaining to ADEs present in clinical notes. To this end, three key tasks are tackled: recognizing relevant named entities (disorders, symptoms, drugs), labeling attributes of the recognized entities (negation, speculation, temporality), and identifying relationships between them (indication, adverse drug event). For each of the three tasks, leveraging models of distributional semantics - i.e., unsupervised methods that exploit co-occurrence information to model, typically in vector space, the meaning of words - and, in particular, combinations of such models, is shown to improve the predictive performance. The ability to make use of such unsupervised methods is critical when faced with large amounts of sparse and high-dimensional data, especially in domains where annotated resources are scarce.
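
To make the notion of distributional semantics concrete, the following is a minimal sketch of a count-based co-occurrence model. It is not the models or data used in the study: the toy English corpus, the window size, and the function names `cooccurrence_vectors` and `cosine` are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a sparse vector of counts of its neighbouring words.

    `window` is the number of tokens considered on each side of the target
    (an illustrative choice, not a value taken from the paper).
    """
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus; real clinical notes would be far larger and noisier.
corpus = [
    "patient developed rash after penicillin".split(),
    "patient developed nausea after codeine".split(),
    "rash resolved when penicillin stopped".split(),
    "nausea resolved when codeine stopped".split(),
]

vectors = cooccurrence_vectors(corpus, window=2)
sim_symptoms = cosine(vectors["rash"], vectors["nausea"])
sim_mixed = cosine(vectors["rash"], vectors["stopped"])
print(sim_symptoms, sim_mixed)
```

In this toy setting, words that occur in similar contexts ("rash" and "nausea") end up with more similar vectors than words that do not, which is the property such representations are meant to capture without any annotated data.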
Keywords
Adverse drug events, Corpus annotation, Distributional semantics, Electronic health records, Machine learning, Relation extraction