EHR-HGCN: An Enhanced Hybrid Approach for Text Classification Using Heterogeneous Graph Convolutional Networks in Electronic Health Records

Guishen Wang, Xiaoxue Lou, Fang Guo, Devin Kwok, Chen Cao

IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS (2024)

Abstract
Text classification is a central part of natural language processing, with important applications in understanding the knowledge behind biomedical texts including electronic health records (EHR). In this article, we propose a novel heterogeneous graph convolutional network method for classifying EHR texts. Our method, called EHR-HGCN, is able to combine context-sensitive word and sentence embeddings with structural sentence-level and word-level relation information to perform text classification. EHR-HGCN reframes EHR text classification as a graph classification task to better capture structural information about the document using a heterogeneous graph. To mine contextual information from a document, EHR-HGCN first applies a bidirectional recurrent neural network (BiRNN) on word embeddings obtained via Global Vectors for word representation (GloVe) to obtain context-sensitive word-level and sentence-level embeddings. To mine structural relationships from the document, EHR-HGCN then constructs a heterogeneous graph over the word and sentence embeddings, where sentence-word and word-word relationships are represented by graph edges. Finally, a heterogeneous graph convolutional neural network is used to classify documents by their graph representation. We evaluate EHR-HGCN on a variety of standard text classification benchmarks and find that EHR-HGCN has higher accuracy and F1-score than other representative machine learning and deep learning methods. We also apply EHR-HGCN to the MedLit benchmark and find it performs with high accuracy and F1-score on the task of section classification in EHR texts. Our ablation experiments show that the heterogeneous graph construction and heterogeneous graph convolutional network are critical to the performance of EHR-HGCN.
Keywords
Text classification, heterogeneous graph convolutional network, graph classification, electronic health records
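
The graph construction step described in the abstract (sentence-word and word-word relationships represented as edges) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how word-word edges are defined, so the sketch assumes sliding-window co-occurrence within a sentence. The function names `build_hetero_graph` and `aggregate_words_to_sentences`, the `window` parameter, and the toy feature vectors are all hypothetical. The aggregation step is a plain mean over neighboring word features, standing in for a heterogeneous graph convolution layer, and the BiRNN and GloVe embedding stages are omitted.

```python
def build_hetero_graph(sentences, window=2):
    """Build a toy heterogeneous graph for one document.

    sentences: list of token lists, one list per sentence.
    Returns (word_nodes, sent_word_edges, word_word_edges), where edges
    are sorted lists of index pairs.
    """
    # Word nodes: one node per distinct token in the document.
    word_nodes = sorted({tok for sent in sentences for tok in sent})
    word_id = {w: i for i, w in enumerate(word_nodes)}

    # Sentence-word edges: sentence s is linked to every word it contains.
    sent_word_edges = set()
    for s, sent in enumerate(sentences):
        for tok in sent:
            sent_word_edges.add((s, word_id[tok]))

    # Word-word edges. ASSUMPTION (not from the paper): two words are
    # linked if they occur within `window` positions of each other
    # inside the same sentence.
    word_word_edges = set()
    for sent in sentences:
        for i, tok in enumerate(sent):
            for j in range(i + 1, min(i + window + 1, len(sent))):
                a, b = word_id[tok], word_id[sent[j]]
                if a != b:
                    word_word_edges.add((min(a, b), max(a, b)))

    return word_nodes, sorted(sent_word_edges), sorted(word_word_edges)


def aggregate_words_to_sentences(word_feats, sent_word_edges, num_sentences):
    """Mean-aggregate word features into each sentence node.

    A simplified stand-in for one message-passing step along
    sentence-word edges; a real graph convolution layer would also
    apply learned weights and a nonlinearity.
    """
    dim = len(word_feats[0])
    sums = [[0.0] * dim for _ in range(num_sentences)]
    counts = [0] * num_sentences
    for s, w in sent_word_edges:
        counts[s] += 1
        for d in range(dim):
            sums[s][d] += word_feats[w][d]
    return [
        [v / counts[s] if counts[s] else 0.0 for v in sums[s]]
        for s in range(num_sentences)
    ]


# Toy document with two sentences (hypothetical example text).
doc = [
    ["patient", "denies", "chest", "pain"],
    ["chest", "pain", "resolved"],
]
words, sw_edges, ww_edges = build_hetero_graph(doc, window=1)

# Toy 2-dimensional word features, indexed in the same order as `words`.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [1.0, 2.0]]
sent_feats = aggregate_words_to_sentences(feats, sw_edges, len(doc))
```

In this sketch the two node types (sentences and words) keep separate index spaces, which is one common way to store a heterogeneous graph. A classifier would then pool the node features into a single document-level representation, in line with the graph classification framing in the abstract.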