Multi-Combined Features Text Mining of TCM Medical Cases with CRF

2016 8TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY IN MEDICINE AND EDUCATION (ITME)(2016)

Abstract
Traditional Chinese medicine (TCM) medical case records are free text containing valuable data and clinical terms, and recognizing and extracting these terms automatically is a worthwhile task. TCM medical records obtained from Guangdong Provincial Hospital of Chinese Medicine are segmented into single words, labeled with five labeling features (words in sentence, grammatical property of words, words in clinical dictionary, set phrases acting on neighboring context, and set phrases acting at a far distance), and divided into training and testing sets. The training sets are additionally annotated with output labels (symptoms or signs, TCM diagnosis, TCM syndrome type, Chinese medicines (drugs), and names of TCM prescriptions). To evaluate how well the labeling features improve clinical term recognition with conditional random fields (CRF), three indicators are defined (recognition precision (P), recognition recall (R), and F-score (F)), and three comparisons are made: individual labeling features, combined labeling features, and combined features across different diseases. The results show that "grammatical property of words" is the best of the individual labeling features. Multi-combined features score higher than individual labeling features on clinical term recognition. The combination of "grammatical property of words", "words in sentence", and "words in clinical dictionary" may be the most suitable set of labeling features. Multi-combined labeling features can improve term recognition with a CRF model for text mining of TCM medical cases.
Keywords
Multi-features, Conditional random fields, TCM medical cases, Text mining
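
The abstract names the labeling features and the three indicators but does not give their exact encoding or formulas. The following is a minimal Python sketch, not the authors' implementation. It assumes that the three features singled out in the abstract (word, grammatical property, dictionary membership) are encoded as a per-token dictionary with a one-word neighbor window, that terms are scored by exact match on (start, end, label) spans, and that F is the balanced harmonic mean of P and R. The names `token_features`, `precision_recall_f`, and `Span` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A recognized term: (start index, end index exclusive, label). Hypothetical type.
Span = Tuple[int, int, str]


def token_features(words: List[str], pos_tags: List[str],
                   dictionary: Set[str], i: int) -> Dict[str, object]:
    """Build combined features for token i (assumed encoding, for illustration)."""
    feats: Dict[str, object] = {
        "word": words[i],                   # "words in sentence"
        "pos": pos_tags[i],                 # "grammatical property of words"
        "in_dict": words[i] in dictionary,  # "words in clinical dictionary"
    }
    # Assumed one-word window standing in for neighboring context.
    if i > 0:
        feats["prev_word"] = words[i - 1]
    if i < len(words) - 1:
        feats["next_word"] = words[i + 1]
    return feats


def precision_recall_f(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and balanced F-score over term spans."""
    true_positives = len(gold & pred)
    p = true_positives / len(pred) if pred else 0.0
    r = true_positives / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
    return p, r, f


if __name__ == "__main__":
    # Toy English stand-ins for segmented words; not data from the paper.
    words = ["patient", "cough", "three", "days"]
    pos_tags = ["n", "v", "m", "q"]
    print(token_features(words, pos_tags, {"cough"}, 1))

    gold = {(0, 2, "symptom"), (3, 4, "drug"), (5, 7, "syndrome")}
    pred = {(0, 2, "symptom"), (3, 4, "drug"), (8, 9, "drug")}
    print(precision_recall_f(gold, pred))
```

Feature dictionaries of this shape are a common input format for CRF toolkits, and comparing individual against combined features then amounts to dropping keys from the dictionary before training and recomputing P, R, and F on the test set.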