Learning to identify core term of knowledge unit from short text

FSKD (2012)

Abstract
We present a new task: identifying the core term (CT) of a knowledge unit (KU) from text, in support of knowledge management and service. We investigate two kinds of approaches for this task: binary classification using naive Bayes, decision trees, logistic regression, and SVM; and competition learning based on pairwise classification. Both are combined with a rich feature set spanning position, token, statistical, and linguistic features. Experimental results show that simple binary classification addresses the task effectively, reaching 82.7% KU accuracy. However, because recognizing the core term depends on the KU as a whole and on all of its inner terms, competition learning based on pairwise classification achieves a better result of 89.6%. We also show empirically that every type of feature presented is useful for the task, and that the combination of position and linguistic features is essential for information extraction from short text.
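The competition idea in the abstract can be sketched as a round-robin tournament over the candidate terms of one KU. This is only an illustrative sketch, not the authors' implementation: the function `pick_core_term`, the comparator `toy_prefer_first`, and the scores in `TOY_SCORES` are all hypothetical stand-ins. In a real system the comparator would be a trained pairwise classifier over position, token, statistical, and linguistic features.

```python
from itertools import combinations
from typing import Callable, Sequence


def pick_core_term(
    terms: Sequence[str],
    prefer_first: Callable[[str, str], bool],
) -> str:
    """Return the term that wins the most pairwise comparisons.

    `prefer_first(a, b)` is assumed to return True when `a` is the
    better core-term candidate of the pair, and False otherwise.
    """
    if not terms:
        raise ValueError("a knowledge unit needs at least one candidate term")

    wins = [0] * len(terms)
    # Compare every unordered pair of candidates exactly once.
    for i, j in combinations(range(len(terms)), 2):
        if prefer_first(terms[i], terms[j]):
            wins[i] += 1
        else:
            wins[j] += 1

    # max() keeps the first maximal index, so ties go to the earliest term.
    best = max(range(len(terms)), key=lambda i: wins[i])
    return terms[best]


# Hypothetical scores standing in for a trained pairwise classifier.
TOY_SCORES = {"definition": 0.4, "of": 0.1, "heap": 0.9}


def toy_prefer_first(a: str, b: str) -> bool:
    """Toy comparator: prefer the term with the higher made-up score."""
    return TOY_SCORES[a] >= TOY_SCORES[b]


core = pick_core_term(["definition", "of", "heap"], toy_prefer_first)
```

With these toy scores, `core` is `"heap"`. The point of the design is that each decision looks at two terms from the same KU, so the choice depends on the other candidates rather than on each term in isolation, which is the motivation the abstract gives for the pairwise approach.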
Keywords
knowledge management and service, binary classification, logistic regression, classification, text mining, pairwise classification, bayes methods, core term recognition, core term identification, regression analysis, pattern classification, token feature, knowledge management, svm, knowledge unit, ku, statistic features, information extraction, naïve bayesian, position features, linguistic features, knowledge discovery, short text, text analysis, decision tree, decision trees, unsupervised learning, support vector machines, competition learning, competitive learning, pragmatics, feature extraction, vectors, data mining, measurement