Ontologies Improve Text Document Clustering

ICDM(2003)

Abstract
Text document clustering plays an important role in providing intuitive navigation and browsing mechanisms by organizing large sets of documents into a small number of meaningful clusters. The bag of words representation used for these clustering methods is often unsatisfactory as it ignores relationships between important terms that do not co-occur literally. In order to deal with the problem, we integrate core ontologies as background knowledge into the process of clustering text documents. Our experimental evaluations compare clustering techniques based on pre-categorizations of texts from Reuters newsfeeds and on a smaller domain of an eLearning course about Java. In the experiments, improvements of results by background knowledge compared to a baseline without background knowledge can be shown in many interesting combinations.
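The idea of enriching a bag of words with ontology concepts can be illustrated with a minimal sketch. It is not the authors' implementation: the toy ontology, the `concept:` prefix, and all function names are hypothetical and chosen only for illustration. The sketch adds one concept count per matched term and compares two documents with cosine similarity, a common choice for text clustering.

```python
import math
from collections import Counter

# Hypothetical toy ontology mapping terms to a broader concept.
# A real core ontology would be far larger; this is an assumption.
TOY_ONTOLOGY = {
    "beef": "meat",
    "pork": "meat",
    "java": "programming_language",
    "python": "programming_language",
}


def term_vector(text):
    """Build a plain bag-of-words count vector from whitespace tokens."""
    return Counter(text.lower().split())


def add_concepts(term_counts, ontology):
    """Return a copy of the term vector extended with concept counts."""
    enriched = Counter(term_counts)
    for term, count in term_counts.items():
        concept = ontology.get(term)
        if concept is not None:
            # Prefix concepts so they cannot collide with literal terms.
            enriched["concept:" + concept] += count
    return enriched


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


doc_a = term_vector("beef prices rise")
doc_b = term_vector("pork exports fall")

# Without background knowledge the documents share no terms.
plain_similarity = cosine(doc_a, doc_b)

# With concepts added, both documents contain "concept:meat".
enriched_similarity = cosine(
    add_concepts(doc_a, TOY_ONTOLOGY),
    add_concepts(doc_b, TOY_ONTOLOGY),
)

print(plain_similarity, enriched_similarity)
```

In this toy case the two documents have no literal term in common, so their plain similarity is zero, while the shared concept gives the enriched vectors a non-zero similarity that a clustering algorithm could exploit.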
Keywords
ontologies improve text document, important role, reuters newsfeeds, browsing mechanism, text document clustering, a smaller domain, important term, clustering method, clustering text document, pre-categorizations of text, data mining, distance learning, bag of words, document clustering