Semantic-Based Text Document Clustering Using Cognitive Semantic Learning and Graph Theory

2018 IEEE 12th International Conference on Semantic Computing (ICSC) (2018)

Abstract
Semantic-based text document clustering aims to group documents into a set of topic clusters. We propose a new approach to the semantic clustering of text documents based on cognitive science and graph theory. We apply a computational cognitive model of semantic association for human semantic memory, known as Incremental Construction of an Associative Network (ICAN). The vector-based model of Latent Semantic Analysis (LSA) has been a leading computational cognitive model for semantic learning and topic modeling, but it has well-known limitations: it does not consider the original word order, and its semantic reduction has neither a linguistic nor a cognitive basis. The ICAN model overcomes these issues. We use the ICAN model to generate document-level semantic graphs, and we apply cognitive and graph-based topic and context identification methods for semantic reduction of the ICAN graphs. A corpus graph is then generated from the ICAN graphs, and a community-detection graph algorithm is applied to it as the final step of document clustering. Experiments are conducted on three commonly used datasets. Measured by the purity and entropy criteria for clustering quality, our approach notably outperforms the LSA-based approach.
Keywords
document clustering,semantic learning,semantic representation,cognitive semantics
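
The abstract evaluates clustering quality with purity and entropy. As a minimal sketch, assuming the commonly used definitions of these two criteria (the abstract does not state the exact formulas, for example whether entropy is normalized), they could be computed as below. The function names and the toy labels are illustrative and do not come from the paper.

```python
import math
from collections import Counter, defaultdict


def _group_by_cluster(labels_true, labels_pred):
    """Collect the ground-truth labels of the documents in each cluster."""
    clusters = defaultdict(list)
    for truth, cluster in zip(labels_true, labels_pred):
        clusters[cluster].append(truth)
    return clusters


def purity(labels_true, labels_pred):
    """Fraction of documents that belong to the majority class of their cluster.

    Assumed definition: (1/N) * sum over clusters of the largest class count.
    Higher is better; 1.0 means every cluster holds a single class.
    """
    clusters = _group_by_cluster(labels_true, labels_pred)
    majority_total = sum(
        max(Counter(members).values()) for members in clusters.values()
    )
    return majority_total / len(labels_true)


def clustering_entropy(labels_true, labels_pred):
    """Size-weighted average of the class entropy inside each cluster.

    Assumed definition: sum over clusters of (n_j / N) * H_j, where H_j is the
    Shannon entropy (base 2, unnormalized) of the class distribution in
    cluster j. Lower is better; 0.0 means every cluster holds a single class.
    """
    clusters = _group_by_cluster(labels_true, labels_pred)
    n = len(labels_true)
    total = 0.0
    for members in clusters.values():
        size = len(members)
        h = -sum(
            (count / size) * math.log2(count / size)
            for count in Counter(members).values()
        )
        total += (size / n) * h
    return total


if __name__ == "__main__":
    # Toy example: six documents, two true topics, two predicted clusters.
    truth = ["a", "a", "a", "b", "b", "b"]
    pred = [0, 0, 1, 1, 1, 1]
    print(purity(truth, pred))
    print(clustering_entropy(truth, pred))
```

Under these definitions a better clustering has higher purity and lower entropy, which is the direction in which the abstract's comparison against the LSA-based approach should be read.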