TopicView: Visual Analysis of Topic Models and Their Impact on Document Clustering

International Journal on Artificial Intelligence Tools (2013)

Abstract
We present a new approach for analyzing topic models using visual analytics. We have developed TopicView, an application for visually comparing and exploring multiple models of text corpora, as a prototype for this type of analysis tool. TopicView uses multiple linked views to visually analyze conceptual and topical content, document relationships identified by models, and the impact of models on the results of document clustering. As case studies, we examine models created using two standard approaches: Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). Conceptual content is compared through the combination of (i) a bipartite graph matching LSA concepts with LDA topics based on the cosine similarities of model factors and (ii) a table containing the terms for each LSA concept and LDA topic, listed in decreasing order of importance. Document relationships are examined through the combination of (i) side-by-side document similarity graphs, (ii) a table listing the weights for each document's contribution to each concept/topic, and (iii) a full-text reader for documents selected in either of the graphs or the table. The impact of LSA and LDA models on document clustering applications is explored through similar means, using proximities between documents and cluster exemplars for graph layout edge weighting and table entries. We demonstrate the utility of TopicView's visual approach to model assessment by comparing LSA and LDA models of several example corpora.
Keywords
Text analysis, visual model analysis, latent semantic analysis, latent Dirichlet allocation, clustering
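
The abstract describes matching LSA concepts with LDA topics by the cosine similarities of model factors. The following is a minimal sketch of that idea, not the authors' implementation. It assumes each model exposes a factor matrix with one row per concept or topic and one column per vocabulary term. The function names, the toy matrices, and the `threshold` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Return pairwise cosine similarities between the rows of a and the rows of b."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


def bipartite_edges(lsa_factors, lda_factors, threshold=0.5):
    """List (concept, topic, similarity) edges whose similarity meets the threshold.

    The threshold is an illustrative assumption; the abstract does not say
    how edges are filtered or weighted.
    """
    sims = cosine_similarity_matrix(lsa_factors, lda_factors)
    return [
        (i, j, float(sims[i, j]))
        for i in range(sims.shape[0])
        for j in range(sims.shape[1])
        if sims[i, j] >= threshold
    ]


# Toy factor matrices over a three-term vocabulary (made-up values).
lsa_factors = np.array([[1.0, 0.0, 0.0],
                        [0.0, 1.0, 1.0]])
lda_factors = np.array([[2.0, 0.0, 0.0],
                        [0.0, 3.0, 3.0],
                        [1.0, 1.0, 0.0]])

for concept, topic, sim in bipartite_edges(lsa_factors, lda_factors, threshold=0.6):
    print(f"LSA concept {concept} -- LDA topic {topic}: {sim:.3f}")
```

LSA factors can contain negative entries, so a real comparison would need to decide whether to use signed or absolute similarities. This sketch keeps the signed values.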