Explain-and-Test: An Interactive Machine Learning Framework for Exploring Text Embeddings

Shivam Raval, Carolyn Wang, Fernanda Viegas, Martin Wattenberg

2023 IEEE Visualization and Visual Analytics (VIS), 2023

Abstract
Text embeddings (mappings of collections of text to points in high-dimensional space) are a common object of analysis. A classic method to visualize these embeddings is to create a nonlinear projection to two dimensions and look for clusters and other structures in the resulting map. Explaining why certain texts cluster together, however, can be difficult. In this paper, we introduce a human-in-the-loop framework for applying machine learning (ML) to this challenge. The framework has two stages: (1) explain, in which we use ML to produce a description of a pattern; and (2) test, in which the user verifies the explanation by entering new text that fits the pattern and seeing where it appears on the map. If the new text is mapped to the original cluster, that is evidence in favor of the ML-generated explanation. We illustrate this process with a visualization application that provides two kinds of explanations: Natural Language Explanations and Contrastive PhraseClouds. Scenarios on exploring academic papers and literary works showcase the benefit of our workflow in discovering related topics and analyzing thematic differences in text.
Keywords
Text Visualization, Dimensionality Reduction, Clustering, Large Language Models, Explanation