What are People Talking about in High-Dimension Data Visualization? LLM-supported Analysis of Domain Literature

crossref(2024)

Abstract
Visualizing high-dimensional (HD) data is a common yet challenging task in various application domains. Previous surveys of HD visualization were conducted either within the visualization community or through interviews with a limited number of domain users. A comprehensive understanding of how HD visualizations are used in the wild is missing. To fill this gap, we analyzed more than 1,000 papers from one representative domain (single-cell transcriptomics) that extensively employs HD data visualizations. To analyze this large corpus, which is dense with domain-specific terminology, we propose a pipeline for collaborating with an LLM annotator to interpret and summarize the usage of HD visualizations in the collected papers. This pipeline includes machine learning techniques for figure detection, traditional NLP methods for text cleaning, and LLM prompt engineering for nuanced interpretation. With this pipeline, we categorized HD visualizations based on how users referred to and mentioned them in their papers. We then discussed representative visualizations for each category, as well as current practices and potential misuses. These analyses can assist the visualization community in designing and evaluating future HD visualizations.