A Geometry-Driven Longitudinal Topic Model

Harvard Data Science Review (2021)

Abstract
A simple and scalable framework for longitudinal analysis of Twitter data is developed that combines latent topic models with computational geometric methods. Dimensionality reduction tools from computational geometry are applied to learn the intrinsic manifold on which the latent, temporal topics reside. Shortest path distances on the manifold are then used to link these topics together. The proposed framework permits visualization of the low-dimensional embedding, which provides a clear interpretation of the complex, high-dimensional trajectories that may exist among latent topics. Practical application of the proposed framework is demonstrated through its ability to capture and effectively visualize the natural progression of latent COVID-19 related topics learned from Twitter data. Interpretability of the trajectories is achieved by comparing them to real-world events. In addition, the framework permits study of spatial variation in Twitter behavior for learned topics. The analysis demonstrates that the proposed framework is able to capture the granular-level impact of COVID-19 on public discussions. We end by arguing that Twitter data, when analyzed within the proposed framework, can serve as a valuable supplementary data stream for COVID-related studies.
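The idea of linking temporal topics through shortest path distances can be illustrated with a minimal sketch. The abstract does not specify the distance measure, the graph construction, or the linking rule, so everything here is an assumption made for illustration: topics are treated as word distributions, compared with the Hellinger distance, connected in a k-nearest-neighbor graph, and each topic is linked to the topic in the next time step with the smallest graph shortest-path distance (computed with Dijkstra's algorithm). The function names and the toy data are hypothetical and are not taken from the paper.

```python
import heapq
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (assumed metric)."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph as {node: {neighbor: weight}}."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted((hellinger(points[i], points[j]), j) for j in range(n) if j != i)
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def shortest_paths(graph, source):
    """Dijkstra: shortest-path distance from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def link_topics(topics, times, k=2):
    """Link each topic to the nearest topic (by graph distance) in the next time step."""
    graph = knn_graph(topics, k)
    links = {}
    for i, t in enumerate(times):
        dist = shortest_paths(graph, i)
        candidates = [j for j, tj in enumerate(times) if tj == t + 1 and j in dist]
        if candidates:
            links[i] = min(candidates, key=lambda j: dist[j])
    return links


# Hypothetical toy data: four topics over a three-word vocabulary, two per time step.
topics = [
    [0.8, 0.1, 0.1],  # time 0, topic A
    [0.1, 0.1, 0.8],  # time 0, topic B
    [0.7, 0.2, 0.1],  # time 1, resembles A
    [0.1, 0.2, 0.7],  # time 1, resembles B
]
times = [0, 0, 1, 1]
links = link_topics(topics, times)
print(links)
```

On this toy input, each time-0 topic is linked to the time-1 topic that resembles it. Graph distances are used rather than direct pairwise distances because, on a curved manifold, paths through neighboring topics can better reflect intrinsic proximity than a straight-line comparison.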