A Comparison of Latent Semantic Analysis and Latent Dirichlet Allocation in Educational Measurement

Journal of Educational and Behavioral Statistics (2023)

Abstract
Topic models are mathematical and statistical models for analyzing textual data. Their objective is to recover information about the latent semantic space of a set of related texts, that is, the relationships between documents and words and how the words are used. Topic models are becoming more common in educational measurement research as a method for analyzing students' responses to constructed-response items. Two popular topic models are latent semantic analysis (LSA) and latent Dirichlet allocation (LDA). LSA uses linear algebra techniques, whereas LDA uses an assumed statistical model and generative process. In educational measurement, LSA is often used in algorithmic scoring of essays due to its high reliability and agreement with human raters, whereas LDA is often used as a supplemental analysis to gain additional information about students, such as their thinking and reasoning. This article reviews and compares the LSA and LDA topic models, introduces a methodology for comparing the semantic spaces obtained by the two models, and uses a simulation study to investigate their similarities.
Keywords
topic models, latent semantic analysis, latent Dirichlet allocation, constructed-response items, semantic spaces
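
The abstract notes that LDA rests on an assumed statistical model and generative process. As a rough illustration only, the standard textbook form of that process can be sketched in Python using the standard library. This is not the article's simulation design: the vocabulary, the hyperparameter values, and the function names `sample_dirichlet` and `generate_corpus` are all hypothetical choices made for this sketch.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(n_docs, doc_len, vocab, n_topics,
                    alpha=0.5, beta=0.5, seed=0):
    """Simulate documents under the usual LDA generative story.

    alpha and beta are symmetric Dirichlet hyperparameters; the default
    values here are illustrative assumptions, not values from the article.
    """
    rng = random.Random(seed)
    topic_ids = list(range(n_topics))

    # Each topic is a probability distribution over the vocabulary.
    topics = [sample_dirichlet([beta] * len(vocab), rng)
              for _ in range(n_topics)]

    docs, mixtures = [], []
    for _ in range(n_docs):
        # Each document has its own mixture over topics.
        theta = sample_dirichlet([alpha] * n_topics, rng)
        words = []
        for _ in range(doc_len):
            # Pick a topic for this word position, then a word from that topic.
            z = rng.choices(topic_ids, weights=theta)[0]
            w = rng.choices(vocab, weights=topics[z])[0]
            words.append(w)
        docs.append(words)
        mixtures.append(theta)
    return docs, mixtures, topics


# Hypothetical toy vocabulary, loosely styled after student responses.
vocab = ["force", "mass", "energy", "cell", "gene", "protein"]
docs, mixtures, topics = generate_corpus(
    n_docs=4, doc_len=8, vocab=vocab, n_topics=2)
for words, theta in zip(docs, mixtures):
    print([round(t, 2) for t in theta], " ".join(words))
```

Fitting LDA reverses this process: given only the documents, it estimates the topic-word distributions and per-document mixtures. LSA, by contrast, assumes no generative model and obtains its semantic space from a matrix decomposition of the document-word matrix.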