Generating Representative Phrase Sets for Text Entry Experiments by GA-Based Text Corpora Sampling

Mathematics (2023)

Abstract
In the field of human-computer interaction (HCI), text entry methods can be evaluated through controlled user experiments or predictive modeling techniques. While the modeling approach requires a language model, the empirical approach necessitates representative text phrases for the experimental stimuli. In this context, finding a phrase set with the best language representativeness belongs to the class of optimization problems in which a solution is sought in a large search space. We propose a genetic algorithm (GA)-based method for extracting a target phrase set from the available text corpus, optimizing its language representativeness. Kullback-Leibler divergence is utilized to evaluate candidates, considering the digram probability distributions of both the source corpus and the target sample. The proposed method is highly customizable, outperforms typical random sampling, and exhibits language independence. The representative phrase sets generated by the proposed solution facilitate a more valid comparison of the results from different text entry studies. The open source implementation enables the easy customization of the GA-based sampling method, promotes its immediate utilization, and facilitates the reproducibility of this study. In addition, we provide heuristic guidelines for preparing the text entry experiments, which consider the experiment's intended design and the phrase set to be generated with the proposed solution.
Keywords
text entry, phrase sets, text corpus sampling, genetic algorithm, Kullback-Leibler divergence
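
The idea summarized in the abstract, scoring a candidate phrase set by the Kullback-Leibler divergence between the digram distribution of the corpus and that of the sample, and searching for a low-divergence set with a genetic algorithm, can be illustrated with a minimal sketch. This is not the authors' open source implementation. The function names, the character-level digrams, the divergence direction (corpus relative to sample), the `epsilon` fallback for digrams missing from the sample, and the GA operators (truncation selection, union-based crossover, single-swap mutation) are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def digram_distribution(phrases):
    """Relative frequencies of adjacent character pairs over all phrases."""
    counts = Counter()
    for phrase in phrases:
        counts.update(phrase[i:i + 2] for i in range(len(phrase) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {digram: n / total for digram, n in counts.items()}


def kl_divergence(corpus_dist, sample_dist, epsilon=1e-9):
    """D_KL(corpus || sample) in bits.

    Assumption: digrams absent from the sample get probability `epsilon`
    so the logarithm stays finite; the sample is not renormalized.
    """
    return sum(
        p * math.log2(p / sample_dist.get(digram, epsilon))
        for digram, p in corpus_dist.items()
    )


def sample_phrases_ga(corpus, set_size, generations=50, population_size=20,
                      mutation_rate=0.3, seed=0):
    """Hypothetical GA: each individual is a list of distinct corpus indices."""
    rng = random.Random(seed)
    corpus_dist = digram_distribution(corpus)

    def fitness(indices):
        sample = [corpus[i] for i in indices]
        return kl_divergence(corpus_dist, digram_distribution(sample))

    population = [rng.sample(range(len(corpus)), set_size)
                  for _ in range(population_size)]
    for _ in range(generations):
        # Truncation selection: keep the better half (lower divergence).
        population.sort(key=fitness)
        survivors = population[:population_size // 2]
        children = []
        while len(survivors) + len(children) < population_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            # Crossover: draw the child from the union of both parents.
            pool = list(set(parent_a) | set(parent_b))
            child = rng.sample(pool, set_size)
            # Mutation: swap one phrase for a phrase not yet in the child.
            if rng.random() < mutation_rate:
                unused = [i for i in range(len(corpus)) if i not in child]
                if unused:
                    child[rng.randrange(set_size)] = rng.choice(unused)
            children.append(child)
        population = survivors + children

    best = min(population, key=fitness)
    return [corpus[i] for i in best], fitness(best)
```

Calling `sample_phrases_ga(corpus, set_size=4)` on a list of phrases returns the selected phrases together with their divergence score; a lower score indicates a digram distribution closer to that of the corpus. Because the better half of each generation is carried over unchanged, the best score in this sketch never gets worse from one generation to the next.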