A Comprehensive Rubric for Annotating Pathological Speech
arXiv (2024)
Abstract
Rubrics are commonly used to label voice corpora for speech quality
assessment, but they have seldom been applied to pathological speech.
In this study, we introduce a comprehensive rubric
based on various dimensions of speech quality, including phonetics, fluency,
and prosody. The objective is to establish standardized criteria for
identifying errors within the speech of individuals with Down syndrome, thereby
enabling the development of automated assessment systems. To this end, we
used the Prautocal corpus. To assess the quality of the annotations produced
with our rubric, we conducted two experiments, one on phonetics and one on
fluency. For the phonetic evaluation, we computed the Goodness of
Pronunciation (GoP) metric using automatic segmentation systems and
correlated the results with evaluations by a specialized speech therapist.
The correlation values were not notably high, but a positive trend was
observed. For the fluency assessment, we extracted audio features with deep
learning models such as wav2vec and used an SVM classifier, trained on a
corpus focused on identifying fluency issues, to categorize Prautocal corpus
samples. The outcomes highlight the complexity of evaluating such phenomena,
with results varying according to the specific type of disfluency detected.
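
The phonetic experiment summarized above (scoring segments with GoP and correlating the scores with a therapist's ratings) can be illustrated with a minimal sketch. The abstract does not state which GoP formulation or correlation coefficient was used, so the sketch assumes a common posterior-based variant (the mean log posterior of the expected phone over its aligned frames) and a Pearson correlation. The function names, the posterior values, and the rating scale are all hypothetical and are not taken from the paper or from the Prautocal corpus.

```python
import math


def gop_score(frame_posteriors, phone_index):
    """Assumed GoP variant: mean log posterior of the expected phone.

    frame_posteriors: list of per-frame probability lists for one aligned
    phone segment (hypothetical acoustic-model output).
    phone_index: index of the phone the speaker was expected to produce.
    """
    if not frame_posteriors:
        raise ValueError("segment has no frames")
    eps = 1e-10  # floor to avoid log(0) on zero posteriors
    total = sum(math.log(max(frame[phone_index], eps))
                for frame in frame_posteriors)
    return total / len(frame_posteriors)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up data: three segments, each with posteriors over three phones,
    # and the index of the phone expected in each segment.
    segments = [
        ([[0.8, 0.1, 0.1], [0.7, 0.2, 0.1]], 0),
        ([[0.3, 0.4, 0.3], [0.2, 0.5, 0.3]], 1),
        ([[0.6, 0.3, 0.1], [0.5, 0.4, 0.1]], 2),
    ]
    # Made-up therapist ratings on an assumed 1-5 scale, one per segment.
    therapist_ratings = [5, 3, 1]

    gop_values = [gop_score(frames, idx) for frames, idx in segments]
    print("GoP per segment:", gop_values)
    print("Correlation with ratings:", pearson(gop_values, therapist_ratings))
```

A GoP value closer to zero means the acoustic model assigned high probability to the expected phone. With real data, the posteriors would come from the forced alignment produced by the automatic segmentation system, and a rank correlation could be substituted if the ratings are ordinal.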