Evaluating acoustic representations and normalization for rhoticity classification in children with speech sound disorders

JASA Express Letters (2024)

Abstract
The effects of different acoustic representations and normalizations were compared for classifiers predicting perception of children's rhotic versus derhotic /ɹ/. Formant and Mel frequency cepstral coefficient (MFCC) representations for 350 speakers were z-standardized, relative either to values in the same utterance or to age-and-sex data for typical /ɹ/. Statistical modeling indicated that age-and-sex normalization significantly increased classifier performance. Clinically interpretable formants performed similarly to MFCCs and were endorsed for deep neural network engineering, achieving a mean test-participant-specific F1-score of 0.81 after personalization and replication (σx = 0.10, median = 0.83, n = 48). Shapley additive explanations analysis indicated that the third formant most influenced fully rhotic predictions.
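
The two z-standardization schemes named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of the population standard deviation, and the numeric values in the usage note are all assumptions made for illustration, and the normative mean and standard deviation are placeholders rather than published norms.

```python
from statistics import mean, pstdev


def z_within_utterance(values):
    """Z-standardize values against the mean and SD of the same utterance.

    Assumption: population SD (pstdev) is used; the abstract does not say which.
    """
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(v - m) / s for v in values]


def z_against_norms(values, norm_mean, norm_sd):
    """Z-standardize values against an external reference distribution,
    e.g. age-and-sex data for typical productions.

    norm_mean and norm_sd are hypothetical inputs supplied by the caller.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return [(v - norm_mean) / norm_sd for v in values]
```

For example, with placeholder third-formant measurements in Hz, `z_against_norms([2000.0, 2500.0], 2000.0, 250.0)` expresses each frame as a distance from the assumed reference mean in reference standard deviations, whereas `z_within_utterance([2000.0, 2500.0])` expresses it only relative to the other frames of the same utterance.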