Detection of Vowel Errors in Children's Speech using Synthetic Phonetic Transcripts.

2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)(2023)

引用 0|浏览3
暂无评分
摘要
The analysis of phonological processes is crucial in evaluating speech development disorders in children, but encounters challenges due to limited children audio data. This work focuses on automatic vowel error detection using a two-stage pipeline. The first stage uses a fine-tuned cross-lingual phone recognizer (wav2vec 2.0) to extract phone sequences from audio. The second stage employs a language model (BERT) for classification from a phone sequence, entirely trained on synthetic transcripts, to counteract the very broad range of potential mistakes. We evaluate the system on nonword audio recordings recited by preschool children from a speech development test. The results show that the classifier trained on synthetic data performs well, but its efficacy relies on the quality of the phone recognizer. The best classifier achieves an 94.7% F1 score when evaluated against phonetic ground truths, whereas the F1 score is 76.2% when using automatically recognized phone sequences.
更多
查看译文
关键词
children’s speech,vowel errors,nonwords
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要