Consistency Based Unsupervised Self-training For ASR Personalisation
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (2024)
Abstract
On-device Automatic Speech Recognition (ASR) models trained on speech data of
a large population might underperform for individuals unseen during training.
This is due to a domain shift between user data and the original training data,
which differ in the user's speaking characteristics and environmental acoustic
conditions. ASR personalisation is a solution that aims to exploit user data to
improve model robustness. The majority of ASR personalisation methods assume
labelled user data for supervision. Personalisation without any labelled data
is challenging due to limited data size and poor quality of recorded audio
samples. This work addresses unsupervised personalisation by developing a novel
consistency based training method via pseudo-labelling. Our method achieves a
relative Word Error Rate Reduction (WERR) of 17.3% […]
and 8.1% […]
current state-of-the-art methods.
Keywords
speech recognition, unsupervised, speaker adaptation, personalisation