Aphasia Detection for Cantonese-Speaking and Mandarin-Speaking Patients Using Pre-Trained Language Models
2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP) (2022)
Abstract
Automatic analysis of aphasic speech based on speech technology has been extensively investigated in recent years, but there have been few studies on Chinese languages. In this paper, we focus on automatic aphasia detection for Cantonese- and Mandarin-speaking patients using state-of-the-art pre-trained language models that support both traditional and simplified Chinese. Given speech transcriptions of subjects, pre-trained language models are used in two ways: 1) embeddings derived from a pre-trained language model, followed by a classifier; 2) a pre-trained language model fine-tuned for the aphasia detection task. Both approaches are shown to outperform baseline models that use acoustic features and static word embeddings. The best accuracy is obtained with fine-tuned BERT models, which achieve 0.98 and 0.94 for Cantonese-speaking and Mandarin-speaking subjects, respectively. We also investigate the feasibility of applying a cross-lingual pre-trained language model, fine-tuned on the aphasia detection task for Cantonese-speaking subjects, to Mandarin-speaking subjects with limited data. The promising results suggest that detection may become possible for low-resource pathological speech, for which a dedicated detection system is difficult to build.
Keywords
Aphasia detection, Pre-trained language model, Cantonese, Mandarin
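
The first approach named in the abstract (embeddings from a pre-trained language model, followed by a classifier) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: `embed` is a hypothetical stand-in that hashes characters into a small vector instead of calling BERT, the logistic-regression classifier is one possible choice (the abstract does not say which classifier was used), and the transcripts and labels are invented toy strings with no clinical meaning.

```python
import math

DIM = 16  # toy embedding size (assumption); BERT-base hidden states are 768-dimensional


def embed(transcript, dim=DIM):
    """Hypothetical stand-in for a pre-trained LM embedding.

    Each character is bucketed by its code point and the counts are averaged
    over the transcript, loosely mimicking mean pooling over token vectors.
    """
    vec = [0.0] * dim
    for ch in transcript:
        vec[ord(ch) % dim] += 1.0
    n = max(len(transcript), 1)
    return [v / n for v in vec]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    # Probability of the positive class for one embedding vector.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logreg(X, y, lr=0.5, epochs=200):
    # Plain stochastic gradient descent on the logistic loss.
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            g = predict_proba(w, b, xi) - yi
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b


# Invented toy transcripts and labels (assumption), used only to exercise the code.
transcripts = ["今天天气很好", "我们去公园散步", "那个那个那个", "这个这个这个"]
labels = [0, 0, 1, 1]

X = [embed(t) for t in transcripts]
w, b = train_logreg(X, labels)
print(predict_proba(w, b, embed("我们今天去公园")))
```

In the second approach, the embedding model and the classifier would instead be trained jointly, so that the language model's own weights are updated on the detection task rather than kept frozen as in this sketch.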