Efficient Stuttering Event Detection Using Siamese Networks

Payal Mohapatra, Bashima Islam, Md Tamzeed Islam, Ruochen Jiao, Qi Zhu

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2023)

Abstract

Speech disfluency research is pivotal to accommodating atypical speakers in mainstream conversational technology. However, the lack of publicly available labeled and unlabeled datasets is a significant bottleneck to such research. While many works use pseudo dysfluency data with proxy labels and formulate a self-supervised task, we see merit in using real-world data. In this work, we consolidate the corpora of publicly available speech disfluency datasets, with and without labels, and propose DisfluentSiam, an efficient siamese network-based small-scale pretraining pipeline that uses task-specific data from multiple domains and has only 10M trainable parameters. We show that DisfluentSiam achieves an average 15% boost in performance across five types of dysfluency event detection compared to direct wav2vec 2.0 embeddings. In particular, with only 4-5 minutes of labeled data for fine-tuning, DisfluentSiam demonstrates the advantage of task-specific pretraining with up to 25% higher accuracy.

Keywords

Dysfluency, Self-supervised Learning
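
The abstract does not state the pretraining objective. As a rough illustration only, the sketch below assumes a SimSiam-style symmetric negative cosine similarity between two views of the same utterance, which is a common choice for siamese pretraining without negative pairs. The function names (`cosine`, `negative_cosine`, `siamese_loss`) and the plain-list vectors are hypothetical and are not taken from the paper.

```python
import math

def cosine(u, v, eps=1e-8):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)

def negative_cosine(preds, targets):
    """Mean negative cosine similarity over a batch.

    In a trainable framework the targets would typically be wrapped in a
    stop-gradient; that detail is an assumption and is not modelled here.
    """
    sims = [cosine(p, z) for p, z in zip(preds, targets)]
    return -sum(sims) / len(sims)

def siamese_loss(p1, z1, p2, z2):
    """Symmetric loss: each branch's prediction is matched to the other
    branch's projection (p1 vs z2, p2 vs z1)."""
    return 0.5 * negative_cosine(p1, z2) + 0.5 * negative_cosine(p2, z1)

# Toy batch: two "views" with identical embeddings give the minimum loss.
view = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0]]
loss_same = siamese_loss(view, view, view, view)

# Orthogonal embeddings give a loss of zero.
a = [[1.0, 0.0]]
b = [[0.0, 1.0]]
loss_orth = siamese_loss(a, a, b, b)
```

The loss is bounded between -1 (views agree perfectly) and 1 (views point in opposite directions), so lower values mean the two branches produce more similar embeddings.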
