Teacher-Student Training For Text-Independent Speaker Recognition

2018 IEEE Workshop on Spoken Language Technology (SLT 2018), 2018

Abstract
This paper investigates text-independent speaker recognition using neural embedding extractors based on the time-delay neural network. Our primary focus is to explore the teacher-student (TS) training framework for knowledge distillation in a text-independent (TI) speaker recognition task. We report results on both proprietary and public benchmarks, obtaining competitive results with models that are 88-93% smaller. In particular, in clean testing conditions, we find that TS training on neural-based TI systems achieves the same or better performance than the i-vector-based counterparts. Neural embeddings are less prone to short-segment issues and offer better performance, particularly in the high-recall setting. They can also provide additional insights about speakers, such as gender or how difficult a given speaker is to recognize.
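The abstract does not state the loss used for teacher-student training, so the following is only a minimal sketch of one common form of knowledge distillation: the student is trained to match the teacher's temperature-softened posteriors over training speakers by minimizing a KL divergence. The function names, the temperature value, and the toy logits are all assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, with temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened speaker posteriors.

    Hypothetical helper: a generic distillation objective, not necessarily
    the objective used in the paper.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Toy logits over three training speakers (made-up numbers).
teacher = [4.0, 1.0, 0.5]
student_close = [3.8, 1.1, 0.4]  # student that roughly agrees with the teacher
student_far = [0.5, 1.0, 4.0]    # student that prefers a different speaker

loss_close = distillation_loss(teacher, student_close)
loss_far = distillation_loss(teacher, student_far)
print(loss_close < loss_far)
```

In this sketch a student whose posteriors track the teacher's incurs a lower loss than one that disagrees, which is the signal a smaller student network would be trained on.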
Keywords
Speaker Recognition, Knowledge Distillation, Teacher-Student training