JHU-HLTCOE System for the VoxSRC Speaker Recognition Challenge
2020 IEEE International Conference on Acoustics, Speech, and Signal Processing (2020)
Abstract
The VoxSRC speaker recognition challenge comprises data obtained from YouTube videos of celebrity interviews recorded in a wide range of environments. The challenge provides FIXED and OPEN training conditions to allow cross-system comparisons and to characterize the effect of additional training data on system performance. This paper describes our submission to this challenge, in which we explored x-vector extractor topologies, classification head alternatives, data augmentation, and angular margin penalties. Our final entry to the FIXED condition (which achieved 2nd place) is the score average of 4 diverse systems. We find that this averaged system outperforms a large single DNN with a similar number of parameters.
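As a rough illustration of the score averaging mentioned above, the following minimal sketch averages per-trial scores from several systems with equal weights. The function name `average_scores` and the numbers are hypothetical, and the sketch assumes the systems' scores are already on comparable scales; the abstract does not say whether any calibration or normalization was applied before averaging.

```python
from statistics import mean


def average_scores(system_scores):
    """Equal-weight average of per-trial scores across systems.

    system_scores: list of score lists, one list per system,
    all covering the same trials in the same order.
    """
    if not system_scores:
        raise ValueError("need at least one system")
    n_trials = len(system_scores[0])
    if any(len(s) != n_trials for s in system_scores):
        raise ValueError("all systems must score the same trials")
    # zip(*...) groups the scores of each trial across systems
    return [mean(trial) for trial in zip(*system_scores)]


# Made-up scores from 4 hypothetical systems on 3 trials
scores = [
    [2.0, -1.0, 0.5],
    [1.0, -2.0, 0.5],
    [3.0, -1.0, 1.5],
    [2.0, 0.0, -0.5],
]
fused = average_scores(scores)
print(fused)
```

Equal weighting is only one possible choice; weighted fusion would require a held-out set to fit the weights.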
Key words
X-vectors, speaker recognition, VoxSRC