ARASID - Artificial Reverberation-Adjusted Indoor Speaker Identification Dealing with Variable Distances.
EWSN(2019)
Abstract
Indoor speaker identification systems have been researched for a long time and are widely used in acoustic systems that monitor human interaction. Many works have focused on improving accuracy under real-world conditions, including noise and varying distances from the microphone. However, these works either require significant extra effort, such as measuring room types and dimensions or collecting many speakers' samples, or rely on expensive hardware such as microphone arrays and complex deployment settings. In this paper, we introduce a complete speaker identification solution that uses an artificial reverberation generator with different parameters to adjust the original close-distance speech samples, so that each speaker has several artificial voice samples. Samples recorded in different environments are not required, because these artificial samples closely approximate them. Two kinds of models, GMM-UBM and i-vector, are evaluated. A separate model is trained on each set of samples, and each test sample is scored against all models in parallel. A score-fusing approach with two thresholds, a minimum value and a minimum difference, is applied to the scores to produce the final result. Several standard acoustic pre-processing routines, including a voice activity detection algorithm and an overlapped speech remover, are also included to make the system fully deployable. Finally, to assess the improvement from the reverberation adjustment, we evaluate our system on two speech databases from the literature, one with 251 speakers and the other covering four kinds of emotions. We also perform an in-lab speaking experiment. The evaluation results show that our system identifies speakers within 6 meters with more than 90% accuracy when the emotion is neutral, and achieves a 10% improvement over no reverberation adjustment when speakers have non-neutral emotions.
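The two ideas the abstract relies on, reverberation adjustment and two-threshold score fusing, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the reverberation generator or the fusing rule, so everything here is an assumption made for illustration: the exponentially decaying noise impulse response, the RT60 parameter, taking the maximum score per speaker, reading "minimum difference" as the margin over the runner-up, and the names `add_reverb` and `fuse_scores`.

```python
import numpy as np

def synthetic_impulse_response(rt60_s, sample_rate=16000, seed=0):
    """Assumed reverb model: noise tail that decays by 60 dB after rt60_s seconds."""
    rng = np.random.default_rng(seed)
    n = int(rt60_s * sample_rate)
    t = np.arange(n) / sample_rate
    # Amplitude drops by a factor of 1000 (60 dB) at t = rt60_s.
    envelope = np.exp(-np.log(1000.0) * t / rt60_s)
    ir = rng.standard_normal(n) * envelope
    ir[0] = 1.0  # direct path
    return ir / np.max(np.abs(ir))

def add_reverb(speech, rt60_s, sample_rate=16000):
    """Convolve close-distance speech with the synthetic impulse response."""
    ir = synthetic_impulse_response(rt60_s, sample_rate)
    wet = np.convolve(speech, ir)[: len(speech)]
    return wet / (np.max(np.abs(wet)) + 1e-12)

def fuse_scores(scores, min_score, min_diff):
    """scores maps speaker -> list of scores, one per reverberation-specific model.

    Assumed rule: keep each speaker's best score, then accept the top speaker
    only if it clears min_score and beats the runner-up by at least min_diff.
    Returns the speaker name, or None when the decision is rejected.
    """
    fused = {spk: max(vals) for spk, vals in scores.items()}
    ranked = sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
    best_spk, best = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else float("-inf")
    if best < min_score or best - runner_up < min_diff:
        return None
    return best_spk
```

In this sketch, each speaker's close-distance samples would be passed through `add_reverb` with several RT60 values to build one training set per setting, and `fuse_scores` would combine the per-model scores of a test utterance into a single decision or a rejection.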