How to Aggregate Acoustic Delta Features for Deep Speaker Embeddings

ICTC (2020)

Abstract
Speaker verification based on deep speaker embedding (DSE) networks has outperformed traditional i-vector systems. Since then, various approaches have been studied to improve performance further, and data augmentation is one of them. In this paper, we focus on augmentation with acoustic delta features and on methods for aggregating them in two DSE networks, X-vectors and MobileVoxNet. For the CNN-based MobileVoxNet, we redesign the architecture to aggregate delta features in a deeper layer with a squeeze-and-excitation (SE) module. Experimental results on the VoxCeleb1 test dataset show that the proposed methods improve performance compared with not using delta features. We also compare the number of computations and parameters of the models to analyze the efficiency of the proposed methods.
Keywords
Speaker verification, Deep learning, Data augmentation, Acoustic delta features, X-Vectors, MobileNet
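
For readers unfamiliar with acoustic delta features, the following is a minimal sketch of how they are commonly computed and combined with the static features. The abstract does not give the paper's exact formula or aggregation scheme, so everything here is an assumption: the standard regression-based delta formula, a window of 2 frames, edge padding, and stacking static, delta, and delta-delta features as three input channels (one plausible arrangement for a CNN front end, not necessarily the one used for MobileVoxNet). The function names `delta` and `stack_with_deltas` are hypothetical.

```python
import numpy as np


def delta(features: np.ndarray, window: int = 2) -> np.ndarray:
    """Regression-based delta over time.

    features: array of shape (num_frames, num_coeffs).
    window:   number of frames used on each side (assumed value, not from the paper).
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    num_frames = features.shape[0]
    # Repeat the first and last frames so every frame has a full context window.
    padded = np.pad(features, ((window, window), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, window + 1))
    out = np.zeros(features.shape, dtype=np.float64)
    for n in range(1, window + 1):
        ahead = padded[window + n : window + n + num_frames]
        behind = padded[window - n : window - n + num_frames]
        out += n * (ahead - behind)
    return out / denom


def stack_with_deltas(features: np.ndarray, window: int = 2) -> np.ndarray:
    """Stack static, delta, and delta-delta features as channels.

    Returns an array of shape (3, num_frames, num_coeffs).
    Channel stacking is an illustrative choice, not the paper's confirmed method.
    """
    d1 = delta(features, window)
    d2 = delta(d1, window)
    return np.stack([features.astype(np.float64), d1, d2], axis=0)
```

For a frame-level network such as X-vectors, the same three arrays could instead be concatenated along the coefficient axis with `np.concatenate([features, d1, d2], axis=1)`; which option works better is exactly the kind of question the paper investigates.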