The Application of Learnable STRF Kernels to the 2021 Fearless Steps Phase-03 SAD Challenge.

Interspeech (2021)

Abstract
We describe a deep-learning-based system developed for the Fearless Steps Phase-03 Speech Activity Detection (SAD) challenge. The system includes both learnable spectro-temporal receptive fields (STRFs) and unconstrained 2-dimensional convolutional kernels in the first layer. Experiments show that the inclusion of learnable STRFs in the first layer increases the system's robustness to additive noise. Additionally, we found that utilizing SpecAugment during training improves generalization on unseen data. By incorporating these enhancements and others, our system achieved the best score in the official SAD challenge.
Keywords
speech activity detection, spectro-temporal receptive field, Gabor filters
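
The abstract credits SpecAugment with improving generalization but does not give the masking settings. As a rough illustration only, the sketch below shows the masking part of SpecAugment (one frequency mask and one time mask) on a spectrogram stored as a list of frames. The function name `spec_augment`, the parameters `max_freq_width` and `max_time_width`, their default values, and the choice of zero as the mask value are all assumptions for this example, not details taken from the paper, and the time-warping step of SpecAugment is left out.

```python
import random


def spec_augment(spec, max_freq_width=8, max_time_width=20, rng=None):
    """Return a masked copy of `spec`, a list of frames (each a list of bins).

    Hypothetical helper: widths, defaults, and the zero mask value are
    illustrative assumptions, not the settings used by the authors.
    """
    rng = rng or random.Random()
    n_frames = len(spec)
    n_bins = len(spec[0]) if n_frames else 0
    out = [list(frame) for frame in spec]  # copy so the input is untouched

    # Frequency mask: zero a band of f consecutive bins in every frame.
    f = rng.randint(0, min(max_freq_width, n_bins))
    f0 = rng.randint(0, n_bins - f)
    for frame in out:
        for k in range(f0, f0 + f):
            frame[k] = 0.0

    # Time mask: zero t consecutive frames across all bins.
    t = rng.randint(0, min(max_time_width, n_frames))
    t0 = rng.randint(0, n_frames - t)
    for i in range(t0, t0 + t):
        out[i] = [0.0] * n_bins

    return out


# Usage: mask a dummy 100-frame, 40-bin spectrogram during training.
dummy = [[1.0] * 40 for _ in range(100)]
augmented = spec_augment(dummy, rng=random.Random(0))
```

Masks are redrawn on every call, so each training pass sees a differently corrupted version of the same utterance, which is the mechanism usually cited for the regularizing effect.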