Distribution Mismatch Correction for Acoustic Scene Classification

Speech Communication; 15th ITG Conference (2023)

Abstract
While deep learning methods have shown immense benefits for Acoustic Scene Classification (ASC) tasks in terms of performance, they also introduce new challenges, as these methods are prone to large performance degradation on out-of-distribution data. To build robust ASC models that achieve reliable performance across multiple recording devices, the architecture has to be able to adapt quickly to changing input and activation distributions. We present ASCMobConvNet, a CNN architecture based on Mobile Inverted Bottleneck Convolutions. To better adapt to domain shifts and the resulting change in activation distributions, it uses sub-spectral normalization layers in combination with residual normalization instead of batch normalization layers. Furthermore, the model corrects non-parametric mismatches in the activation distributions through the integration of Wasserstein distribution correction layers. Using our proposed architecture, we achieve a test accuracy of 68.10% on the TAU Urban Acoustic Scenes 2020 Mobile development dataset. Using Wasserstein distribution correction layers, we can further improve the accuracy by 0.68%.
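The abstract names two mechanisms that are easier to picture with a small example: normalizing frequency sub-bands separately, and correcting a distribution mismatch without assuming a parametric form. The NumPy sketch below is an illustration under stated assumptions, not the authors' implementation. The function names `sub_spectral_norm` and `quantile_match` are hypothetical, the learnable affine parameters and the residual normalization branch are omitted, and one-dimensional quantile matching is used only as a simple stand-in for a Wasserstein-type correction; the layer described in the paper may work differently.

```python
import numpy as np


def sub_spectral_norm(x, num_groups, eps=1e-5):
    """Normalize each (channel, frequency sub-band) pair with its own batch statistics.

    x has shape (batch, channels, freq, time). Assumption: no learnable
    scale/shift, and statistics are taken from the current batch only.
    """
    n, c, f, t = x.shape
    if f % num_groups != 0:
        raise ValueError("freq dimension must be divisible by num_groups")
    # Fold the sub-bands into the channel axis so each band is normalized separately.
    xg = x.reshape(n, c * num_groups, f // num_groups, t)
    mean = xg.mean(axis=(0, 2, 3), keepdims=True)
    var = xg.var(axis=(0, 2, 3), keepdims=True)
    xg = (xg - mean) / np.sqrt(var + eps)
    return xg.reshape(n, c, f, t)


def quantile_match(source, reference):
    """Map 1-D source samples onto the empirical distribution of reference.

    In one dimension, the monotone rearrangement computed here is the
    optimal transport map between the two empirical distributions.
    Assumption: this is a generic stand-in, not the paper's correction layer.
    """
    source = np.asarray(source, dtype=float)
    reference = np.sort(np.asarray(reference, dtype=float))
    # Rank of every source sample, converted to a mid-point quantile level.
    ranks = np.argsort(np.argsort(source))
    levels = (ranks + 0.5) / source.size
    ref_levels = (np.arange(reference.size) + 0.5) / reference.size
    # Read off the reference value at the same quantile level.
    return np.interp(levels, ref_levels, reference)
```

In this sketch, activations from an unseen recording device could be passed through `quantile_match` against activations collected from the training devices, which removes differences in shape as well as in mean and variance while preserving the ordering of the samples.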