Low-Complexity Acoustic Scene Classification Using Time Frequency Separable Convolution

Duc H. Phan, Douglas L. Jones

Electronics (2022)

Abstract
Replacing 2D convolutions with depth-wise separable time and frequency convolutions greatly reduces the number of parameters while maintaining nearly equivalent performance in acoustic scene classification. In our experiments, model sizes are reduced by a factor of 6 to 14 with similar performance. For 3-class audio classification, replacing the 2D convolutions in a CNN model yields roughly a 2% increase in accuracy. For 10-class audio classification with multiple recording devices, replacing the 2D convolutions in a ResNet reduces accuracy by only about 1.5%.
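To illustrate where the parameter savings come from, the sketch below counts the weights of a single layer. It is not the authors' implementation: the abstract does not specify the exact factorization, so the decomposition used here (a depth-wise k x 1 convolution along frequency, a depth-wise 1 x k convolution along time, then a 1 x 1 point-wise convolution) is an assumption, and the function names and channel widths are hypothetical.

```python
def conv2d_params(c_in, c_out, k):
    """Weights in a standard k x k 2D convolution (bias omitted)."""
    return c_in * c_out * k * k


def tf_separable_params(c_in, c_out, k):
    """Weights in an ASSUMED time-frequency separable replacement.

    Assumption (not taken from the paper): a depth-wise k x 1 convolution
    along frequency, a depth-wise 1 x k convolution along time, and a
    1 x 1 point-wise convolution that mixes channels (bias omitted).
    """
    depthwise_freq = c_in * k      # one k x 1 filter per input channel
    depthwise_time = c_in * k      # one 1 x k filter per input channel
    pointwise = c_in * c_out       # 1 x 1 channel mixing
    return depthwise_freq + depthwise_time + pointwise


# Hypothetical layer widths, chosen only for illustration.
for c_in, c_out in [(32, 64), (64, 128), (128, 256)]:
    full = conv2d_params(c_in, c_out, 3)
    sep = tf_separable_params(c_in, c_out, 3)
    print(f"{c_in:>3} -> {c_out:>3}: {full} vs {sep} weights, "
          f"ratio {full / sep:.2f}x")
```

Under this assumed factorization, the per-layer ratio approaches k * k for wide layers, since the point-wise term dominates. The reduction for a whole model depends on the architecture and on which layers are replaced, so this per-layer count does not reproduce the 6 to 14 times figure reported above.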
Keywords
low complexity audio network, acoustic scene classification, depth-wise separable convolutions, detection and classification of acoustic scenes and events