Deep Mutual Attention Network for Acoustic Scene Classification

Wei Xie,Qianhua He,Zitong Yu,Yanxiong Li

Digital signal processing（2022）

Cited 4|Views4

No score

Abstract

Fusion strategies that utilize time-frequency features have achieved superior performance in acoustic scene classification tasks. However, the existing fusion schemes are mainly frameworks that involve different modules for feature learning, fusion, and modeling. These frameworks are prone to introduce artificial interference and thus make it challenging to obtain the system's best performance. In addition, the lack of adequate information interaction between different features in the existing fusion schemes prevents the learned features from achieving the optimal discriminative ability. To tackle these problems, we design a deep mutual attention network based on the principle of receptive field regularization and the mutual attention mechanism. The proposed network can realize the joint learning and complementary enhancement of multiple time-frequency features end-to-end, which improves features' learning efficiency and discriminative ability. Experimental results on six publicly available datasets show that the proposed network outperforms almost all state-of-the-art systems regarding classification accuracy.

Translated text

Key words

Acoustic scene classification,Attention mechanism,Deep mutual attention network,Time-frequency feature,Complementary enhancement

AI Read Science

Must-Reading Tree

Example

Generate MRT to find the research sequence of this paper

Chat Paper

Summary is being generated by the instructions you defined