Robust Acoustic Event Recognition Using AVMD-PWVD Time-Frequency Image

Yanhua Zhang, Ke Zhang, Jingyu Wang, Yu Su

Applied Acoustics (2021)

Abstract
Environmental sound feature extraction and classification are important signal analysis tools in many applications, such as audio surveillance, multimedia retrieval, and auditory source identification. However, the non-stationarity and discontinuity of environmental signals make quantification and classification a formidable challenge. Researchers have therefore proposed time-frequency image representations to capture this non-stationarity, which has led to higher classification accuracy. In this paper, a time-frequency representation method is proposed for environmental sound signals. The representation is built in three stages. First, we propose an adaptive variational modal decomposition (AVMD), based on the central angular frequency difference, to decompose environmental sounds into a series of modes. Second, we use the pseudo Wigner-Ville distribution (PWVD) to accurately obtain the instantaneous frequency characteristics of the mode signals. Third, time-frequency images of the sound signals are obtained by combining the mode signals with the PWVD. The resulting time-frequency image is then fed into a convolutional neural network (CNN) for classification. The method is tested on 50 classes of the Real World Computing Partnership (RWCP) Sound Scene Database in mismatched conditions. Results show that our method is robust to noise and achieves the best average recognition accuracy among several state-of-the-art methods under clean and various noisy conditions. © 2021 Elsevier Ltd. All rights reserved.
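As a rough illustration of the second and third stages, the sketch below computes a textbook discrete PWVD with NumPy and sums the per-mode distributions into a single image. It is not the authors' implementation: the function names (`analytic_signal`, `pwvd`, `tf_image_from_modes`), the Hamming lag window, the window length, and the FFT size are all assumptions made for illustration, and summing the per-mode PWVDs is only one plausible reading of "combining the mode signals with PWVD". The AVMD step itself is not shown; the modes are assumed to be given.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT (zeroes the negative-frequency half)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)

def pwvd(x, fs, half_window=63, n_fft=256):
    """Textbook discrete pseudo Wigner-Ville distribution.

    Returns (tfr, freqs): tfr has shape (n_fft, len(x)) and freqs spans
    0 .. fs/2. half_window and n_fft are illustrative defaults
    (assumption), with 2 * half_window + 1 <= n_fft.
    """
    z = analytic_signal(np.asarray(x, dtype=float))
    n = len(z)
    window = np.hamming(2 * half_window + 1)  # lag window (assumed Hamming)
    tfr = np.zeros((n_fft, n))
    for t in range(n):
        # Largest lag that keeps both t + m and t - m inside the signal.
        max_lag = min(t, n - 1 - t, half_window)
        m = np.arange(-max_lag, max_lag + 1)
        kernel = np.zeros(n_fft, dtype=complex)
        # Windowed instantaneous autocorrelation; negative lags wrap around.
        kernel[m % n_fft] = window[m + half_window] * z[t + m] * np.conj(z[t - m])
        # The kernel is conjugate-symmetric in m, so its FFT is real.
        tfr[:, t] = np.fft.fft(kernel).real
    # The lag variable advances in steps of 2 samples, so bin k maps to
    # k * fs / (2 * n_fft) rather than k * fs / n_fft.
    freqs = np.arange(n_fft) * fs / (2.0 * n_fft)
    return tfr, freqs

def tf_image_from_modes(modes, fs, **kwargs):
    """Sum per-mode PWVDs into one image (assumed combination rule)."""
    total, freqs = None, None
    for mode in modes:
        tfr, freqs = pwvd(mode, fs, **kwargs)
        total = tfr if total is None else total + tfr
    return total, freqs
```

Computing the distribution per mode and then summing avoids the cross-terms that would appear between components if the PWVD were applied to the mixed signal directly, which is a common motivation for pairing a mode decomposition with a Wigner-Ville type distribution.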
Keywords
Time-frequency image, Acoustic event recognition, Pseudo Wigner-Ville distribution, Variational modal decomposition, Pseudo-color, Convolutional neural network