Chrome Extension
WeChat Mini Program
Use on ChatGLM

Neural-Network-Based Direction-of-Arrival Estimation for Reverberant Speech - The Importance of Energetic, Temporal, and Spatial Information

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING(2024)

Cited 0|Views5
No score
Abstract
Direction-of-arrival (DOA) estimation is a fundamental task in audio signal processing that becomes difficult in real-world environments due to the presence of reverberation. To address this difficulty, Direct-Path Dominance (DPD) tests have been proposed as an effective approach for detecting time-frequency (TF) bins dominated by direct sound, which contain accurate DOA information. These have been found to be particularly efficient when working with spherical arrays. While methods based on neural networks (NNs) have been developed to estimate the DOA, they have limitations such as the need for a large training database, and often understanding of the system's operation is lacking. This work proposes two novel DPD-test methods based on a model-based deep learning approach that combines the original DPD-test model with a data-driven system. Thus, it is possible to preserve the robustness of the original DPD-test across acoustic environments, while using a data-driven approach to better extract useful information about the direct sound, thereby enhancing the original method's performance. In particular, the paper investigates how energetic, temporal and spatial information contribute to the identification of TF-bins dominated by the direct signal. The proposed methods are trained on simulated data of a single sound source in a room, and evaluated on simulated and real data. The results show that energetic and temporal information provide new information about direct sound, which has not been considered in previous works and can improve its performance.
More
Translated text
Key words
Direction-of-arrival estimation,Estimation,Training,Arrays,Time-frequency analysis,Reverberation,Deep learning,Speaker localization,direction-of-arrival (DOA),spherical arrays,machine learning,deep learning,neural network (NN),multilayer perceptron (MLP),long short-term memory (LSTM)
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined