Reducing dimensionality of spectrograms using convolutional autoencoders

Journal of the Acoustical Society of America (2023)

Abstract
Under the “curse of dimensionality,” distance-based algorithms, such as k-means or Gaussian mixture model clustering, can lose meaning and interpretability in high-dimensional space. Acoustic data, specifically spectrograms, are subject to such limitations due to their high dimensionality: for example, a spectrogram with 100 time- and 100 frequency-bins contains 10^4 pixels, and its vectorized form constitutes a point in 10^4-dimensional space. In this talk, we look at four papers that used autoencoding convolutional neural networks to extract salient features of real data. The convolutional autoencoder consists of an encoder, which compresses spectrograms into a low-dimensional latent feature space, and a decoder, which seeks to reconstruct the original spectrogram from the latent feature space. The error between the original spectrogram and its reconstruction is used to train the network. Once trained, the salient features of the data are embedded in the latent space, and algorithms can be applied to the lower-dimensional latent representation. We demonstrate how lower-dimensional representations result in interpretable clustering of complex physical data, which can contribute to reducing errors in classification and clustering tasks and enable exploratory analysis of large data sets.
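The talk does not specify an architecture, so the following is a minimal sketch, assuming PyTorch, of the encoder/decoder scheme the abstract describes: an encoder compresses a 100 × 100 spectrogram (10^4 pixels) to a small latent vector, a decoder reconstructs the input, and the mean-squared reconstruction error drives training. The layer sizes and the latent dimension of 8 are illustrative choices, not the authors'.

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    """Illustrative convolutional autoencoder for 1x100x100 spectrograms."""
    def __init__(self, latent_dim=8):
        super().__init__()
        # Encoder: two strided convolutions, then project to the latent space.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1),   # -> 8 x 50 x 50
            nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1),  # -> 16 x 25 x 25
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(16 * 25 * 25, latent_dim),       # latent feature vector
        )
        # Decoder: mirror the encoder with transposed convolutions.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 16 * 25 * 25),
            nn.Unflatten(1, (16, 25, 25)),
            nn.ConvTranspose2d(16, 8, 3, stride=2, padding=1, output_padding=1),  # -> 8 x 50 x 50
            nn.ReLU(),
            nn.ConvTranspose2d(8, 1, 3, stride=2, padding=1, output_padding=1),   # -> 1 x 100 x 100
        )

    def forward(self, x):
        z = self.encoder(x)          # low-dimensional embedding
        return self.decoder(z), z    # reconstruction and latent vector

# One training step on reconstruction error, as described in the abstract.
model = ConvAutoencoder(latent_dim=8)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
batch = torch.rand(4, 1, 100, 100)   # stand-in for real spectrograms
recon, z = model(batch)
loss = nn.functional.mse_loss(recon, batch)
opt.zero_grad()
loss.backward()
opt.step()
```

After training, each spectrogram maps to an 8-dimensional latent vector `z`, so distance-based methods such as k-means can be run on these embeddings instead of the original 10^4-dimensional pixel vectors.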
Key words
spectrograms, dimensionality