
Soft Dynamic Time Warping with Variable Step Weights.

Johannes Zeitler, Michael Krause, Meinard Müller

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Abstract
In computer vision and audio processing, soft dynamic time warping (SDTW) techniques have been used as a differentiable loss function to train deep neural networks (DNNs) on weakly aligned data. In existing SDTW algorithms, the horizontal, vertical, and diagonal alignment steps all have the same weight, i.e., they contribute equally to the alignment cost. This equal weighting scheme for all step sizes can lead to degenerate alignments, e.g., by aligning most predictions to a single target frame in the early stages of training. Problems with equal step weights are known from classical DTW and have been addressed by assigning different weights to different step sizes. In this paper, we extend SDTW to allow for variable step weights and provide efficient dynamic programming algorithms for the forward and backward passes. As an example, we demonstrate the potential of the method on the task of training a DNN for pitch class estimation from music recordings, using step weight parameters that reduce the influence of outliers in repetitions of the same target frame.
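To make the idea concrete, the sketch below shows a soft-DTW forward pass in which each of the three step types carries its own weight. This is not the paper's implementation: the function name sdtw_variable_weights, the parameter names w_h, w_v, w_d, and in particular the assumption that each weight multiplies the local cost of the cell reached by that step (the classical weighted-DTW convention) are illustrative choices based only on the abstract.

```python
import numpy as np

def softmin(values, gamma):
    # Smooth minimum via log-sum-exp: -gamma * log(sum(exp(-v / gamma))).
    # Approaches the hard minimum as gamma -> 0.
    v = np.asarray(values, dtype=float)
    m = v.min()
    return m - gamma * np.log(np.exp(-(v - m) / gamma).sum())

def sdtw_variable_weights(C, gamma=1.0, w_h=1.0, w_v=1.0, w_d=1.0):
    # Forward pass of soft-DTW on a cost matrix C of shape (N, M).
    # w_h, w_v, w_d weight the local cost when a cell is reached by a
    # horizontal, vertical, or diagonal step, respectively (assumed
    # parametrization; the paper's exact formulation may differ).
    N, M = C.shape
    D = np.full((N + 1, M + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, N + 1):
        for j in range(1, M + 1):
            c = C[i - 1, j - 1]
            D[i, j] = softmin([D[i, j - 1] + w_h * c,      # horizontal step
                               D[i - 1, j] + w_v * c,      # vertical step
                               D[i - 1, j - 1] + w_d * c], # diagonal step
                              gamma)
    return D[N, M]  # differentiable alignment cost

# Toy usage: Euclidean cost matrix between a prediction and a target sequence.
rng = np.random.default_rng(0)
pred, target = rng.random((8, 12)), rng.random((5, 12))
C = np.linalg.norm(pred[:, None, :] - target[None, :, :], axis=-1)
print(sdtw_variable_weights(C, gamma=0.1, w_h=2.0, w_v=2.0, w_d=1.0))
```

With w_h = w_v = w_d the recursion reduces to standard SDTW with equal step weights; choosing larger horizontal and vertical weights, as in the toy call above, penalizes paths that repeatedly align to the same target frame, which is the kind of degenerate behavior the abstract describes. A practical training setup would additionally need the backward pass (gradient) derived in the paper.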
Keywords
soft dynamic time warping, step weights, pitch class estimation, music processing, music information retrieval