Visually Informed Multi-Pitch Analysis Of String Ensembles
2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017
Abstract
Multi-pitch analysis of polyphonic music requires estimating concurrent pitches (estimation) and organizing them into temporal streams according to their sound sources (streaming). This is challenging for approaches based on audio alone due to the polyphonic nature of the audio signals. Video of the performance, when available, can help alleviate some of these difficulties. In this paper, we propose to detect play/non-play (P/NP) activities from musical performance videos using optical flow analysis to support audio-based multi-pitch analysis. Specifically, the detected P/NP activity provides a more accurate estimate of the instantaneous polyphony (i.e., the number of pitches at a time instant), and also helps assign pitch estimates to only the active sound sources. As a first attempt toward audio-visual multi-pitch analysis of multi-instrument musical performances, we demonstrate the concept on 11 string ensembles. Experiments show a high overall P/NP detection accuracy of 85.3%, and a statistically significant improvement in both multi-pitch estimation and streaming accuracy, under paired t-tests at a significance level of 0.01 in most cases.
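The P/NP idea can be illustrated with a minimal sketch: treat per-frame motion energy as a stand-in for the optical-flow magnitude around each player, and threshold it into play/non-play labels. All names, the frame representation, and the threshold below are hypothetical; the paper itself uses optical flow features with an SVM classifier rather than a fixed threshold.

```python
# Hypothetical sketch: motion energy as a proxy for optical-flow magnitude,
# thresholded into play ("P") / non-play ("NP") labels per frame transition.
# The paper uses optical flow + an SVM; the threshold here is illustrative.

def motion_energy(prev_frame, frame):
    """Mean absolute pixel difference between two consecutive frames."""
    return sum(abs(a - b) for a, b in zip(prev_frame, frame)) / len(frame)

def classify_pnp(frames, threshold=5.0):
    """Label each frame transition 'P' when motion energy exceeds the
    threshold (player is moving, e.g. bowing), else 'NP'."""
    return ["P" if motion_energy(prev, cur) > threshold else "NP"
            for prev, cur in zip(frames, frames[1:])]

# Synthetic example: a nearly still player, then a large bowing motion.
still = [[10] * 8, [10] * 8, [11] * 8]
moving = [[30] * 8, [60] * 8]
print(classify_pnp(still + moving))  # → ['NP', 'NP', 'P', 'P']
```

In the paper's pipeline, these per-source P/NP labels would then cap the instantaneous polyphony and restrict pitch-to-source assignment to active players only.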
Key words
Multi-pitch estimation, streaming, audio-visual analysis, source separation, constrained clustering, SVM classifier