Adaptive Hierarchical Pooling for Weakly-supervised Sound Event Detection

International Multimedia Conference (2022)

Abstract
In weakly-supervised sound event detection (WSED), the training labels indicate only whether each sound event is present or absent at the clip level; no frame-level annotations are available. Recent work formulates WSED under the multi-instance learning framework, in which the design of the temporal pooling function is a critical component. In this paper, we propose adaptive hierarchical pooling (HiPool) for WSED. HiPool combines the advantages of max pooling in audio tagging and of weighted average pooling in audio localization through a novel hierarchical structure, and it learns event-wise optimal pooling functions through continuous relaxation-based joint optimization. Extensive experiments on benchmark datasets show that HiPool outperforms current pooling methods and greatly improves WSED performance. HiPool is also general: it is ready to be plugged into any WSED model.
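To make the role of the temporal pooling function concrete, the sketch below aggregates frame-level probabilities for one event into a single clip-level probability, which is what the clip-level label can supervise. It is an illustration under stated assumptions, not the paper's HiPool definition: the function names, the example probabilities, and in particular `two_level_pool` (max pooling inside fixed-length segments, then a weighted average across segments) are hypothetical, and `segment_len` is fixed by hand here rather than learned per event.

```python
from typing import List, Sequence


def max_pool(frame_probs: Sequence[float]) -> float:
    """Clip probability = the single most confident frame."""
    return max(frame_probs)


def mean_pool(frame_probs: Sequence[float]) -> float:
    """Clip probability = unweighted average over all frames."""
    return sum(frame_probs) / len(frame_probs)


def linear_softmax_pool(frame_probs: Sequence[float]) -> float:
    """Weighted average in which each frame is weighted by its own probability."""
    total = sum(frame_probs)
    if total == 0.0:
        return 0.0
    return sum(p * p for p in frame_probs) / total


def two_level_pool(frame_probs: Sequence[float], segment_len: int) -> float:
    """Hypothetical two-level pooling (an assumption, not the paper's HiPool).

    Level 1: max pooling inside each segment of `segment_len` frames.
    Level 2: linear-softmax weighted average over the segment scores.
    """
    if segment_len < 1:
        raise ValueError("segment_len must be at least 1")
    segment_scores: List[float] = [
        max(frame_probs[i:i + segment_len])
        for i in range(0, len(frame_probs), segment_len)
    ]
    return linear_softmax_pool(segment_scores)


# Made-up frame-level probabilities for one event in one clip.
frame_probs = [0.1, 0.2, 0.9, 0.8, 0.1, 0.1]

clip_max = max_pool(frame_probs)
clip_mean = mean_pool(frame_probs)
clip_linear = linear_softmax_pool(frame_probs)
clip_two_level = two_level_pool(frame_probs, segment_len=2)
```

In this sketch, `segment_len=1` reduces `two_level_pool` to the weighted average and `segment_len=len(frame_probs)` reduces it to max pooling, so a single parameter moves between the two behaviours the abstract contrasts.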
Key words
sound event detection, adaptive hierarchical pooling, weakly-supervised