SOUND EVENT LOCALIZATION AND DETECTION WITH VARIOUS LOSS FUNCTIONS Technical Report

semanticscholar(2020)

引用 4|浏览0
暂无评分
摘要
This technical report presents our system submitted to DCASE 2020 task 3. The goal of DCASE Task 3 is to detect a sound event and its location when a polyphonic sound event moves dynamically. We focus on designing loss functions to overcome the characteristics of the sub-task and imbalanced dataset. Temporal masking loss is used to overcome imbalance from zero labels of the silence frame. Soft floss is used for overcoming imbalance instances between class labels. A periodic loss function is proposed for regression that infers the periodic label in the direction of arrival estimation. Also, we take a feature pyramid network based network to overcome the information leakage occurred by the pooling layer in the CRNN.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要