An i-vector based approach for audio scene detection

Semantic Scholar (2013)

Abstract
The IEEE-ASSP Scene Classification challenge on user-generated content (UGC) aims to assign an audio recording to a specific scene such as busystreet, office or supermarket. The difficulty of scene content analysis on UGC lies in the data's lack of structure and its acoustic variability. The i-vector system is state-of-the-art in Speaker Verification and Scene Detection, and outperforms conventional Gaussian Mixture Model (GMM)-based approaches. The system compensates for undesired acoustic variability and extracts information from the acoustic environment, making it a meaningful choice for detection on UGC. This paper reports our results in the challenge using a hand-tuned i-vector system and MFCC features. Compared to the MFCC+GMM baseline system, our system increased the classification accuracy by 26.4% to about 65.8%. We discuss our approach and highlight the parameters in our system that were shown to significantly improve our classification accuracy.