Unsupervised Methods for Audio Classification from Lecture Discussion Recordings

INTERSPEECH (2019)

Abstract
The time allocated to lecturing and to student discussion is an important indicator in classroom quality assessment. Automated classification of recording segments as lecture or discussion can therefore serve as an indicator of classroom activity in a flipped classroom setting. Lecture segments consist primarily of the lecturer's speech, while discussion segments include student speech, silence and noise. Multiple audio recorders simultaneously document all class activities, and the recordings are coarsely synchronized to a common start time. We note that the lecturer's speech tends to be common across recordings, whereas student discussions are captured only by the nearby device(s). We therefore split each recording into windows of 0.5 s to 5 s duration, advanced in 0.1 s steps, and compute the normalized similarity between a given window and temporally proximate windows in the other recordings. Based on a histogram of the similarity values, windows with higher similarity are categorized as lecture and those with lower similarity as discussion. To improve classification performance, high-energy lecture windows and windows with very high or very low similarity are used to train a supervised model, which then reclassifies the remaining windows. Experimental results show that binary classification accuracy improves from 96.84% to 97.37%.
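The first, unsupervised stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which features or similarity measure are used, so the cosine similarity on raw samples, the helper names `frame` and `cross_recording_similarity`, the 1 kHz synthetic signals, and the fixed 0.5 threshold (standing in for the histogram-based split) are all assumptions made for illustration.

```python
import numpy as np

def frame(signal, win, hop):
    """Slice a 1-D signal into overlapping windows of `win` samples, one every `hop` samples."""
    n = 1 + (len(signal) - win) // hop
    idx = np.arange(win)[None, :] + hop * np.arange(n)[:, None]
    return signal[idx]

def cross_recording_similarity(ref, other, win, hop, max_shift):
    """For each window of `ref`, return the highest cosine similarity to the
    windows of `other` that lie within +/- `max_shift` window positions."""
    a = frame(ref, win, hop)
    b = frame(other, win, hop)
    # Normalize each window to unit length so the dot product is a cosine similarity.
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-12)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-12)
    n = min(len(a), len(b))
    best = np.full(n, -1.0)
    for shift in range(-max_shift, max_shift + 1):
        j = np.arange(n) + shift
        valid = (j >= 0) & (j < len(b))
        sims = np.einsum("ij,ij->i", a[:n][valid], b[j[valid]])
        best[valid] = np.maximum(best[valid], sims)
    return best

# Synthetic example (assumed data): two recorders share the "lecture" part,
# then each captures its own independent "discussion" part.
rng = np.random.default_rng(0)
sr = 1000                       # hypothetical sample rate in Hz
lecture = rng.standard_normal(3 * sr)
rec_a = np.concatenate([lecture, rng.standard_normal(3 * sr)])
rec_b = np.concatenate([lecture, rng.standard_normal(3 * sr)])

win = int(0.5 * sr)             # 0.5 s window, the shortest duration in the abstract
hop = int(0.1 * sr)             # 0.1 s step between windows
sim = cross_recording_similarity(rec_a, rec_b, win, hop, max_shift=2)

# Fixed threshold as a stand-in for the histogram-based decision (assumption).
labels = np.where(sim >= 0.5, "lecture", "discussion")
```

On this synthetic input, windows that fall entirely within the shared segment score close to 1 and windows in the independent segments score near 0, which is the separation the histogram-based decision relies on. Real recordings would need a similarity measure that tolerates sub-window delays and channel differences between devices.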
Keywords
audio classification, unsupervised classification, flipped classroom