Unsupervised Methods for Audio Classification from Lecture Discussion Recordings

INTERSPEECH (2019)

Abstract
Time allocated to lecturing versus student discussion is an important indicator in classroom quality assessment. Automated classification of lecture and discussion segments in recordings can therefore serve as an indicator of classroom activity in a flipped-classroom setting. Lecture segments consist primarily of the lecturer's speech, while discussion segments contain student speech, silence, and noise. Multiple audio recorders simultaneously document all class activities, and the recordings are coarsely synchronized to a common start time. We note that the lecturer's speech tends to be common across recordings, whereas student discussions are captured only by the nearby device(s). We therefore window each recording with durations of 0.5 s to 5 s at a 0.1 s analysis rate, and compute the normalized similarity between a given window and temporally proximate windows in the other recordings. A histogram of these similarities separates higher-similarity windows (lecture) from lower-similarity windows (discussion). To improve classification performance, high-energy lecture windows and windows with very high or very low similarity are used to train a supervised model, which then re-classifies the remaining windows. Experimental results show that binary classification accuracy improves from 96.84% to 97.37%.
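The core unsupervised step can be illustrated with a minimal sketch: each window of a reference recording is compared against temporally proximate windows in the other recordings, and its best normalized similarity decides the label. This is an illustrative reconstruction, not the authors' implementation; the window/hop/search sizes, the cosine-style similarity, and the fixed threshold are all assumptions for demonstration (the paper instead derives the split from a histogram of similarities).

```python
import numpy as np

def normalized_similarity(a, b):
    """Cosine similarity between two zero-mean audio windows (assumed metric)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 0.0 if denom == 0.0 else float(np.dot(a, b) / denom)

def classify_windows(recordings, ref_idx=0, win=8000, hop=1600,
                     search=4000, threshold=0.5):
    """Label each window of recordings[ref_idx] as 'lecture' or 'discussion'.

    A window is 'lecture' if some temporally proximate window in another
    recording matches it closely (lecturer speech is common to all devices);
    otherwise it is 'discussion' (local student speech, silence, or noise).
    `threshold` stands in for the paper's histogram-based split.
    """
    ref = recordings[ref_idx]
    labels = []
    for start in range(0, len(ref) - win + 1, hop):
        window = ref[start:start + win]
        best = -1.0
        for j, other in enumerate(recordings):
            if j == ref_idx:
                continue
            # Coarse synchronization: search only near the same start time.
            lo = max(0, start - search)
            hi = min(len(other) - win, start + search)
            for s in range(lo, hi + 1, hop):
                best = max(best, normalized_similarity(window, other[s:s + win]))
        labels.append("lecture" if best >= threshold else "discussion")
    return labels
```

As a sanity check, a signal shared across two recordings is labeled lecture, while independent noise in each device is labeled discussion.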
Key words
audio classification, unsupervised classification, flipped classroom