Quantitative Evaluation System for Online Meetings Based on Multimodal Microbehavior Analysis

SENSORS AND MATERIALS (2022)

Abstract
Maintaining a positive interaction is the key to a healthy and efficient meeting. Aiming to improve the quality of online meetings, we present an end-to-end neural-network-based system, named MeetingPipe, which is capable of quantitative microbehavior detection (smiling, nodding, and speaking) from recorded meeting videos. For smile detection, we build a neural network framework that consists of an 18-layer residual network for feature representation and a self-attention layer to explore the correlation between each receptive field. To perform nodding detection, we obtain head rotation data as the key nodding feature; we then use a gated recurrent unit followed by a squeeze-and-excitation mechanism to capture the temporal information of nodding patterns from head pitch angles. In addition, we utilize TalkNet, an active speaker detection model, which can effectively recognize active speakers from videos. Experiments demonstrate that, with K-fold cross-validation, the F1 scores of smile, nodding, and speaking detection are 97.34, 81.26, and 94.90%, respectively. Processing can be accelerated with multiple GPUs owing to the multithread design. The code is available at https://github.com/humanophilic/MeetingPipe.
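
The abstract describes two learned branches: ResNet-18 features with a self-attention layer for smile detection, and a GRU with a squeeze-and-excitation mechanism over head pitch angles for nodding detection. Below is a minimal PyTorch sketch of how such branches could be wired up; it is not the authors' released code, and all layer sizes, attention heads, and the reduction ratio are illustrative assumptions.

```python
# Illustrative sketch (not the MeetingPipe implementation) of the smile and
# nodding branches described in the abstract. Hyperparameters are assumptions.
import torch
import torch.nn as nn
from torchvision.models import resnet18


class SmileDetector(nn.Module):
    """ResNet-18 feature maps, self-attention over spatial receptive fields,
    then a binary smile / no-smile classifier."""
    def __init__(self, num_heads=4):
        super().__init__()
        backbone = resnet18(weights=None)
        # Keep the convolutional stages; drop the average pool and fc head.
        self.features = nn.Sequential(*list(backbone.children())[:-2])   # (B, 512, 7, 7)
        self.attn = nn.MultiheadAttention(embed_dim=512, num_heads=num_heads,
                                          batch_first=True)
        self.classifier = nn.Linear(512, 2)

    def forward(self, x):                       # x: (B, 3, 224, 224) face crops
        f = self.features(x)                    # (B, 512, 7, 7)
        tokens = f.flatten(2).transpose(1, 2)   # (B, 49, 512): one token per receptive field
        attended, _ = self.attn(tokens, tokens, tokens)
        return self.classifier(attended.mean(dim=1))   # (B, 2) smile logits


class NodDetector(nn.Module):
    """GRU over a sequence of head pitch angles, followed by a
    squeeze-and-excitation style channel reweighting and a classifier."""
    def __init__(self, hidden_dim=64, reduction=4):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden_dim, batch_first=True)
        self.se = nn.Sequential(                # squeeze-and-excitation gate
            nn.Linear(hidden_dim, hidden_dim // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim // reduction, hidden_dim),
            nn.Sigmoid(),
        )
        self.classifier = nn.Linear(hidden_dim, 2)

    def forward(self, pitch):                   # pitch: (B, T, 1) head pitch angles
        h, _ = self.gru(pitch)                  # (B, T, hidden_dim)
        squeezed = h.mean(dim=1)                # temporal average ("squeeze")
        excited = h * self.se(squeezed).unsqueeze(1)   # channel reweighting ("excite")
        return self.classifier(excited.mean(dim=1))    # (B, 2) nodding logits


if __name__ == "__main__":
    smile_logits = SmileDetector()(torch.randn(2, 3, 224, 224))
    nod_logits = NodDetector()(torch.randn(2, 30, 1))
    print(smile_logits.shape, nod_logits.shape)  # torch.Size([2, 2]) torch.Size([2, 2])
```

Active speaker detection is handled by the existing TalkNet model rather than a new branch, so it is not sketched here.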
Keywords
online meeting, neural network, smile detection, head pose estimation, active speaker detection