Automatic Mixing for Immersive Teleconferencing Systems

semanticscholar(2016)

Abstract
The aim of immersive teleconferencing is to convey a realistic sound field impression to a remote participant. To this end, the spatial distribution of talkers as well as room information must be captured by the near-end system and accurately reproduced at the far end. As illustrated in Figure 1, we consider a setup where high speech quality is obtained by means of several close microphones (spot microphones) and spatial information is captured with a small circular or spherical microphone array in the center of the acoustic scene. The proposed automatic mixing system robustly estimates the directions of multiple active talkers and mixes the close-microphone signals with the room information gathered by the central microphone array. The spot microphones need not be synchronized with the microphone array, and their positions are assumed to be unknown. Furthermore, we propose a novel automatic gain control method that preserves natural speech dynamics while equalizing speech level fluctuations caused by unintentional changes of the talker-microphone distance. To allow for maximal flexibility concerning the reproduction system at the far end (e.g. different loudspeaker setups or binaural reproduction for headphones), the sound field is encoded in higher-order Ambisonics. Listening experiments in our concluding evaluation indicate the optimal settings for recorded multi-talker scenarios using both headphone- and loudspeaker-based reproduction. With
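To make the mixing step concrete, the following is a minimal sketch of how a spot-microphone signal could be panned to an estimated talker direction and added to the array's Ambisonics channels. It is an illustration under stated assumptions, not the authors' implementation: it uses horizontal-only (2D) Ambisonics with unnormalized circular harmonics, and the function names `encode_2d_ambisonics` and `mix_with_array`, the `spot_gain` parameter, and the channel ordering are all hypothetical. The abstract does not specify the encoding convention, the direction estimator, or the gain control used in the paper.

```python
import math


def encode_2d_ambisonics(samples, azimuth_rad, order):
    """Encode a mono signal into horizontal-only Ambisonics channels.

    Assumed channel order: [1, cos(phi), sin(phi), cos(2*phi), sin(2*phi), ...]
    with unnormalized circular harmonics (an assumption, not from the paper).
    Returns a list of 2*order + 1 channels, each a list of samples.
    """
    gains = [1.0]  # order-0 (omnidirectional) channel
    for m in range(1, order + 1):
        gains.append(math.cos(m * azimuth_rad))
        gains.append(math.sin(m * azimuth_rad))
    return [[g * s for s in samples] for g in gains]


def mix_with_array(encoded_spot, array_channels, spot_gain=1.0):
    """Add the encoded spot-microphone signal to the array's channels.

    Both inputs must use the same channel order and length; spot_gain is a
    hypothetical scalar balancing direct speech against room information.
    """
    return [
        [spot_gain * s + r for s, r in zip(spot_ch, room_ch)]
        for spot_ch, room_ch in zip(encoded_spot, array_channels)
    ]


# Usage: a talker estimated at 90 degrees, encoded at order 2 (5 channels).
spot = encode_2d_ambisonics([1.0, 0.5], math.pi / 2, order=2)
room = [[0.1, 0.1] for _ in range(5)]  # placeholder array signal
mixed = mix_with_array(spot, room, spot_gain=0.5)
```

Because the direction enters only through the per-channel gains, the same encoded stream can later be decoded for different loudspeaker layouts or for binaural playback, which is the flexibility the abstract attributes to the Ambisonics representation.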