Wi-Fi Based Indoor Monitoring Enhanced by Multimodal Fusion

Chiori Hori, Pu Wang, Mahbub Rahman, Cristian Vaca-Rubio, Sameer Khurana, Anoop Cherian, Jonathan Le Roux

2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024)

Indoor monitoring systems are in high demand to protect vulnerable people, especially when they are alone at home or in nursing homes, hospitals, and similar settings. Although surveillance systems in public spaces use cameras and microphones to detect incidents, indoor monitoring in personal spaces must protect privacy. Such systems therefore need to understand scenes without relying on direct sensing information, e.g., from audio-visual sensors, and instead use indirect sensing information, which is difficult for humans to interpret and may be insufficient to understand ongoing events precisely. To mitigate this drawback, this paper proposes a new indoor monitoring approach that attempts to realize scene understanding using only indirect sensors: the learned inductive bias of a multimodal fusion model trained on both direct and indirect sensing information is transferred to a model that uses only indirect information during inference. We collected direct (audio-visual) and indirect (infrared and Wi-Fi) sensing information of indoor human actions in daily life and manually annotated it with event captions. We build models that generate event captions from various combinations of indirect and direct sensor data, and show that our transfer learning approach leads to significant improvements in caption quality when only indirect information is used at inference time.
indoor monitoring, multimodal scene understanding, audio-visual, Wi-Fi, infrared, student-teacher learning
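
The student-teacher idea named in the keywords can be illustrated with a generic knowledge-distillation loss. The sketch below is not taken from the paper: it assumes a teacher (the multimodal fusion model) and a student (the indirect-sensor-only model) that each emit logits over caption tokens, and it combines cross-entropy against the annotated token with a KL term that pulls the student's distribution toward the teacher's. The function names, the weight `alpha`, and the `temperature` parameter are all hypothetical choices for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, with optional temperature smoothing."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the annotated caption token."""
    return -math.log(probs[target_index])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def student_loss(student_logits, teacher_logits, target_index,
                 alpha=0.5, temperature=1.0):
    """Hypothetical distillation loss for one caption token.

    student_logits: output of the indirect-only model (assumed interface)
    teacher_logits: output of the multimodal fusion model (assumed interface)
    alpha:          weight on the teacher-matching term (assumed value)
    """
    # Supervised term: fit the manually annotated caption token.
    ce = cross_entropy(softmax(student_logits), target_index)
    # Transfer term: match the teacher's softened token distribution.
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    kd = kl_divergence(teacher_soft, student_soft)
    return (1 - alpha) * ce + alpha * kd


if __name__ == "__main__":
    # Toy three-token vocabulary; token 2 is the annotated one.
    print(student_loss([0.2, 0.1, 1.5], [0.1, 0.0, 3.0], target_index=2))
```

In this formulation only the student's logits are needed at inference time, which mirrors the setting described in the abstract where direct sensors are unavailable after training.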