Self-Supervised Learning for Alignment of Objects and Sound

Xinzhu Liu, Xiaoyu Liu, Di Guo, Huaping Liu, Fuchun Sun, Haibo Min

ICRA (2020)

Abstract
Sound source separation has many useful applications in robotics, such as human-robot interaction and scene understanding. However, it remains a very challenging problem. In this paper, we utilize both the visual and audio information of videos to perform the sound source separation task. We propose a self-supervised learning framework that implements the object detection and sound separation modules simultaneously. This approach is designed to better find the alignment between the detected objects and the separated sound components. Our experiments, conducted on both synthetic and real datasets, validate this approach and demonstrate the effectiveness of the proposed model in the task of object and sound alignment.
Key words
human-robot interaction, scene understanding, sound source separation task, self-supervised learning framework, object detection, sound separation modules, sound components, visual information, audio information