Cross Modal Video Representations for Weakly Supervised Active Speaker Localization
IEEE Transactions on Multimedia (2023)
Key words
Cross-modal learning, weakly supervised learning, multiple instance learning, active speaker localization.