Zero-Shot Transfer for Wildlife Bioacoustics Detection

Research Square (2023)

Abstract

Automatically detecting sound events with Artificial Intelligence (AI) has become increasingly popular in bioacoustics, particularly for wildlife monitoring and conservation. Conventional methods predominantly employ supervised learning techniques that depend on substantial amounts of manually annotated bioacoustics data. However, manual annotation in bioacoustics is highly resource-intensive in both human labor and cost, and it requires considerable domain expertise. This in turn undermines the validity of crowdsourced annotation through platforms such as Amazon Mechanical Turk. Additionally, the supervised learning framework restricts the application scope to predefined categories within a closed setting. To address these challenges, we present a novel approach that leverages a multi-modal contrastive learning technique called Contrastive Language-Audio Pretraining (CLAP). CLAP allows classes to be defined flexibly at inference time through descriptive text prompts and can perform zero-shot transfer on previously unseen datasets. In this study, we demonstrate that, without specific fine-tuning or additional training, an out-of-the-box CLAP model can generalize effectively across 9 bioacoustics benchmarks covering a wide variety of sounds that are unfamiliar to the model. We show that CLAP achieves recognition performance comparable, if not superior, to that of supervised learning baselines fine-tuned on the training data of these benchmarks. Our experiments also indicate that CLAP has the potential to perform tasks previously unachievable with supervised bioacoustics approaches, such as foreground/background sound separation and the discovery of unknown animals. Consequently, CLAP offers a promising foundational alternative to traditional supervised learning methods for bioacoustics tasks, enabling more versatile applications within the field.
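The abstract describes defining classes at inference time through descriptive text prompts. The sketch below illustrates only the scoring step that such zero-shot classification typically relies on: comparing one audio embedding against one embedding per prompt by cosine similarity and normalizing with a softmax. It is a minimal illustration under stated assumptions, not the authors' implementation. The prompts, the toy 3-dimensional vectors, the `temperature` value, and the function names are all hypothetical; in practice the embeddings would come from a pretrained CLAP audio encoder and text encoder.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def zero_shot_classify(audio_emb, prompt_embs, temperature=0.1):
    """Score one audio embedding against every text-prompt embedding.

    Returns the best-matching prompt and a softmax distribution over prompts.
    The temperature is an illustrative assumption, not a value from the paper.
    """
    sims = {prompt: cosine(audio_emb, emb) for prompt, emb in prompt_embs.items()}
    # Numerically stable softmax over temperature-scaled similarities.
    top = max(sims.values())
    exps = {p: math.exp((s - top) / temperature) for p, s in sims.items()}
    total = sum(exps.values())
    probs = {p: e / total for p, e in exps.items()}
    best = max(probs, key=probs.get)
    return best, probs

# Toy vectors standing in for real encoder outputs (assumption).
prompt_embs = {
    "the sound of a bird singing": [0.9, 0.1, 0.0],
    "the sound of a frog croaking": [0.1, 0.9, 0.1],
    "the sound of wind and rain": [0.0, 0.2, 0.9],
}
audio_emb = [0.2, 0.8, 0.1]

best, probs = zero_shot_classify(audio_emb, prompt_embs)
print(best)
```

Because the classes exist only as entries in `prompt_embs`, adding or renaming a category means editing the prompt list rather than retraining a classifier, which is the flexibility the abstract attributes to prompt-based inference.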
Keywords
wildlife bioacoustics detection, zero-shot