EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation
CoRR (2024)
Abstract
We release the EARS (Expressive Anechoic Recordings of Speech) dataset, a
high-quality speech dataset comprising 107 speakers from diverse backgrounds,
totaling 100 hours of clean, anechoic speech data. The dataset covers a
large range of different speaking styles, including emotional speech, different
reading styles, non-verbal sounds, and conversational freeform speech. We
benchmark various methods for speech enhancement and dereverberation on the
dataset and evaluate their performance through a set of instrumental metrics.
In addition, we conduct a listening test with 20 participants for the speech
enhancement task, where a generative method is preferred. We introduce a blind
test set that allows for automatic online evaluation of uploaded data. Dataset
download links and the automatic evaluation server can be found online.