CML: A Contrastive Meta Learning Method to Estimate Human Label Confidence Scores and Reduce Data Collection Cost

Bo Dong, Yiyi Wang, Hanbo Sun, Yunji Wang, Alireza Hashemi, Zheng Du

PROCEEDINGS OF THE 5TH WORKSHOP ON E-COMMERCE AND NLP (ECNLP 5) (2022)

Abstract
Deep neural network models are especially susceptible to noise in annotated labels. In the real world, annotated data typically contains noise caused by a variety of factors such as task difficulty, annotator experience, and annotator bias. Label quality is critical for label validation tasks; however, correcting for noise by collecting more data is often costly. In this paper, we propose a contrastive meta-learning framework (CML) to address the challenges introduced by noisy annotated data, specifically in the context of natural language processing. CML combines contrastive learning and meta-learning to improve the quality of text feature representations. Meta-learning is also used to generate confidence scores to assess label quality. We demonstrate that a model built on CML-filtered data outperforms a model built on clean data. Furthermore, we perform experiments on de-identified commercial voice assistant datasets and demonstrate that our model outperforms several SOTA approaches.
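
The abstract only outlines the two ingredients (a contrastive objective over text representations and confidence scores used to filter noisy labels), so the following is a minimal illustrative sketch, not the paper's actual CML implementation. The supervised contrastive loss form, the confidence threshold, and all function names here are assumptions chosen to show the general pattern.

# Hypothetical sketch only: a supervised contrastive loss over text embeddings
# plus a confidence-score filter. This is NOT the paper's CML method; the loss
# form, threshold, and names are illustrative assumptions.
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Pull same-label examples together, push different-label examples apart."""
    z = F.normalize(embeddings, dim=1)            # unit-norm embeddings
    sim = z @ z.t() / temperature                 # pairwise similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(self_mask, float('-inf'))    # drop self-similarity
    pos_mask = labels.unsqueeze(0) == labels.unsqueeze(1)
    pos_mask &= ~self_mask                        # positives share a label, excluding self
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    # average log-likelihood of positives per anchor (anchors without positives skipped)
    per_anchor = -log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / pos_counts
    return per_anchor[pos_mask.any(dim=1)].mean()

def filter_by_confidence(scores, examples, threshold=0.8):
    """Keep only examples whose estimated label-confidence score clears a threshold."""
    keep = scores >= threshold
    return [ex for ex, k in zip(examples, keep.tolist()) if k]

if __name__ == "__main__":
    emb = torch.randn(8, 16)             # stand-in for encoder outputs
    labels = torch.randint(0, 3, (8,))   # noisy annotated labels
    print("contrastive loss:", supervised_contrastive_loss(emb, labels).item())
    conf = torch.rand(8)                 # stand-in for meta-learned confidence scores
    print("kept examples:", filter_by_confidence(conf, list(range(8))))

In this sketch the confidence scores are random placeholders; in the paper they are produced by the meta-learning component, and the filtered subset is what the downstream model is trained on.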