Multi-Class Ground Truth Inference in Crowdsourcing with Clustering
IEEE Transactions on Knowledge and Data Engineering (2016)
Abstract
Because crowdsourced labelers are often of low quality, the integrated label of each example is usually inferred from multiple noisy labels provided by different labelers. This paper proposes a novel algorithm, Ground Truth Inference using Clustering (GTIC), to improve the quality of integrated labels for multi-class labeling. For a K-class labeling task, GTIC uses the multiple noisy label set of each example to generate features. It then uses the K-Means algorithm to cluster all examples into K groups, each of which is mapped to a specific class. Examples in the same cluster are assigned the corresponding class label. We compare GTIC with four existing multi-class ground truth inference algorithms, majority voting (MV), Dawid & Skene's (DS), ZenCrowd (ZC), and Spectral DS (SDS), on one synthetic and eight real-world datasets. Experimental results show that GTIC significantly outperforms the others in terms of both accuracy and M-AUC. In addition, GTIC runs about twenty times faster than the more complex EM-based inference algorithms.
Keywords
inference mechanisms, pattern clustering, ubiquitous computing, Dawid-Skene algorithm, GTIC algorithm, MV algorithm, SDS algorithm, ZC algorithm, ZenCrowd algorithm, crowdsourcing, ground truth inference using clustering algorithm, k-means algorithm, majority voting algorithm, multiclass ground truth inference algorithms, multiclass labeling, spectral DS algorithm, clustering, EM algorithm, ground truth inference, multi-class labeling
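
The pipeline outlined in the abstract (features from noisy label sets, K-Means into K groups, one class per cluster) can be illustrated with a minimal sketch. The abstract does not say how features are built or how clusters are mapped to classes, so everything specific below is an assumption rather than the paper's method: features are per-class vote fractions, centroids start at the one-hot vectors, and each cluster is greedily mapped to the class with the largest centroid component. All function names are hypothetical, and every example is assumed to have at least one label.

```python
from collections import Counter


def vote_fractions(labels, num_classes):
    """Assumed feature: fraction of an example's labels that fall in each class."""
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in range(num_classes)]


def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    dists = [sum((p - q) ** 2 for p, q in zip(point, c)) for c in centroids]
    return dists.index(min(dists))


def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd iterations starting from the given centroids."""
    centroids = [list(c) for c in centroids]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [nearest(p, centroids) for p in points]
        if new_assignment == assignment:
            break  # converged: no example changed cluster
        assignment = new_assignment
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


def map_clusters_to_classes(centroids):
    """Assumed rule: greedy one-to-one mapping, strongest (cluster, class) pair first."""
    pairs = sorted(
        ((value, j, c) for j, cen in enumerate(centroids) for c, value in enumerate(cen)),
        reverse=True,
    )
    mapping, used = {}, set()
    for _, j, c in pairs:
        if j not in mapping and c not in used:
            mapping[j] = c
            used.add(c)
    return mapping


def infer_labels(label_sets, num_classes):
    """Cluster examples by their vote fractions and return one class per example."""
    points = [vote_fractions(ls, num_classes) for ls in label_sets]
    # Assumed initialization: one centroid at each one-hot class vector.
    init = [[1.0 if i == c else 0.0 for i in range(num_classes)]
            for c in range(num_classes)]
    assignment, centroids = kmeans(points, init)
    mapping = map_clusters_to_classes(centroids)
    return [mapping[a] for a in assignment]


# Toy data: six examples, three labelers each, classes 0..2.
noisy = [[0, 0, 1], [0, 0, 0], [1, 1, 2], [1, 0, 1], [2, 2, 2], [2, 1, 2]]
print(infer_labels(noisy, 3))
```

On this toy input the clusters coincide with what majority voting would return; the sketch only shows the shape of the pipeline and does not reproduce the accuracy or running-time results reported in the abstract.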