DynamicLabels: Supporting Informed Construction of Machine Learning Label Sets with Crowd Feedback.

Jeongeon Park,Eun-Young Ko, Yeon Su Park, Jinyeong Yim,Juho Kim

International Conference on Intelligent User Interfaces(2024)

Cited 0|Views3
No score
Abstract
Label set construction—deciding on a group of distinct labels—is an essential stage in building a supervised machine learning (ML) application, as a badly designed label set negatively affects subsequent stages, such as training dataset construction, model training, and model deployment. Despite its significance, it is challenging for ML practitioners to come up with a well-defined label set, especially when no external references are available. Through our formative study (n=8), we observed that even with the help of external references or domain experts, ML practitioners still need to go through multiple iterations to gradually improve the label set. In this process, there exist challenges in collecting helpful feedback and utilizing it to make optimal refinement decisions. To support informed refinement, we present DynamicLabels, a system that aims to support a more informed label set-building process with crowd feedback. Crowd workers provide annotations and label suggestions to the ML practitioner’s label set, and the ML practitioner can review the feedback through multi-aspect analysis and refine the label set with crowd-made labels. Through a within-subjects study (n=16) using two datasets, we found that DynamicLabels enables better understanding and exploration of the collected feedback and supports a more structured and flexible refinement process. The crowd feedback helped ML practitioners explore diverse perspectives, spot current weaknesses, and shop from crowd-generated labels. Metrics and label suggestions in DynamicLabels helped in obtaining a high-level overview of the feedback, gaining assurance, and spotting surfacing conflicts and edge cases that could have been overlooked.
More
Translated text
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined