Co-ML: Collaborative Machine Learning Model Building for Developing Dataset Design Practices
CoRR (2023)
Abstract
Machine learning (ML) models are fundamentally shaped by data, and building
inclusive ML systems requires careful consideration of how to design
representative datasets. Yet, few novice-oriented ML modeling tools are
designed to foster hands-on learning of dataset design practices, including how
to design for data diversity and inspect for data quality.
To this end, we outline a set of four data design practices (DDPs) for
designing inclusive ML models and share how we designed a tablet-based
application called Co-ML to foster learning of DDPs through a collaborative ML
model building experience. With Co-ML, beginners can build image classifiers
through a distributed experience where data is synchronized across multiple
devices, enabling multiple users to iteratively refine ML datasets in
discussion and coordination with their peers.
We deployed Co-ML in a 2-week-long educational AIML Summer Camp, where youth
ages 13-18 worked in groups to build custom ML-powered mobile applications. Our
analysis reveals how multi-user model building with Co-ML, in the context of
student-driven projects created during the summer camp, supported development
of DDPs including incorporating data diversity, evaluating model performance,
and inspecting for data quality. Additionally, we found that students' attempts
to improve model performance often prioritized learnability over class balance.
Through this work, we highlight how the combination of collaboration, model
testing interfaces, and student-driven projects can empower learners to
actively engage in exploring the role of data in ML systems.