What's documented in AI? Systematic Analysis of 32K AI Model Cards
CoRR (2024)
Abstract
The rapid proliferation of AI models has underscored the importance of
thorough documentation, as it enables users to understand, trust, and
effectively utilize these models in various applications. Although developers
are encouraged to produce model cards, it is unclear how much information, or
what kind of information, these cards contain. In this study, we conduct a comprehensive
analysis of the documentation of 32,111 AI models on Hugging Face, a leading platform
for distributing and deploying AI models. Our investigation sheds light on the
prevailing model card documentation practices. Most of the AI models with
substantial downloads provide model cards, though the cards have uneven
informativeness. We find that sections addressing environmental impact,
limitations, and evaluation exhibit the lowest filled-out rates, while the
training section is the most consistently filled-out. We analyze the content of
each section to characterize practitioners' priorities. Interestingly, there
are substantial discussions of data, sometimes with equal or even greater
emphasis than the model itself. To evaluate the impact of model cards, we
conducted an intervention study by adding detailed model cards to 42 popular
models that previously had no or sparse model cards. We find that adding model
cards is moderately correlated with an increase in weekly download rates. Our
study opens up a new perspective for analyzing community norms and practices
for model documentation through large-scale data science and linguistic
analysis.