Investigating lay evaluations of models

THINKING & REASONING (2022)

Abstract
Many important decisions depend on unknown states of the world. Society is increasingly relying on statistical predictive models to make decisions in these cases. While predictive models are useful, previous research has documented that (a) individual decision makers distrust models and (b) people's predictions are often worse than those of models. These findings indicate a lack of awareness of how to evaluate predictions generally. This includes concepts like the loss function used to aggregate errors or whether error is training error or generalisation error. To address this gap, we present three studies testing how lay people visually evaluate the predictive accuracy of models. We found that (a) participant judgements of prediction errors were more similar to absolute error than squared error (Study 1), (b) we did not detect a difference in participant reactions to training error versus generalisation error (Study 2), and (c) participants rated complex models as more accurate when comparing two models, but rated simple models as more accurate when shown single models in isolation (Study 3). When communicating about models, researchers should be aware that the public's visual evaluation of models may disagree with their method of measuring errors and that many may fail to recognise overfitting.
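The abstract contrasts two ways of aggregating prediction errors (absolute vs. squared) and two kinds of error (training vs. generalisation). The sketch below, which is not from the paper itself, illustrates those distinctions with NumPy: a simple and a deliberately overfit polynomial are fitted to noisy linear data, and their errors are computed both ways on the training sample and on a fresh sample. All names and parameter choices (noise level, polynomial degrees) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy data from a simple linear ground truth; a second, independent
# sample stands in for the "unseen" data used for generalisation error.
x = np.linspace(0.0, 1.0, 30)
y = 2.0 * x + rng.normal(0.0, 0.3, size=x.size)
y_new = 2.0 * x + rng.normal(0.0, 0.3, size=x.size)

def errors(degree):
    """Fit a polynomial of the given degree by least squares and return
    (training MAE, training MSE, generalisation MSE)."""
    coeffs = np.polyfit(x, y, degree)
    pred = np.polyval(coeffs, x)
    mae_train = np.mean(np.abs(y - pred))        # absolute-error aggregation
    mse_train = np.mean((y - pred) ** 2)         # squared-error aggregation
    mse_gen = np.mean((y_new - pred) ** 2)       # error on fresh data
    return mae_train, mse_train, mse_gen

simple = errors(1)     # simple model: a straight line
overfit = errors(15)   # flexible model that can chase the noise

# The flexible model can only lower (never raise) the training error,
# but that improvement need not carry over to the fresh sample.
print(f"simple : MAE={simple[0]:.3f} MSE={simple[1]:.3f} gen={simple[2]:.3f}")
print(f"overfit: MAE={overfit[0]:.3f} MSE={overfit[1]:.3f} gen={overfit[2]:.3f}")
```

Because the degree-15 family contains every straight line, its least-squares training MSE is at most that of the degree-1 fit; whether the same holds on the new sample is exactly the overfitting question the paper's participants were asked to judge visually.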
Keywords
Model evaluation, perception, naive statistics