Parameter uncertainties for imperfect surrogate models in the low-noise regime
arXiv (2024)
Abstract
Bayesian regression determines model parameters by minimizing the expected
loss, an upper bound to the true generalization error. However, the loss
ignores misspecification, where models are imperfect. Parameter uncertainties
from Bayesian regression are thus significantly underestimated and vanish in
the large data limit. This is particularly problematic when building models of
low- noise, or near-deterministic, calculations, as the main source of
uncertainty is neglected. We analyze the generalization error of misspecified,
near-deterministic surrogate models, a regime of broad relevance in science and
engineering. We show posterior distributions must cover every training point to
avoid a divergent generalization error and design an ansatz that respects this
constraint, which for linear models incurs minimal overhead. This is
demonstrated on model problems before application to thousand-dimensional
datasets in atomistic machine learning. Our efficient misspecification-aware
scheme gives accurate prediction and bounding of test errors where existing
schemes fail, allowing this important source of uncertainty to be incorporated
in computational workflows.
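The core failure mode described above can be illustrated with a minimal sketch (not the paper's method): fit a standard conjugate Bayesian linear model to noiseless data generated by a quadratic function. The model is misspecified, so the posterior predictive uncertainty shrinks toward zero as the dataset grows, while the actual prediction error at a test point stays finite. All function names and parameter values here are illustrative assumptions.

```python
import numpy as np

def bayes_linear_fit(X, y, alpha=1e-6, beta=1e6):
    # Conjugate Bayesian linear regression: prior precision alpha,
    # assumed noise precision beta. beta is large because the data
    # are near-deterministic (low noise).
    A = alpha * np.eye(X.shape[1]) + beta * X.T @ X  # posterior precision
    cov = np.linalg.inv(A)                           # posterior covariance
    mean = beta * cov @ X.T @ y                      # posterior mean
    return mean, cov

def predictive_std(x, cov):
    # Epistemic part of the predictive standard deviation at input x.
    return float(np.sqrt(x @ cov @ x))

errors, stds = [], []
for n in (10, 100, 1000):
    t = np.linspace(-1.0, 1.0, n)
    X = np.stack([np.ones(n), t], axis=1)  # linear features only
    y = t**2                               # true function is quadratic: misspecified
    mean, cov = bayes_linear_fit(X, y)

    x_test = np.array([1.0, 0.5])          # test input t = 0.5
    pred = float(x_test @ mean)
    errors.append(abs(pred - 0.25))        # true value is 0.5**2 = 0.25
    stds.append(predictive_std(x_test, cov))

# The nominal Bayesian uncertainty vanishes as n grows, but the
# misspecification error at t = 0.5 converges to a finite value.
print("posterior stds:", stds)
print("true errors:  ", errors)
```

The printed posterior standard deviations decrease with `n` while the true errors plateau at a nonzero value, which is the uncertainty-underestimation problem the misspecification-aware scheme is designed to repair.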