Differential Description Length for Hyperparameter Selection in Supervised Learning.

ISITA (2020)

Citations: 0 | Views: 10
Abstract
Minimum description length (MDL) is an established method for model selection. For supervised learning problems, however, cross-validation is more often used in practice, for three reasons: 1) MDL is difficult to apply directly to data; 2) MDL may make restrictive statistical assumptions that hurt performance; and 3) MDL does not directly aim to minimize generalization error. In this paper, we introduce a modification of MDL, which we call differential description length (DDL). DDL partitions the data so that the codelength it computes reflects the conditional probability of seeing 'new' data given 'old' data. This differential codelength is what allows DDL to estimate generalization error, as cross-validation does. DDL also improves on cross-validation because it lets the learning algorithm use the entire data set, without withholding subsets for validation and testing. Compared with MDL, DDL both finds models with smaller generalization error and is easier to compute. Experiments with linear regression and deep neural networks show that DDL also outperforms cross-validation.
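The abstract does not spell out the estimator, but the differential codelength it describes takes, in prequential form, the shape DDL = L(x_{n0+1..n} | x_{1..n0}) = L(x_{1..n}) - L(x_{1..n0}) = sum over t > n0 of -log p(x_t | x_{1..t-1}). The sketch below illustrates this idea for ridge-regression hyperparameter selection; the plug-in Gaussian code, the 50/50 old/new split, and the helpers `prequential_codelength` and `ddl_score` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def prequential_codelength(X, y, lam, start):
    """Plug-in prequential code: for each t >= start, fit ridge
    regression on the first t points, then pay -log p(y_t | x_t)
    under a Gaussian with the plug-in noise variance (in nats)."""
    n, d = X.shape
    total = 0.0
    for t in range(start, n):
        Xt, yt = X[:t], y[:t]
        w = np.linalg.solve(Xt.T @ Xt + lam * np.eye(d), Xt.T @ yt)
        resid = yt - Xt @ w
        sigma2 = max(resid @ resid / t, 1e-8)  # plug-in variance estimate
        err = y[t] - X[t] @ w
        total += 0.5 * (np.log(2 * np.pi * sigma2) + err ** 2 / sigma2)
    return total

def ddl_score(X, y, lam, frac=0.5):
    """DDL = L(new | old): codelength of the later points given the
    earlier ones, i.e. L(x_1..n) - L(x_1..n0) in prequential form.
    The 'old'/'new' split fraction is an assumed choice."""
    n0 = int(frac * len(y))
    return prequential_codelength(X, y, lam, start=n0)

# Hypothetical usage: pick the ridge penalty with the smallest DDL.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + 0.5 * rng.normal(size=200)
lams = [1e-3, 1e-2, 1e-1, 1.0, 10.0]
best = min(lams, key=lambda lam: ddl_score(X, y, lam))
print("selected lambda:", best)
```

Note that every sample is eventually used for fitting when the score is computed this way, which mirrors the abstract's claim that DDL, unlike cross-validation, needs no held-out validation or test subsets.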
Keywords
hyperparameter selection, learning, description length