Fast Rate Generalization Error Bounds: Variations on a Theme

2022 IEEE Information Theory Workshop (ITW)

Abstract
A recent line of work, initiated by [1] and [2], has shown that the generalization error of a learning algorithm can be upper bounded by information measures. In most of the relevant works, the convergence rate of the expected generalization error is of the form $O(\sqrt{\lambda I/n})$, where $\lambda$ is an assumption-dependent coefficient and $I$ is some information-theoretic quantity, such as the mutual information between the data sample and the learned hypothesis. However, such a rate is typically considered "slow" compared to the "fast rate" of $O(1/n)$ attainable in many learning scenarios. In this work, we first show that the square root does not necessarily imply a slow rate, and that a fast-rate result can still be obtained from this bound by evaluating $\lambda$ under an appropriate assumption. Furthermore, we identify the key condition needed for a fast-rate generalization error, which we call the $(\eta, c)$-central condition. Under this condition, we give information-theoretic bounds on the generalization error and excess risk, with a convergence rate of $O(1/n)$ for specific learning algorithms such as empirical risk minimization. Finally, analytical examples are given to show the effectiveness of the bounds.
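For orientation, the following is a representative instance of the $O(\sqrt{\lambda I/n})$-type bound referred to above, written under the standard assumption that the loss $\ell(w,Z)$ is $\sigma$-subgaussian under the data distribution; the constants shown are the usual ones from the mutual-information bound literature and are not necessarily the exact form used in this paper:
$$
\bigl|\mathbb{E}\!\left[L_\mu(W) - L_S(W)\right]\bigr| \;\le\; \sqrt{\frac{2\sigma^{2}\, I(W;S)}{n}},
$$
where $S=(Z_1,\dots,Z_n)$ is the training sample, $W$ is the hypothesis output by the algorithm, $L_S$ is the empirical risk, and $L_\mu$ is the population risk; here $\lambda$ plays the role of $2\sigma^{2}$ and $I = I(W;S)$. As a point of comparison for the $(\eta,c)$-central condition, one standard formulation of the central condition in the fast-rate literature requires, for some $\eta>0$ and all hypotheses $h$,
$$
\mathbb{E}_{Z\sim\mu}\!\left[e^{-\eta\,(\ell(h,Z)-\ell(h^{*},Z))}\right] \;\le\; 1,
$$
where $h^{*}$ is a risk minimizer; the $(\eta, c)$-central condition introduced in this paper should be read as a variant of this type of requirement rather than as exactly this inequality.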