
A global analysis of global optimisation

CoRR (2022)

Abstract
We introduce a general theoretical framework, designed for the study of gradient optimisation of deep neural networks, that encompasses ubiquitous architectural choices including batch normalisation, weight normalisation and skip connections. We use our framework to conduct a global analysis of the curvature and regularity properties of neural network loss landscapes induced by normalisation layers and skip connections respectively. We then demonstrate the utility of this framework in two respects. First, we give the only proof of which we are presently aware that a class of deep neural networks can be trained using gradient descent to global optima even when such optima only exist at infinity, as is the case for the cross-entropy cost. Second, we verify a prediction made by the theory, that skip connections accelerate training, with ResNets on MNIST, CIFAR10, CIFAR100 and ImageNet.
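The abstract's two technical claims can be made concrete. First, cross-entropy optima existing only "at infinity" is a standard phenomenon; the following minimal illustration is ours, not taken from the paper. For binary classification on linearly separable data $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, the cross-entropy (logistic) loss of a linear model $x \mapsto w^\top x$ is

```latex
L(w) = \sum_{i=1}^{n} \log\!\bigl(1 + e^{-y_i w^{\top} x_i}\bigr).
```

If some direction $u$ separates the data, i.e. $y_i u^\top x_i > 0$ for all $i$, then $L(tu) \to 0$ as $t \to \infty$, while $L(w) > 0$ at every finite $w$. The infimum $0$ is therefore approached only along parameter paths whose norm diverges, which is what "global optima only exist at infinity" means here.

Second, the skip connections whose training speed-up the paper verifies on ResNets have the usual residual form. The sketch below is a hypothetical minimal PyTorch block, not the paper's experimental code; `ResidualBlock` and its layer choices are our own assumptions for illustration.

```python
# Minimal residual block sketch (hypothetical, not the paper's code).
# The skip connection adds the input back onto the transformed features,
# so the block parameterises a perturbation of the identity map.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)  # batch normalisation, as discussed in the abstract
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)  # skip connection: identity path around the convolutions
```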
Keywords
global optimisation, global analysis