
A global analysis of global optimisation

CoRR (2022)

Abstract
We introduce a general theoretical framework, designed for the study of gradient optimisation of deep neural networks, that encompasses ubiquitous architectural choices including batch normalisation, weight normalisation and skip connections. We use our framework to conduct a global analysis of the curvature and regularity properties of neural network loss landscapes induced by normalisation layers and skip connections respectively. We then demonstrate the utility of this framework in two respects. First, we give the only proof of which we are presently aware that a class of deep neural networks can be trained using gradient descent to global optima even when such optima only exist at infinity, as is the case for the cross-entropy cost. Second, we verify a prediction made by the theory, that skip connections accelerate training, with ResNets on MNIST, CIFAR10, CIFAR100 and ImageNet.
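The abstract's two technical claims can be made concrete. First, cross-entropy optima existing only "at infinity" is a standard phenomenon; the following minimal illustration is ours, not taken from the paper. For binary classification on linearly separable data $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, the cross-entropy (logistic) loss of a linear model $x \mapsto w^\top x$ is

```latex
L(w) = \sum_{i=1}^{n} \log\!\bigl(1 + e^{-y_i w^{\top} x_i}\bigr).
```

If some direction $u$ separates the data, i.e. $y_i u^\top x_i > 0$ for all $i$, then $L(tu) \to 0$ as $t \to \infty$, while $L(w) > 0$ at every finite $w$. The infimum $0$ is therefore approached only along parameter paths whose norm diverges, which is what "global optima only exist at infinity" means here.

Second, the skip connections whose training speed-up the paper verifies on ResNets have the usual residual form. The sketch below is a hypothetical minimal PyTorch block, not the paper's experimental code; `ResidualBlock` and its layer choices are our own assumptions for illustration.

```python
# Minimal residual block sketch (hypothetical, not the paper's code).
# The skip connection adds the input back onto the transformed features,
# so the block parameterises a perturbation of the identity map.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)  # batch normalisation, as discussed in the abstract
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)  # skip connection: identity path around the convolutions
```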
Keywords
global optimisation, global analysis