Directional Convergence Near Small Initializations and Saddles in Two-Homogeneous Neural Networks
CoRR (2024)
Abstract
This paper examines gradient flow dynamics of two-homogeneous neural networks
for small initializations, where all weights are initialized near the origin.
For both square and logistic losses, it is shown that for sufficiently small
initializations, the gradient flow dynamics remain long enough in the
neighborhood of the origin for the weights of the neural network to
approximately converge in direction to the Karush-Kuhn-Tucker (KKT) points of a
neural correlation function that quantifies the correlation between the output
of the neural network and corresponding labels in the training data set. For
square loss, it has been observed that neural networks undergo saddle-to-saddle
dynamics when initialized close to the origin. Motivated by this, this paper
also shows a similar directional convergence among weights of small magnitude
in the neighborhood of certain saddle points.
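The directional convergence described above can be illustrated numerically. The sketch below (an assumption-laden toy, not the paper's exact setup) runs gradient descent with a small step size, as a discretization of gradient flow, on a two-homogeneous ReLU network f(x) = Σ_j v_j·relu(w_j·x) under square loss, starting from a small initialization; the choices of data, width, step size, and the planted teacher `w_star` are all hypothetical.

```python
import numpy as np

# Toy illustration (hypothetical setup): small-initialization gradient
# descent on a two-homogeneous ReLU network under square loss. The weight
# direction settles while the norm is still small, consistent with
# directional convergence near the origin.
rng = np.random.default_rng(0)
n, d, m = 20, 3, 5                      # samples, input dim, hidden width
X = rng.standard_normal((n, d))
w_star = np.array([1.0, 0.0, 0.0])      # planted teacher direction
y = np.maximum(X @ w_star, 0.0)         # labels from a single ReLU neuron

def loss_and_grads(W, v):
    H = np.maximum(X @ W.T, 0.0)        # (n, m) hidden activations
    r = H @ v - y                       # residuals of the square loss
    gv = H.T @ r / n
    gW = (((r[:, None] * (X @ W.T > 0)).T @ X) * v[:, None]) / n
    return 0.5 * np.mean(r**2), gW, gv

eps, lr, steps = 1e-3, 0.01, 8000       # small init, small step (~ flow)
W = eps * rng.standard_normal((m, d))
v = eps * rng.standard_normal(m)
norm0 = np.linalg.norm(np.concatenate([W.ravel(), v]))
loss0, _, _ = loss_and_grads(W, v)

dirs = []                               # normalized weight vector over time
for t in range(steps):
    _, gW, gv = loss_and_grads(W, v)
    W -= lr * gW
    v -= lr * gv
    theta = np.concatenate([W.ravel(), v])
    dirs.append(theta / np.linalg.norm(theta))

loss1, _, _ = loss_and_grads(W, v)
norm1 = np.linalg.norm(np.concatenate([W.ravel(), v]))
```

Tracking `dirs` shows the normalized weight vector stabilizing while the overall norm is still far below 1, after which the norm grows and the loss drops; this mirrors the claim that the weights approximately converge in direction (to a KKT point of the correlation function) before leaving the neighborhood of the origin.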