Regularization Effect of Random Node Fault/Noise on Gradient Descent Learning Algorithm

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Abstract
For decades, injecting fault/noise during gradient descent training has been used either to make a neural network (NN) tolerant to persistent fault/noise or to improve its generalization. In recent years, this technique has been re-advocated in deep learning as a means to avoid overfitting. Yet, the objective function of such fault/noise injection learning has been misinterpreted as the desired measure (i.e., the expected mean squared error (MSE) over the training samples) of the NN suffering the same fault/noise. The aims of this article are: 1) to clarify this misconception and 2) to investigate the actual regularization effect of adding node fault/noise when training by gradient descent. Based on previous works on adding fault/noise during training, we speculate on why the misconception arises. It is then shown that the learning objective of adding random node fault during gradient descent learning (GDL) for a multilayer perceptron (MLP) is identical to the desired measure of the MLP with the same fault. If additive (resp. multiplicative) node noise is added during GDL for an MLP, the learning objective is not identical to the desired measure of the MLP with such noise. For radial basis function (RBF) networks, the learning objective is shown to be identical to the corresponding desired measure under all three fault/noise conditions. Empirical evidence is presented to support these theoretical results and, hence, to clarify the misconception: the objective function of fault/noise injection learning might not be interpretable as the desired measure of the NN with the same fault/noise. Afterward, the regularization effect of adding node fault/noise during training is revealed for the case of RBF networks. Notably, it is shown that adding additive node noise or multiplicative node noise (MNN) while training an RBF network acts as a regularizer that reduces network complexity. Applying dropout regularization to an RBF network has the same effect as adding MNN during training.
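The abstract refers to training an RBF network by gradient descent while injecting multiplicative node noise (MNN) into the hidden-node outputs. The following is a minimal sketch of such a training loop, not the authors' code: the toy data, Gaussian centers, width, learning rate, and noise level are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data (hypothetical; only for illustration).
X = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
y = np.sin(X).ravel() + 0.05 * rng.standard_normal(X.shape[0])

# RBF hidden layer: fixed Gaussian centers and a shared width (assumed values).
centers = np.linspace(-3.0, 3.0, 20).reshape(-1, 1)
width = 0.5

def hidden(X):
    # phi_j(x) = exp(-||x - c_j||^2 / (2 * width^2))
    d2 = (X - centers.T) ** 2
    return np.exp(-d2 / (2.0 * width ** 2))

w = np.zeros(centers.shape[0])   # output weights trained by gradient descent
lr = 0.05                        # learning rate (assumed)
noise_std = 0.3                  # std of the multiplicative node noise (assumed)

for epoch in range(500):
    Phi = hidden(X)
    # Multiplicative node noise: each hidden-node output is scaled by an
    # independent factor (1 + noise) drawn afresh at every update step.
    Phi_noisy = Phi * (1.0 + noise_std * rng.standard_normal(Phi.shape))
    y_hat = Phi_noisy @ w
    err = y_hat - y
    grad = Phi_noisy.T @ err / X.shape[0]   # gradient of the per-step MSE
    w -= lr * grad

print("training MSE (noise-free forward pass):",
      np.mean((hidden(X) @ w - y) ** 2))
```

Dropout would correspond to replacing the Gaussian multiplicative factor with a Bernoulli mask on the hidden-node outputs, which is the connection the abstract draws between dropout and MNN for RBF networks.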
Keywords
Training, Artificial neural networks, Additives, Noise measurement, Multi-layer neural network, Linear programming, Radial basis function networks, Dropout, learning objective, node fault, node noise, regularization