Gradient free stochastic training of ANNs, with local approximation in partitions

N. P. Bakas, A. Langousis, M. A. Nicolaou, S. A. Chatzichristofis

Stochastic Environmental Research and Risk Assessment (2023)

Abstract

We present a numerical scheme for computing the weights of Artificial Neural Networks (ANNs), which stems from the Universal Approximation Theorem and avoids costly iterations. The proposed algorithm adheres to the underlying theory, is fast, and yields remarkably low errors when applied to regression and classification problems on complex data sets with $x \in \mathbb{R}^n$ (e.g. Griewank, Gomez-Levy, Shekel, and Polynomial functions) with added random noise (i.e. Uniform, Normal, Generalized Pareto, Log-Normal, and a mixture of Log-Normal, Exponential, and Frechet), as well as to MNIST (Modified National Institute of Standards and Technology), the database for handwritten digit recognition, with $7 \times 10^4$ images. The same mathematical formulation was found capable of approximating highly nonlinear functions in multiple dimensions with low test-set errors (e.g. $10^{-10}$) for the unknown functions and their higher-order partial derivatives, as well as of numerically solving Partial Differential Equations, such as those appearing in Physics, Engineering, and Environmental Sciences. The method is based on calculating the weights of each neuron in small neighbourhoods of the data. Accordingly, optimization of hyperparameters is not necessary, as the number of neurons stems directly from the dimensionality of the data, which further improves the algorithmic speed. Under this setting, overfitting is inherently avoided, and the results are interpretable and reproducible. The complexity of the proposed algorithm is of class P, with $\mathcal{O}(mNni_{cl} + Nmn^2 + Nn^3 + mN^2 + N^3)$ computing time with respect to the observations $m$, features $n$, and neurons $N$, in contrast to the NP-complete class of standard algorithms for ANN training. The performance of the method is high irrespective of the size of the data set, and the test-set errors are similar to or smaller than the training errors, indicating the generalization efficiency of the algorithm.
Supplementary computer code in Julia and Python is provided, which can be used to reproduce the validation examples and/or apply the algorithm to other data sets.
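
To get a feel for the stated complexity bound, the short Python sketch below evaluates each of its five terms for a given problem size and reports which one dominates. It is only an arithmetic illustration, not the authors' supplementary code. The function names `cost_terms` and `dominant_term` are hypothetical. The abstract does not define $i_{cl}$; it is treated here as an integer iteration count, which is an assumption. In the usage example, $m = 7 \times 10^4$ comes from the abstract and $n = 784$ follows from the 28 x 28 pixel MNIST images, while $N = 100$ and $i_{cl} = 10$ are placeholder values chosen purely for illustration.

```python
def cost_terms(m, n, N, i_cl):
    """Evaluate each term of O(m*N*n*i_cl + N*m*n^2 + N*n^3 + m*N^2 + N^3).

    m    : number of observations
    n    : number of features
    N    : number of neurons
    i_cl : iteration count (assumed meaning; not defined in the abstract)
    """
    return {
        "m*N*n*i_cl": m * N * n * i_cl,
        "N*m*n^2": N * m * n**2,
        "N*n^3": N * n**3,
        "m*N^2": m * N**2,
        "N^3": N**3,
    }


def dominant_term(m, n, N, i_cl):
    """Return the label of the largest term for the given problem size."""
    terms = cost_terms(m, n, N, i_cl)
    return max(terms, key=terms.get)


if __name__ == "__main__":
    # MNIST-scale example: m and n from the data set; N and i_cl are
    # hypothetical placeholders, not values reported by the authors.
    m, n, N, i_cl = 70_000, 784, 100, 10
    for label, value in cost_terms(m, n, N, i_cl).items():
        print(f"{label:>12}: {value:.3e}")
    print("dominant term:", dominant_term(m, n, N, i_cl))
```

Under these placeholder values the $Nmn^2$ term is the largest, so the cost grows linearly in the number of observations and quadratically in the number of features. Constant factors hidden by the $\mathcal{O}$ notation are ignored, so the numbers indicate relative scaling only.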

Keywords

Artificial neural networks, Learning algorithms, Classification, Regression analysis, Radial basis function networks, Partial differential equations
