HcLSH: A Novel Non-Linear Monotonic Activation Function for Deep Learning Methods

IEEE Access (2023)

Abstract
Activation functions are essential components of any neural network model; they play a crucial role in determining the network's expressive power through the non-linearity they introduce. The Rectified Linear Unit (ReLU) has been the most popular and default choice for most deep neural network models because of its simplicity and its ability to tackle the vanishing gradient problem that hinders backpropagation optimization. However, ReLU introduces other challenges that limit its performance: bias shift and dying neurons in the negative region. To address these problems, this paper presents a novel composite monotonic, zero-centered, semi-saturated activation function with partial gradient-based sparsity, called the Hyperbolic cosine Linearized SquasHing function (HcLSH). HcLSH has many desirable properties, such as accounting for the contribution of negative neuron values while maintaining a smooth output landscape that enhances gradient flow during training. Furthermore, the regularization effect resulting from the self-gating property of the positive region of HcLSH reduces the risk of model overfitting and encourages learning more robust, expressive representations. An extensive set of experiments and comparisons is conducted, covering four popular image classification datasets, seven deep network architectures, and ten state-of-the-art activation functions. HcLSH achieved the Top-1 and Top-3 testing accuracy in 20 and 25 of the 28 conducted experiments, respectively, surpassing the widely used ReLU, which achieved 2 and 5, and the reputable Mish, which achieved 0 and 5 Top-1 and Top-3 results, respectively. HcLSH attained improvements over ReLU ranging from 0.2% to 96.4% across different models and datasets. Statistical results demonstrate the significance of the performance gains achieved by the proposed HcLSH activation function over the competing activation functions on the various datasets and models with respect to testing loss. Furthermore, an ablation study verifies the proposed activation function's robustness, stability, and adaptability across different model parameters.
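The abstract describes HcLSH's qualitative properties but does not reproduce its formula, so the following is only a minimal PyTorch sketch of how a custom activation of this kind is typically dropped into a convolutional network in place of ReLU, as in the paper's image-classification experiments. The `HcLSH` module body below uses SiLU purely as a runnable stand-in and is not the paper's definition; the model layout and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HcLSH(nn.Module):
    """Placeholder module showing where a custom activation such as HcLSH
    would be defined; the actual HcLSH expression is given in the paper and
    is NOT reproduced here."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Stand-in body: a smooth, self-gated activation (SiLU) used only to
        # keep the sketch runnable. Substitute the HcLSH formula from the paper.
        return torch.nn.functional.silu(x)

def make_block(in_ch: int, out_ch: int, act: nn.Module) -> nn.Sequential:
    # Conv/BN/activation block with the activation injected, mirroring how an
    # existing architecture would swap ReLU for the activation under test.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        act,
    )

# Tiny CNN for 32x32 RGB inputs (e.g., CIFAR-10-sized images) with the custom
# activation used everywhere ReLU would normally appear.
model = nn.Sequential(
    make_block(3, 32, HcLSH()),
    make_block(32, 64, HcLSH()),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(64, 10),
)

x = torch.randn(4, 3, 32, 32)
print(model(x).shape)  # torch.Size([4, 10])
```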
Keywords
Activation function, convergence, deep learning, image classification accuracy, monotonicity, saturation