The effect of activation functions on accuracy, convergence speed, and misclassification confidence in CNN text classification: a comprehensive exploration

The Journal of Supercomputing (2024)

Abstract
Convolutional neural networks (CNNs) have become a useful tool for a wide range of applications, including text classification. However, CNNs are not always sufficiently accurate for certain applications. The choice of activation function within a CNN architecture can affect the network's efficacy, yet there is limited research on which activation functions are best suited to CNN text classification. This study tested sixteen activation functions across three text classification datasets and six CNN structures to determine their effects on accuracy, iterations to convergence, and Positive Confidence Difference (PCD), a novel metric introduced to compare how activation functions affect a network's classification confidence. Tables compare the performance of the activation functions across the different CNN architectures and datasets. Top-performing activation functions across the tests included the symmetrical multi-state activation function, sigmoid, penalised hyperbolic tangent, and generalised swish. An activation function's PCD was the most consistent evaluation metric during assessment, implying a close relationship between activation functions and network confidence that has yet to be explored.
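The top performers named above (other than the symmetrical multi-state function, which is omitted here) have standard definitions in the literature; a minimal sketch using the common parameterisations, which may differ from the paper's exact variants:

```python
import math

def sigmoid(x):
    """Classic logistic sigmoid: squashes inputs into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def penalised_tanh(x, a=0.25):
    """Penalised hyperbolic tangent: the negative branch is scaled
    down by a factor a (0.25 is a common choice in the literature)."""
    t = math.tanh(x)
    return t if x > 0 else a * t

def swish(x, beta=1.0):
    """Generalised swish: x * sigmoid(beta * x).
    beta = 1 recovers the standard swish/SiLU."""
    return x * sigmoid(beta * x)
```

These scalar forms apply element-wise when used as CNN activation layers; the `a` and `beta` values here are illustrative defaults, not values reported by the study.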
Keywords
Activation functions, Convolutional neural networks, Text classification, Machine learning