Predicting neural network confidence using high-level feature distance.

Inf. Softw. Technol. (2023)

Abstract
Context: Neural networks have achieved state-of-the-art performance in many fields. However, they are often reported to produce overconfident predictions, especially for misclassifications. Confidence prediction is therefore vitally important in practical applications, so that models can report reasonable confidence.

Objective: The objective of this paper is to address the problem of overconfidence in neural networks. This is achieved by constructing a detector that predicts the probability that the neural network's output is incorrect. The goal of the detector is to identify incorrect outputs and adjust their raw confidences, reducing the overconfidence of misclassifications.

Method: The idea of the detector is to learn the relationship between high-level features of inputs and their classification correctness. The high-level feature is the output of a deep hidden layer of the network; for CNNs, we chose the last convolutional layer. Training the detector requires a hold-out validation set, which in practice can be the same set used for hyperparameter tuning. The detector predicts which inputs are likely to be misclassified by the neural network and estimates the probability of misclassification, which is then used to adjust the raw softmax confidence, thereby reducing the confidence of misclassifications. The detector is learned by deeply mining the classification results on the validation data produced during the training of the neural network; no additional data, such as disturbance samples, need to be collected.

Results: Experimental results on the CIFAR-10 dataset and two typical neural network architectures, ResNet20 and VGG16, show that our method is effective at reducing the confidence of misclassifications while maintaining the confidence of correct classifications. The effectiveness of our method is demonstrated on the non-disturbance i.i.d. test set and on three types of disturbance sets. It outperforms two baseline methods on all test sets on the misclassification identification task, especially on the i.i.d. set.

Conclusion: The new method performs well on both misclassification identification and out-of-distribution detection tasks. In contrast to previous softmax calibration methods, which aim to decrease the confidence of all classifications, the method proposed in this paper directly reduces the confidence of misclassifications. As a result, it becomes feasible to visually interpret the correctness of classifications from confidence scores, which leads to a better understanding of the model's behavior and facilitates more reliable decision-making. Overall, our proposed confidence prediction method represents a promising step towards addressing the overconfidence problem in deep learning classification tasks, especially image classification, and is of great value for real-world deep learning applications.
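The sketch below illustrates the pipeline the abstract describes, under stated assumptions: pooled features from the last convolutional layer and correctness labels are collected on a hold-out validation set, a small detector is fit to predict the probability of misclassification, and that probability is used to shrink the raw softmax confidence. The detector architecture (a logistic-regression head), the global-average pooling of the conv features, and the adjustment rule `adjusted = raw * (1 - p_err)` are illustrative assumptions, not the paper's exact distance-based formulation.

```python
# Minimal sketch of misclassification-aware confidence adjustment (PyTorch).
# Assumed, not the paper's exact method: logistic-regression detector,
# global-average-pooled last-conv features, and a multiplicative adjustment.
import torch
import torch.nn as nn
import torch.nn.functional as F

def collect_features_and_correctness(model, last_conv, loader, device="cpu"):
    """Run the classifier on a hold-out validation set and record, per input,
    the pooled last-conv-layer feature and whether the prediction was correct."""
    feats, correct = [], []
    captured = {}

    def hook(_module, _inp, out):
        # Global-average-pool the conv activation into a flat feature vector.
        captured["z"] = out.mean(dim=(2, 3)).detach()

    handle = last_conv.register_forward_hook(hook)
    model.eval()
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            logits = model(x)
            feats.append(captured["z"].cpu())
            correct.append((logits.argmax(dim=1) == y).float().cpu())
    handle.remove()
    return torch.cat(feats), torch.cat(correct)

class MisclassificationDetector(nn.Module):
    """Assumed detector: logistic regression from high-level features to
    the probability that the classifier's prediction is wrong."""
    def __init__(self, feat_dim):
        super().__init__()
        self.linear = nn.Linear(feat_dim, 1)

    def forward(self, z):
        return torch.sigmoid(self.linear(z)).squeeze(-1)  # p(misclassified)

def fit_detector(feats, correct, epochs=50, lr=1e-2):
    """Fit the detector on validation features; target 1 means misclassified."""
    detector = MisclassificationDetector(feats.shape[1])
    opt = torch.optim.Adam(detector.parameters(), lr=lr)
    target = 1.0 - correct
    for _ in range(epochs):
        opt.zero_grad()
        loss = F.binary_cross_entropy(detector(feats), target)
        loss.backward()
        opt.step()
    return detector

def adjusted_confidence(raw_softmax_conf, p_err):
    """Assumed adjustment rule: shrink the raw softmax confidence by the
    detector's predicted misclassification probability."""
    return raw_softmax_conf * (1.0 - p_err)
```

At test time, the same hook yields the high-level feature for each input, the detector produces p_err, and `adjusted_confidence` lowers the reported confidence for inputs the detector flags as likely misclassifications while leaving confidently correct predictions largely unchanged.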
Keywords
neural network confidence, neural network, distance, feature, high-level