Addressing Stability in Classifier Explanations

Siavash Samiei, Nasrin Baratalipour, Pranjul Yadav, Amitabha Roy, Dake He

IEEE BigData (2021)

Abstract
Machine learning based classifiers are often a black box when it comes to the contribution of inputs to the output probability of a label, especially for complex non-linear models such as neural networks. A popular way to explain machine learning model outputs in a model-agnostic manner is through Shapley values. For our use case of abuse fighting in digital advertisements, a primary impediment to using Shapley values in explanations was instability: explanations for the same example varied greatly due to random sampling in the algorithm. We found it useful to view this problem explicitly as Monte Carlo integration, in the form of averaging the model output while varying only a subset of features in the example to be explained. This view guides the number of samples needed to achieve a stable estimate of individual Shapley values, and it unlocked the use of Shapley value based explainers for our models as well as for classifiers in general, including neural networks.
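The abstract frames Shapley value estimation as Monte Carlo integration: averaging model outputs over randomly sampled feature coalitions. Below is a minimal sketch of that view for a single feature, not the paper's actual implementation; it assumes `model` is a callable mapping one feature vector to a scalar output probability and `baseline` supplies reference values for features outside the coalition (the names `shapley_value_mc`, `model`, and `baseline` are illustrative, not from the paper).

```python
import numpy as np

def shapley_value_mc(model, x, baseline, feature_idx, n_samples=2000, rng=None):
    """Monte Carlo estimate of the Shapley value of one feature.

    Each sample draws a random coalition of the other features; the
    marginal contribution of `feature_idx` is the change in model output
    when that feature is switched from its baseline value to its value
    in `x`, with coalition features taken from `x` and the rest from
    `baseline`.
    """
    rng = rng or np.random.default_rng()
    d = len(x)
    others = [j for j in range(d) if j != feature_idx]
    contributions = np.empty(n_samples)
    for s in range(n_samples):
        # Coalition = features preceding feature_idx in a uniformly
        # random permutation: uniform size in {0, ..., d-1}, then a
        # uniform subset of that size, matching the Shapley weighting.
        perm = rng.permutation(others)
        coalition = perm[:rng.integers(0, d)]
        z = baseline.copy()
        z[coalition] = x[coalition]
        without_i = model(z)
        z[feature_idx] = x[feature_idx]
        with_i = model(z)
        contributions[s] = with_i - without_i
    est = contributions.mean()
    # Standard error of the Monte Carlo mean; shrinks as 1/sqrt(n).
    stderr = contributions.std(ddof=1) / np.sqrt(n_samples)
    return est, stderr
```

Because Monte Carlo error decays as 1/sqrt(n_samples), the returned standard error can be compared against a tolerance to decide how many samples a stable per-feature estimate needs, which is the sample-size guidance the abstract describes.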
Keywords
Shapley values, model explanations, neural network