Naive Bayes Classification under Local Differential Privacy

2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA)

Abstract
Supervised learning techniques such as classification algorithms learn from training data to predict the correct label for newly presented input data. In many real-world scenarios, the training data required by such techniques contains personal information, and data collection can be a significant problem due to privacy concerns. Cryptographic techniques have previously been used to train on encrypted data. However, such techniques are computationally expensive and often do not scale. If a dataset held by another party will be used for training, differential privacy can be applied to preserve the privacy of the individuals in the dataset. When no such dataset exists and data must be collected directly from individuals for training, local differential privacy can be used. Local differential privacy is a technique for preserving privacy when sharing data with an untrusted data collector. In this work, we propose using local differential privacy techniques to train a Naive Bayes classifier. Under the proposed solution, an untrusted party collects perturbed data from individuals that preserves the relationship between feature values and class labels. By estimating the probabilities needed by the Naive Bayes classifier from the perturbed data, the untrusted party can classify new instances with high accuracy. We develop solutions that work for both discrete and continuous data. We also propose utilizing dimensionality reduction techniques to decrease communication cost and improve accuracy. We demonstrate the accuracy of the proposed locally differentially private Naive Bayes classifier through experiments on several datasets, and we show how dimensionality reduction further improves accuracy.
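To illustrate the idea described above, here is a minimal sketch (not the authors' exact protocol) in which each individual perturbs the joint (feature value, class label) pair with generalized randomized response, and the untrusted collector derives unbiased count estimates to compute the Naive Bayes probabilities. The function names, domain sizes, and privacy budget epsilon are illustrative assumptions, not values from the paper.

```python
# Sketch: locally differentially private Naive Bayes for one discrete feature,
# using generalized randomized response on the joint (feature, label) pair.
import math
import random
from collections import Counter

def grr_perturb(true_item, domain_size, epsilon):
    """Generalized randomized response over a domain of `domain_size` items."""
    p = math.exp(epsilon) / (math.exp(epsilon) + domain_size - 1)
    if random.random() < p:
        return true_item
    # Otherwise report one of the remaining items uniformly at random.
    other = random.randrange(domain_size - 1)
    return other if other < true_item else other + 1

def estimate_counts(reports, domain_size, epsilon):
    """Unbiased estimates of the true counts from the perturbed reports."""
    n = len(reports)
    p = math.exp(epsilon) / (math.exp(epsilon) + domain_size - 1)
    q = 1.0 / (math.exp(epsilon) + domain_size - 1)
    observed = Counter(reports)
    return {v: (observed.get(v, 0) - n * q) / (p - q) for v in range(domain_size)}

# Illustrative setup: feature with d values, k class labels, epsilon = 1.
d, k, epsilon = 4, 2, 1.0
users = [(random.randrange(d), random.randrange(k)) for _ in range(10000)]

# Each user encodes the (feature, label) pair as one item in a d*k domain, so
# the perturbed report preserves the feature-label relationship.
reports = [grr_perturb(v * k + c, d * k, epsilon) for v, c in users]
joint = estimate_counts(reports, d * k, epsilon)

# The collector derives the Naive Bayes quantities P(c) and P(v | c).
class_counts = {c: sum(joint[v * k + c] for v in range(d)) for c in range(k)}
total = sum(class_counts.values())
prior = {c: class_counts[c] / total for c in range(k)}
cond = {(v, c): joint[v * k + c] / class_counts[c]
        for v in range(d) for c in range(k)}

# Classify a new instance with feature value v_new.
v_new = 2
pred = max(range(k), key=lambda c: prior[c] * cond[(v_new, c)])
print("predicted class:", pred)
```

With multiple features, the same estimation would be repeated per feature and the per-feature conditional probabilities multiplied under the Naive Bayes independence assumption; continuous features would require a different LDP mechanism, as the abstract notes.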
Keywords
Local Differential Privacy, Naive Bayes, Classification, Dimensionality Reduction