Differentially Private Kernel Inducing Points using features from ScatterNets (DP-KIP-ScatterNet) for Privacy Preserving Data Distillation
arXiv (2023)
Abstract
Data distillation aims to generate a small data set that closely mimics the
performance of a given learning algorithm on the original data set. The
distilled dataset is hence useful to simplify the training process thanks to
its small data size. However, distilled data samples are not necessarily
privacy-preserving, even if they are generally indiscernible to humans. To
address this limitation, we introduce differentially private kernel inducing
points (DP-KIP) for privacy-preserving data distillation. While our original
intention was simply to apply DP-SGD to the KIP framework, we found that KIP
using infinitely wide convolutional neural tangent kernels (conv-NTKs) performs
better than KIP using fully-connected NTKs. However, KIP with conv-NTKs, due to
its convolutional and pooling operations, incurs a prohibitive computational
cost, requiring hundreds of V100 GPUs in parallel to train; this is impractical
and, more importantly, such computational resources are inaccessible to many.
To overcome this issue, we propose an alternative that
does not require pre-training (to avoid a privacy loss) and captures complex
information in images as well as the features from conv-NTKs do, while keeping
the computational cost manageable on a single V100 GPU. To this end, we propose
DP-KIP-ScatterNet, which uses the wavelet features from Scattering networks
(ScatterNet) instead of those from conv-NTKs, to perform DP-KIP at a reasonable
computational cost. We implement DP-KIP-ScatterNet in computationally efficient
JAX and test it on several popular image datasets, demonstrating its efficacy
and superior performance compared to state-of-the-art methods for image data
distillation with differential privacy guarantees.
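To make the recipe concrete, below is a minimal, illustrative JAX sketch of the DP-KIP idea described in the abstract. The feature map here is a hypothetical, parameter-free stand-in for the ScatterNet wavelet features (so no pre-training, and hence no extra privacy loss, is needed), and the clipping norm, noise multiplier, and learning rate are placeholder values. This is a sketch of the general approach under those assumptions, not the paper's implementation.

    import jax
    import jax.numpy as jnp

    def features(x):
        # Hypothetical fixed feature map standing in for ScatterNet wavelet
        # features; it has no learned parameters, mirroring ScatterNet.
        # x: (n, d) flattened images -> (n, p) features.
        return jnp.tanh(x)

    def kip_loss(x_support, y_support, x_target, y_target, reg=1e-6):
        # KIP objective: kernel ridge regression fit on the small distilled
        # (support) set, evaluated on a batch of real (target) data.
        phi_s = features(x_support)
        phi_t = features(x_target)
        k_ss = phi_s @ phi_s.T
        k_ts = phi_t @ phi_s.T
        pred = k_ts @ jnp.linalg.solve(
            k_ss + reg * jnp.eye(k_ss.shape[0]), y_support)
        return jnp.sum((y_target - pred) ** 2)

    def dp_kip_step(x_support, y_support, x_batch, y_batch, key,
                    lr=0.1, clip=1.0, sigma=1.0):
        # DP-SGD on the distilled images: per-(real)-example gradients are
        # clipped to norm `clip` and perturbed with Gaussian noise of scale
        # sigma * clip, so each update is differentially private with
        # respect to the real data. lr, clip, sigma are placeholders.
        def loss_one(xs, xt, yt):
            return kip_loss(xs, y_support, xt[None, :], yt[None, :])
        grads = jax.vmap(jax.grad(loss_one), in_axes=(None, 0, 0))(
            x_support, x_batch, y_batch)          # (batch, n_support, d)
        norms = jnp.sqrt(jnp.sum(grads ** 2, axis=(1, 2)) + 1e-12)
        grads = grads * jnp.minimum(1.0, clip / norms)[:, None, None]
        noise = sigma * clip * jax.random.normal(key, x_support.shape)
        grad = (jnp.sum(grads, axis=0) + noise) / x_batch.shape[0]
        return x_support - lr * grad

In this sketch only the distilled images are updated; the distilled labels are held fixed, whereas the full method can also learn them. The key point the sketch illustrates is that the kernel comes from a fixed feature map, so privacy noise is only added to the gradients with respect to the distilled set.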