Synthetic Text Generation with Differential Privacy: A Simple and Practical Recipe

Xiang Yue,Huseyin A. Inan,Xuechen Li,Girish Kumar,Julia McAnallen,Huan Sun,David Levitan,Robert Sim

conf_acl（2022）

引用 27|浏览75

暂无评分

摘要

Privacy concerns have attracted increasing attention in data-driven products and services. Existing legislation forbids arbitrary processing of personal data collected from individuals. Generating synthetic versions of such data with a formal privacy guarantee such as differential privacy (DP) is considered to be a solution to address privacy concerns. In this direction, we show a simple, practical, and effective recipe in the text domain: simply fine-tuning a generative language model with DP allows us to generate useful synthetic text while mitigating privacy concerns. Through extensive empirical analyses, we demonstrate that our method produces synthetic data that is competitive in terms of utility with its non-private counterpart and meanwhile provides strong protection against potential privacy leakages.

查看译文

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要