FakeDB: Generating Fake Synthetic Databases

IEEE Transactions on Dependable and Secure Computing (2024)

Abstract
Health care providers may wish to share limited information with researchers. Manufacturing companies may want to share some but not all data with regulators or partners. Since the emergence of generative adversarial networks (GANs), efforts have been made to generate synthetic data that preserves semantic properties on the one hand and distributions on the other. However, all past efforts focus on a single table at a time. We propose FakeDB, a general framework to generate synthetic data that preserves a wide variety of semantic integrity constraints as well as a broad set of statistical properties across an entire relational database. We compare FakeDB with natural extensions of prior work on 8 well-known relational databases as well as on a synthetically generated dataset, and show that FakeDB outperforms them. We also show that FakeDB runs in reasonable amounts of time, making it a practical solution to the problem of generating synthetic data.
Key words
H.2.0.a Security, integrity, and protection < H.2.0 General < H.2 Database Management < H Information Technology and Systems; H.2.4.i Relational databases < H.2.4 Systems < H.2 Database Management < H Information Technology and Systems