Data privacy

Privacy-Enhancing Technologies for Synthetic Data Creation with Deep Generative Models

Recent technological advances have turned our society into a prolific source of data, alongside the widespread use of machine learning algorithms, particularly deep neural networks. However, these algorithms rely on large datasets that often contain sensitive and private information.

Within this context, generative models have emerged to create synthetic samples across various domains. Ideally, these models should not expose individual-specific information from the training data. Unfortunately, recent literature has shown that this assumption does not always hold, particularly for Generative Adversarial Networks (GANs), which lack robust privacy guarantees.

Nevertheless, there is a critical need to strike a balance between our responsibilities as data stewards and the importance of advancing data mining research. In this regard, Privacy-Enhancing Technologies (PETs) can help mitigate these challenges by imposing privacy constraints on ML models or, more generally, on algorithms, enabling their use and sharing without compromising the confidentiality of the training data.

This research is dedicated to exploring the latest techniques in the field of privacy, leveraging differentially private synthetic data and investigating the trade-off between data utility and privacy preservation. The outcomes of this study have the potential to:

• Enhance the understanding of the latest techniques for generating synthetic data while respecting the principles of differential privacy.
• Provide insights about the trade-off between data utility and privacy preservation, specifically in the context of generative models.
• Furnish guidance to researchers, organizations, and policymakers on the practical application of differential privacy-enhanced synthetic data.
• Contribute to the development of best practices for leveraging synthetic data in data-driven tasks while adhering to stringent privacy regulations.

Alessio Crisafulli Carpani
Graduate student in Statistical Sciences

We need the power of data and machine learning to respond effectively to today's needs
