Special CMX Seminar
In a world where artificial intelligence and data science are becoming omnipresent, data sharing is increasingly locking horns with data-privacy concerns. Among the main data-privacy concepts that have emerged are anonymization and differential privacy. Today, another solution is gaining traction: synthetic data. The goal of synthetic data is to create an as-realistic-as-possible dataset, one that not only maintains the nuances of the original data, but does so without risk of exposing sensitive information. The combination of differential privacy with synthetic data has been suggested as a best-of-both-worlds solution. However, the road to privacy is paved with NP-hard problems. The speaker will present three recent mathematical breakthroughs in the NP-hard challenge of creating synthetic data that come with provable privacy and utility guarantees, and of doing so in a computationally efficient way. These efforts draw on a wide range of mathematical concepts, particularly random processes. This is joint work with March Boedihardjo and Thomas Strohmer.