Jason Altschuler, NYU
A central issue in machine learning is how to train models on sensitive user data. Industry has widely adopted a simple algorithm: Stochastic Gradient Descent with noise (a.k.a. Stochastic Gradient Langevin Dynamics). However, foundational questions about this algorithm's privacy loss remain open, even in the seemingly simple setting of smooth convex losses over a bounded domain. Our main result resolves these questions by characterizing the differential privacy of this algorithm up to a constant factor. This result reveals that all previous analyses for this setting have the wrong qualitative behavior. Specifically, while previous privacy bounds grow ad infinitum with the number of iterations, we show that after a small burn-in period, running SGD longer leaks no further privacy.
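For concreteness, the following is a minimal sketch (in Python/NumPy) of the projected noisy SGD iteration the abstract refers to. The names grad_fn, project, eta, sigma, and n_steps are illustrative placeholders under my own assumptions, not the talk's actual algorithmic parameters.

    import numpy as np

    def noisy_sgd(grad_fn, project, x0, data, eta=0.1, sigma=1.0, n_steps=1000, seed=0):
        # Projected noisy SGD (SGLD-style sketch): at each step, take a stochastic
        # gradient step on one uniformly sampled example, add Gaussian noise, and
        # project back onto the bounded convex domain.
        rng = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float)
        for _ in range(n_steps):
            i = rng.integers(len(data))                    # sample one example uniformly
            g = grad_fn(x, data[i])                        # per-example gradient of the smooth convex loss
            noise = sigma * rng.standard_normal(x.shape)   # Gaussian privacy noise
            x = project(x - eta * (g + noise))             # noisy step, then project onto the domain
        return x

Differential privacy here asks how much the distribution of the returned iterate can change when a single entry of data is replaced; the result described above says that, past a small burn-in number of steps, increasing n_steps does not worsen this guarantee.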
In this talk, I will describe this result and our analysis, which departs from previous approaches based on fast mixing and instead uses techniques based on optimal transport.
Joint work with Kunal Talwar.