Speaker
Thuy-Duong “June” Vuong, Miller Institute, Berkeley
Learning to sample is a central task in generative AI: the goal is to generate (infinitely many more) samples from a target distribution $\mu$ given a small number of samples from $\mu$. It is well known that traditional algorithms such as Glauber or Langevin dynamics are highly inefficient when the target distribution is multimodal, as they take exponential time to converge from a \emph{worst-case start}, while recently proposed algorithms such as denoising diffusion (DDPM) require information that is computationally hard to learn. In this talk, we propose a novel and conceptually simple algorithmic framework for learning multimodal target distributions by initializing traditional sampling algorithms at the empirical distribution.

As applications, we show new results for two representative distribution families: Gaussian mixtures and Ising models. When the target distribution $\mu$ is a mixture of $k$ well-conditioned Gaussians, we show that the (continuous) Langevin dynamics initialized from the empirical distribution over $\tilde{O}(k/\epsilon^2)$ samples converge, with high probability over the samples, to $\mu$ in $\tilde{O}(1)$ time; both the number of samples and the convergence time are optimal. When $\mu$ is a low-complexity Ising model, we show a similar result for the Glauber dynamics with approximate marginals learned via pseudolikelihood estimation, demonstrating for the first time that such low-complexity Ising models can be efficiently learned from samples.
Based on joint work with Frederic Koehler and Holden Lee.
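To convey the core idea of empirical initialization, here is a minimal toy sketch (not the speakers' code or analysis): unadjusted Langevin dynamics on a made-up 1-D two-component Gaussian mixture, once started from a small empirical sample cloud and once from a single mode. The mixture, step size, and particle counts are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target: 1-D mixture of two unit-variance Gaussians (a stand-in for the
# "well-conditioned Gaussian mixture" setting; means/weights are made up).
means = np.array([-4.0, 4.0])
weights = np.array([0.5, 0.5])

def score(x):
    """Gradient of the log-density of the mixture at the points x (vectorized)."""
    diffs = x[:, None] - means[None, :]                 # (n, 2)
    log_w = np.log(weights) - 0.5 * diffs**2
    log_w -= log_w.max(axis=1, keepdims=True)           # stable softmax
    resp = np.exp(log_w)
    resp /= resp.sum(axis=1, keepdims=True)             # component responsibilities
    # d/dx log p(x) = sum_i resp_i * (mean_i - x) for unit-variance components
    return (resp * (means[None, :] - x[:, None])).sum(axis=1)

def langevin(x0, step=0.01, n_steps=2000):
    """Unadjusted Langevin: x <- x + step * score(x) + sqrt(2*step) * N(0, 1)."""
    x = x0.copy()
    for _ in range(n_steps):
        x = x + step * score(x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)
    return x

# Empirical initialization: a handful of i.i.d. samples from the target,
# resampled with replacement to form the initial particle cloud.
samples = np.where(rng.random(50) < weights[0],
                   rng.normal(means[0], 1.0, 50),
                   rng.normal(means[1], 1.0, 50))
particles = rng.choice(samples, size=5000, replace=True)

out_emp = langevin(particles)
out_worst = langevin(np.full(5000, -4.0))   # worst-case start: all mass on one mode

print("empirical init, fraction near the +4 mode:", np.mean(out_emp > 0))
print("worst-case init, fraction near the +4 mode:", np.mean(out_worst > 0))
```

In this toy run the empirically initialized particles keep roughly the correct mass on each mode, while the worst-case start stays trapped near one mode, which is the qualitative phenomenon the talk's framework exploits.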
3:30pm - Pre-talk meet-and-greet teatime - 219 Prospect Street, 13th floor; there will be light snacks and beverages in the kitchen area.