Sasha Rakhlin, MIT
We will discuss the problem of generalization for neural networks. First, we will study complexity measures of neural networks that control the gap between empirical and expected performance. We will discuss new dimension-free upper bounds on the supremum of the associated empirical process. Next, we will turn to the “overfitted” regime, where neural networks have enough flexibility to fit the data exactly (that is, to interpolate). We will challenge the conventional wisdom that interpolation is necessarily a bad statistical procedure. In particular, we will consider the minimum norm interpolant in a reproducing kernel Hilbert space and show that it has good estimation properties in the high-dimensional regime, given favorable spectral decays of the covariance and kernel matrices. Lower bounds indicate that the success of interpolation in an RKHS is a high-dimensional phenomenon. Extending these interpolation results to neural networks remains an open question.
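As orientation for the “minimum norm interpolant” mentioned above, here is a minimal sketch, not taken from the talk: by the representer theorem, the RKHS function of smallest norm that fits the data exactly is f(·) = k(·, X) K(X, X)⁻¹ y, i.e. kernel “ridgeless” regression. The Gaussian kernel, the dimensions, and all names below are illustrative assumptions, not the speaker’s choices.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    """Gram matrix of the Gaussian (RBF) kernel between rows of A and B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2 * bandwidth ** 2))

def min_norm_interpolant(X, y, bandwidth=1.0):
    """Return f, the minimum RKHS-norm function with f(X_i) = y_i."""
    K = gaussian_kernel(X, X, bandwidth)
    alpha = np.linalg.solve(K, y)          # coefficients: K @ alpha = y
    return lambda X_new: gaussian_kernel(X_new, X, bandwidth) @ alpha

# Usage: fit noisy high-dimensional data exactly, then evaluate elsewhere.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))          # n = 50 points in 10 dimensions
y = X[:, 0] + 0.1 * rng.standard_normal(50)
f = min_norm_interpolant(X, y)
print(np.allclose(f(X), y))                # True: the training data are interpolated
```

Despite fitting the noise exactly, such an interpolant can still estimate well off-sample; the talk’s point is that this hinges on the spectral properties of the covariance and kernel matrices and is, per the lower bounds, a high-dimensional phenomenon.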