Qi Lei, Princeton University
A pre-trained model refers to any model that is trained on broad data at scale and can be adapted (e.g., fine-tuned) to a wide range of downstream tasks. The rise of pre-trained models (e.g., BERT, GPT-3, CLIP, Codex, MAE) has transformed applications in various domains, especially those where labeled data is scarce. A pre-trained model first learns a data representation that filters out information irrelevant to the training tasks; it then transfers this representation to downstream tasks, which require only a few labeled samples and slight modifications.
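The pretrain-then-adapt pipeline described above can be illustrated with a minimal sketch. This is not any specific method from the talk; it is a toy linear example (all data, dimensions, and the PCA-based representation are illustrative assumptions): a representation is learned from plentiful unlabeled data, frozen, and a downstream linear head is fit on only a handful of labeled samples.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 20, 5  # ambient dimension, latent (representation) dimension

# Ground truth: data lies near a k-dimensional subspace spanned by B
B = rng.normal(size=(d, k))

# --- Pre-training: abundant unlabeled data; learn the subspace via PCA ---
Z_pre = rng.normal(size=(5000, k))
X_pre = Z_pre @ B.T + 0.1 * rng.normal(size=(5000, d))
U, _, _ = np.linalg.svd(X_pre.T @ X_pre)
W = U[:, :k]  # learned representation, kept frozen downstream

# --- Downstream: only 30 labeled samples; fit a linear head on features ---
w_star = rng.normal(size=k)
Z_dn = rng.normal(size=(30, k))
X_dn = Z_dn @ B.T + 0.1 * rng.normal(size=(30, d))
y_dn = Z_dn @ w_star

feats = X_dn @ W  # k-dim features from the frozen representation
head, *_ = np.linalg.lstsq(feats, y_dn, rcond=None)

# --- Evaluate on fresh test data ---
Z_te = rng.normal(size=(1000, k))
X_te = Z_te @ B.T + 0.1 * rng.normal(size=(1000, d))
y_te = Z_te @ w_star
mse = float(np.mean((X_te @ W @ head - y_te) ** 2))
```

Because the pre-training data reveals the shared low-dimensional structure, the downstream head only needs to estimate k parameters instead of d, which is why a few labeled samples suffice.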
This talk establishes some theoretical understanding of pre-trained models under different settings, ranging from supervised pre-training and meta-learning to self-supervised learning. I will discuss the conditions under which pre-trained models work, based on the statistical relation between the training and downstream tasks. The theoretical analyses partly answer how pre-trained models work and when they fail; they also guide technical decisions for future work and inspire new methods.