# Courses: Fall 2017/Spring 2018

Courses numbered 600 or above (such as Stat 610a) are intended primarily for graduate students. If such a course does not have an undergraduate cross-listing, undergraduates need special permission to enroll. A different course summary page is available here.

Directors of Undergraduate Studies: Dan Spielman and Sekhar Tatikonda.

Directors of Graduate Studies: John Emerson and Andrew Barron.

Course requirements for Ph.D. and M.A. students

**Undergraduate**

### STAT 101a - 106a / STAT 501a - 506a Introduction to Statistics

A basic introduction to statistics, including numerical and graphical summaries of data, probability, hypothesis testing, confidence intervals, and regression. Each course focuses on applications to a particular field of study and is taught jointly by two instructors, one specializing in statistics and the other in the relevant area of application. The first seven weeks of classes are attended by all students in STAT 101-106 together, as general concepts and methods of statistics are developed. The remaining weeks are divided into field-specific sections that develop the concepts with examples and applications. Computers are used for data analysis. These courses are alternatives; they do not form a sequence and only one may be taken for credit. No prerequisites beyond high school algebra. May not be taken after STAT 100 or 109.

Students enrolled in STAT 101-106 who wish to change to STAT 109, or those enrolled in STAT 109 who wish to change to STAT 101-106, must submit a course change notice, signed by the instructor, to their residential college dean by Friday, September 27. The approval of the Committee on Honors and Academic Standing is not required.

### STAT 101a / E&EB 210a / E&EB 510a Introduction to Statistics: Life Sciences

Statistical and probabilistic analysis of biological problems presented with a unified foundation in basic statistical theory. Problems are drawn from genetics, ecology, epidemiology, and bioinformatics.

### STAT 102a / STAT 502a / EP&E 203a / PLSC 425a Introduction to Statistics: Political Science

Statistical analysis of politics and quantitative assessments of public policies. Problems presented with reference to a wide array of examples: public opinion, campaign finance, racially motivated crime, and health policy.

### STAT 103a / STAT 503a / SOCY 119a Introduction to Statistics: Social Sciences

Descriptive and inferential statistics applied to analysis of data from the social sciences. Introduction of concepts and skills for understanding and conducting quantitative research.

### STAT 105a / STAT 505a Introduction to Statistics: Medicine

Statistical methods used in medicine and medical research. Practice in reading medical literature competently and critically, as well as practical experience performing statistical analysis of medical data.

### STAT 109a Introduction to Statistics: Fundamentals

General concepts and methods in statistics. Meets for the first half of the term only. May not be taken after STAT 100 or 101-106.

### STAT 230a / STAT 530a / PLSC 530a Introductory Data Analysis

Survey of statistical methods: plots, transformations, regression, analysis of variance, clustering, principal components, contingency tables, and time series analysis. The R computing language and Web data sources are used.

### STAT 238a / STAT 538a Probability and Statistics

Fundamental principles and techniques of probabilistic thinking, statistical modeling, and data analysis. Essentials of probability, including conditional probability, random variables, distributions, law of large numbers, central limit theorem, and Markov chains. Statistical inference with emphasis on the Bayesian approach: parameter estimation, likelihood, prior and posterior distributions, Bayesian inference using Markov chain Monte Carlo. Introduction to regression and linear models. Computers are used for calculations, simulations, and analysis of data.

After MATH 118 or 120.

### STAT 241a / STAT 541a / MATH 241a Probability Theory with Applications

Introduction to probability theory. Topics include probability spaces, random variables, expectations and probabilities, conditional probability, independence, discrete and continuous distributions, central limit theorem, Markov chains, and probabilistic modeling.

### STAT 262 Computational Tools for Data Science

Introduction to the core ideas and principles of modern data analysis, bridging statistics and computer science, and providing tools for changing methods and techniques. Topics include principal component analysis, independent component analysis, dictionary learning, neural networks, clustering, streaming algorithms (streaming linear algebra techniques), online learning, large-scale optimization, simple database manipulation, and implementations of systems on distributed computing infrastructures. Students require background in linear algebra, multivariable calculus, and programming.

### STAT 312a / STAT 612a Linear Models

The geometry of least squares; distribution theory for normal errors; regression, analysis of variance, and designed experiments; numerical algorithms, with particular reference to the R statistical language.

After STAT 242 and MATH 222 or 225.

No final exam.

### STAT 325a / STAT 625a Case Studies

Statistical analysis of a variety of statistical problems using real data. Emphasis on methods of choosing data, acquiring data, assessing data quality, and the issues posed by extremely large data sets. Extensive computations using R. Limited enrollment, by permission only.

### STAT 480ab Individual Studies

Directed individual study for qualified students who wish to investigate an area of statistics not covered in regular courses. A student must be sponsored by a faculty member who sets the requirements and meets regularly with the student. Enrollment requires a written plan of study approved by the faculty adviser and the director of undergraduate studies.

Permission required. No final exam.

**Graduate**

### STAT 101a - 106a / STAT 501a - 506a Introduction to Statistics

A basic introduction to statistics, including numerical and graphical summaries of data, probability, hypothesis testing, confidence intervals, and regression. Each course focuses on applications to a particular field of study and is taught jointly by two instructors, one specializing in statistics and the other in the relevant area of application. The first seven weeks of classes are attended by all students in STAT 101-106 together, as general concepts and methods of statistics are developed. The remaining weeks are divided into field-specific sections that develop the concepts with examples and applications. Computers are used for data analysis. These courses are alternatives; they do not form a sequence and only one may be taken for credit. No prerequisites beyond high school algebra. May not be taken after STAT 100 or 109.

Students enrolled in STAT 101-106 who wish to change to STAT 109, or those enrolled in STAT 109 who wish to change to STAT 101-106, must submit a course change notice, signed by the instructor, to their residential college dean by Friday, September 27. The approval of the Committee on Honors and Academic Standing is not required.

### STAT 101a / E&EB 210a / E&EB 510a Introduction to Statistics: Life Sciences

Statistical and probabilistic analysis of biological problems presented with a unified foundation in basic statistical theory. Problems are drawn from genetics, ecology, epidemiology, and bioinformatics.

### STAT 102a / STAT 502a / EP&E 203a / PLSC 425a Introduction to Statistics: Political Science

Statistical analysis of politics and quantitative assessments of public policies. Problems presented with reference to a wide array of examples: public opinion, campaign finance, racially motivated crime, and health policy.

### STAT 103a / STAT 503a / SOCY 119a Introduction to Statistics: Social Sciences

Descriptive and inferential statistics applied to analysis of data from the social sciences. Introduction of concepts and skills for understanding and conducting quantitative research.

### STAT 105a / STAT 505a Introduction to Statistics: Medicine

Statistical methods used in medicine and medical research. Practice in reading medical literature competently and critically, as well as practical experience performing statistical analysis of medical data.

### STAT 230a / STAT 530a / PLSC 530a Introductory Data Analysis

Survey of statistical methods: plots, transformations, regression, analysis of variance, clustering, principal components, contingency tables, and time series analysis. The R computing language and Web data sources are used.

### STAT 238a / STAT 538a Probability and Statistics

Fundamental principles and techniques of probabilistic thinking, statistical modeling, and data analysis. Essentials of probability, including conditional probability, random variables, distributions, law of large numbers, central limit theorem, and Markov chains. Statistical inference with emphasis on the Bayesian approach: parameter estimation, likelihood, prior and posterior distributions, Bayesian inference using Markov chain Monte Carlo. Introduction to regression and linear models. Computers are used for calculations, simulations, and analysis of data.

After MATH 118 or 120.

### STAT 241a / STAT 541a / MATH 241a Probability Theory with Applications

Introduction to probability theory. Topics include probability spaces, random variables, expectations and probabilities, conditional probability, independence, discrete and continuous distributions, central limit theorem, Markov chains, and probabilistic modeling.

### STAT 262 Computational Tools for Data Science

Introduction to the core ideas and principles of modern data analysis, bridging statistics and computer science, and providing tools for changing methods and techniques. Topics include principal component analysis, independent component analysis, dictionary learning, neural networks, clustering, streaming algorithms (streaming linear algebra techniques), online learning, large-scale optimization, simple database manipulation, and implementations of systems on distributed computing infrastructures. Students require background in linear algebra, multivariable calculus, and programming.

### STAT 312a / STAT 612a Linear Models

The geometry of least squares; distribution theory for normal errors; regression, analysis of variance, and designed experiments; numerical algorithms, with particular reference to the R statistical language.

After STAT 242 and MATH 222 or 225.

No final exam.

### STAT 325a / STAT 625a Case Studies

Statistical analysis of a variety of statistical problems using real data. Emphasis on methods of choosing data, acquiring data, assessing data quality, and the issues posed by extremely large data sets. Extensive computations using R. Limited enrollment, by permission only.

### STAT 610a Statistical Inference

A systematic development of the mathematical theory of statistical inference covering methods of estimation, hypothesis testing, and confidence intervals. An introduction to statistical decision theory. Undergraduate probability at the level of STAT 241a assumed.

### STAT 627ab Statistical Consulting

Statistical consulting and collaborative research projects often require statisticians to explore new topics outside their area of expertise. This course exposes students to real problems, requiring them to draw on their expertise in probability, statistics, and data analysis. Students complete the course with individual projects supervised jointly by faculty outside the department and by one of the instructors. Students enroll for both terms and receive one credit at the end of the year.

### STAT 667 / ENAS 503 / AMTH 605 Probabilistic Networks, Algorithms, and Applications

This course examines probabilistic and computational methods for the statistical modeling of complex data. The emphasis is on the unifying framework provided by graphical models, a formalism that merges aspects of graph theory and probability theory. Graphical models: Markov random fields, Bayesian networks, and factor graphs. Algorithms: filtering, smoothing, belief propagation, sum-product, and junction tree. Variational techniques: mean-field and convex relaxations. Markov processes on graphs: MCMC, factored HMMs, and Glauber dynamics. Some statistical physics techniques: cavity and replica methods. Applications to error-correcting codes, computer vision, bioinformatics, and combinatorial optimization.

### STAT 675 Topological Data Analysis

An introduction to a method of topological data analysis called persistent homology. Persistent homology is a framework for extracting certain topological information (connected components, loops, voids, …) from data and can be used to estimate properties of the underlying structures. Various theoretical, methodological, computational, and applied aspects of persistent homology will be discussed.

### STAT 690ab Independent Study

By arrangement with faculty. Approval of DGS required.

### STAT 699ab Research Seminar in Probability

Continuation of the Yale Probability Group Seminar. Student and faculty explanations of current research in areas such as random graph theory, spectral graph theory, Markov chains on graphs, and the objective method.

Credit only with the explicit permission of the seminar organizers.

### STAT 700ab Departmental Seminar

Important activity for all members of the department. See webpage for weekly seminar announcements.