If you have questions about the undergraduate Certificate in Data Science, please check the FAQ we are developing. If you do not find your answer there, you may consult the Certificate Coordinator, Jay Emerson (john.emerson@yale.edu).

**Advising: January 19, 11-12 and January 20, 3-4; Zoom ID 991 5925 7410**

Follow the instructions on this page to register for the certificate.

Description of the certificate:

The certificate in Data Science is available to the Class of 2020 and beyond. It requires 5 course credits:

- Probability and Statistical Theory: one of S&DS 238, 240, 241, 242. Advanced students may substitute S&DS 351, S&DS 364, or EENG 431.
- Statistical Methodology and Data Analysis: two of S&DS 230, 242, 312, 361, 363, PLSC 349. ECON 136 may be substituted for S&DS 242.
- Computation & Machine Learning: one of S&DS 262, 265, 317, 355, 365, CPSC 223, CPSC 477, PHYS 378, PLSC 468. CPSC 323 may be substituted for CPSC 223.
- A credit of data analysis in a discipline area. This can be either of:
  - Two of the 1⁄2-credit seminars (S&DS 170, 171 and 172) that accompanied S&DS 123 in Spring 2019. (S&DS 171 and 172 are now offered as full-credit courses, so either course can be used on its own to satisfy this requirement if taken in Spring 2020 or later.)
  - One of the “Data Science in a Discipline Area” courses approved for the data science certificate, which are listed below.

Students are required to earn at least a B- in each course counted towards the certificate (or a Pass for courses taken in Spring 2020). No course may be used to fulfill more than one requirement of the certificate. Also, no course may be counted towards both the certificate and a major.

Students are encouraged to take an introductory course, such as one of S&DS 100, 10X, 123 or 220 (or an introductory data analysis course in another department), before taking courses for the certificate. This is described as the “prerequisite” in the YCPS.

The “Data Science in a Discipline Area” courses for the data science certificate are courses that expose students to how data are gathered and used within a discipline outside of S&DS. The courses currently approved for this purpose are:

- ANTH 376 (Observing and Measuring Behavior)
- ASTR 255 (Research Methods in Astrophysics)
- ASTR 330 (Scientific Computing in Astrophysics)
- ASTR 356 (Astrostatistics and Data Mining)
- ECON 438 (Applied Econometrics: Politics, Sports, Microeconomics)
- ECON 439 (Applied Econometrics: Macroeconomic and Finance Forecasting)
- EVST 362 (Observing Earth from Space)
- GLBL 191 (Research Design and Survey Analysis)
- LING 227 (Language and Computation I)
- LING 229 (Language and Computation II)
- LING 234 (Quantitative Linguistics)
- LING 380 (Neural Networks and Language)
- MB&B 452 / MCDB 452 / S&DS 352 (Biomedical Data Science, Mining and Modeling)
- MGMT 595 (Quantitative Investing)
- PLSC 340 / S&DS 315 (Measuring Impact and Opinion Change)
- PLSC 341 / GLBL 195 (Logic of Randomized Experiments in Political Science)
- PLSC 438 (Applied Quantitative Research Design)
- PLSC 454 (Data Science for Politics and Policy)
- PSYC 235 (Research Methods in Psychology)
- PSYC 238 (Research Methods in Decision Making and Happiness)
- PSYC 258 / NSCI 258 (Computational Methods in Human Neuroscience)
- PSYC 438 / NSCI 441 (Computational Models of Human Behavior)
- S&DS 171 (YData: Text Data Science: An Introduction) if taken in Spring 2020 or later
- S&DS 172 (YData: Data Science for Political Campaigns) if taken in Spring 2020 or later
- S&DS 173 (YData: Analysis of Baseball Data) if taken in Spring 2020 or later
- S&DS 174 (YData: Statistics in the Media)
- S&DS 175 (YData: Measuring Culture)
- S&DS 176 (YData: Humanities Data Mining)
- S&DS 177 (YData: Covid-19 Behavioral Impacts)

We’re open to adding more courses to this list (to suggest a course, email john.emerson@yale.edu). Courses in this category should expose students to how data are gathered and used within a discipline. They should not be introductory statistics or probability courses within that discipline, nor should they be courses that focus on statistical methods for analyzing data that have already been cleaned. They should be courses that teach students about the use of data within the domain, including issues of data collection and handling messy data. Examples of courses that may be terrific but do not satisfy the “Data Science in a Discipline Area” requirement include: BENG 449, BIS 633, CPSE 150, CPSE 477, EENG 439, EP&E 336, FES 611, GLBL 550, PLSC 349, SOCY 133, …

### Restrictions, suggestions, and caveats:

- The department recommends that most students take a 100-level S&DS course (some may take S&DS 220 instead), followed by S&DS 238 or 240, S&DS 230, and one of S&DS 361 or 363.
- Students considering majoring in Statistics and Data Science should be very careful about which courses they take. Some courses that count towards the certificate (currently S&DS 240) do NOT count towards the major.
- Students may not count courses toward both their major and the certificate. If a course in the certificate is required by a student’s major, then the student should substitute a different course in the certificate.
- S&DS majors may not pursue the Data Science certificate.