Certificate in Data Science

The Certificate in Data Science is designed for students majoring in disciplines other than Statistics & Data Science to acquire the knowledge to promote mature use of data analysis throughout society. Students gain the knowledge base and skills needed to tackle real-world data analysis challenges. Students who complete the requirements for the certificate are prepared to engage in data analysis in the humanities, social sciences, sciences, and engineering, and are able to manage, investigate, research, and report on quantitative data.

If you have questions about the undergraduate Certificate in Data Science, please check the FAQ we are developing.  If you do not find your answer there, you may consult the Certificate Coordinator, Jay Emerson (john.emerson@yale.edu).

Follow the instructions on this page to register for the certificate; your Degree Audit may answer some of your questions.  Note that some classes may not be listed in the registration form, and that’s fine – those dropdowns serve no real purpose now that Degree Audit has been deployed.  The same form can also be used to un-register.

The Certificate in Data Science requires five course credits:

  1. Probability and Statistical Theory: one of S&DS 238, 240 (note the semester change, 2024-25), 241, 242.  Advanced students may substitute S&DS 351 or 364 or EENG 431.
  2. Statistical Methodology and Data Analysis: two of S&DS 220 or 230 (but not both 220 and 230), 242, 312, 361, 363, PLSC 349.  ECON 136 may be substituted for S&DS 242.
  3. Computation & Machine Learning: one of S&DS 262, 265, 317, 355, 365, CPSC 223, CPSC 477, CPSC 381, PHYS 378, PLSC 468.  CPSC 323 may be substituted for CPSC 223.
  4. A credit of data analysis in a discipline area. This can be either of:
    1. Two of the half-credit seminars (S&DS 170, 171 and 172) that accompanied S&DS 123 in Spring 2019. (S&DS 171 and 172 are now offered as full-credit courses, so either course can be used on its own to satisfy this requirement if taken in Spring 2020 or later.)
    2. One of the “Data Science in a Discipline Area” courses approved for the data science certificate, which are listed below.

Students are required to earn at least B- in each course counted towards the certificate (or Pass for courses taken in Spring 2020). No course may be used to fulfill more than one requirement of the certificate.  Also, no course may be counted towards both the certificate and a major.
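
As a rough illustration of how the first three requirements and the rules above fit together, here is a minimal Python sketch that checks a proposed course plan. It is not an official tool: the function name `check_plan`, the plan layout, and the requirement labels ("theory", "methods", "computation") are hypothetical. It assumes a substitute course stands in for the course it replaces, so the two cannot both count within one requirement. It omits the discipline-area credit and the Spring 2020 Pass exception, and Degree Audit remains the authoritative check.

```python
# Illustrative sketch only; names and structure are assumptions, not an official tool.
GRADES = ["A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D+", "D", "D-", "F"]

# Hypothetical label -> (allowed courses, number required, substitutions)
REQUIREMENTS = {
    "theory": (
        {"S&DS 238", "S&DS 240", "S&DS 241", "S&DS 242",
         "S&DS 351", "S&DS 364", "EENG 431"},
        1, {},
    ),
    "methods": (
        {"S&DS 220", "S&DS 230", "S&DS 242", "S&DS 312",
         "S&DS 361", "S&DS 363", "PLSC 349"},
        2, {"ECON 136": "S&DS 242"},
    ),
    "computation": (
        {"S&DS 262", "S&DS 265", "S&DS 317", "S&DS 355", "S&DS 365",
         "CPSC 223", "CPSC 477", "CPSC 381", "PHYS 378", "PLSC 468"},
        1, {"CPSC 323": "CPSC 223"},
    ),
}


def check_plan(plan, grades, major_courses=()):
    """Return a list of problems found in a plan (an empty list means none)."""
    problems = []
    used = set()
    for name, (allowed, needed, subs) in REQUIREMENTS.items():
        courses = plan.get(name, [])
        # Map each substitute to the course it stands in for.
        canonical = [subs.get(c, c) for c in courses]
        if len(courses) != needed or len(set(canonical)) != needed:
            problems.append(f"{name}: needs {needed} distinct course(s)")
        if name == "methods" and {"S&DS 220", "S&DS 230"} <= set(canonical):
            problems.append("methods: S&DS 220 and 230 cannot both count")
        for course, canon in zip(courses, canonical):
            if canon not in allowed:
                problems.append(f"{course}: not listed for {name}")
            if course in used:
                problems.append(f"{course}: counted more than once")
            used.add(course)
            if course in major_courses:
                problems.append(f"{course}: also counted toward the major")
            grade = grades.get(course)
            if grade not in GRADES or GRADES.index(grade) > GRADES.index("B-"):
                problems.append(f"{course}: grade below B- or missing")
    return problems


plan = {
    "theory": ["S&DS 238"],
    "methods": ["S&DS 230", "ECON 136"],
    "computation": ["CPSC 323"],
}
grades = {"S&DS 238": "A-", "S&DS 230": "B-", "ECON 136": "B+", "CPSC 323": "A"}
print(check_plan(plan, grades))  # prints []
```

Taking an explicit plan, rather than searching for a valid assignment, keeps the sketch short and sidesteps courses such as S&DS 242 that appear under more than one requirement.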

Students are encouraged to take an introductory course, such as one of S&DS 100, 10X, or 123 (or an introductory data analysis course in another department), before taking courses for the certificate.  This is described as the “prerequisite” in the YCPS.

The “Data Science in a Discipline Area” courses for the data science certificate are courses that expose students to how data are gathered and used within a discipline outside of S&DS. The courses currently approved for this purpose are:

  • ANTH 376 (Observing and Measuring Behavior)
  • ASTR 255 (Research Methods in Astrophysics)
  • ASTR 330 (Scientific Computing in Astrophysics)
  • ASTR 356 (Astrostatistics and Data Mining)
  • BENG 469 (Single-cell Biologies, Technologies, and Analysis)
  • ECON 438 (Applied Econometrics: Politics, Sports, Microeconomics)
  • ECON 439 (Applied Econometrics: Macroeconomic and Finance Forecasting)
  • EVST 290 (Geographic Information Systems)
  • EVST 362 (Observing Earth from Space)
  • GLBL 191 (Research Design and Survey Analysis)
  • LING 227 (Language and Computation I)
  • LING 229 (Language and Computation II)
  • LING 234 (Quantitative Linguistics)
  • LING 380 (Neural Networks and Language)
  • MB&B 452 / MCDB 452 / S&DS 352 (Biomedical Data Science, Mining and Modeling)
  • MGMT 595 (Quantitative Investing)
  • PLSC 340 / S&DS 315 (Measuring Impact and Opinion Change)
  • PLSC 341 / GLBL 195 (Logic of Randomized Experiments in Political Science) 
  • PLSC 438 (Applied Quantitative Research Design)
  • PLSC 454 (Data Science for Politics and Policy)
  • PSYC 235 (Research Methods in Psychology)
  • PSYC 238 (Research Methods in Decision Making and Happiness)
  • PSYC 258 / NSCI 258 (Computational Methods in Human Neuroscience)
  • PSYC 438 / NSCI 441 (Computational Models of Human Behavior)
  • S&DS 171 (YData: Text Data Science: An Introduction) if taken in Spring 2020 or later
  • S&DS 172 (YData: Data Science for Political Campaigns) if taken in Spring 2020 or later
  • S&DS 173 (YData: Analysis of Baseball Data) if taken in Spring 2020 or later
  • S&DS 174 (YData: Statistics in the Media)
  • S&DS 175 (YData: Measuring Culture)
  • S&DS 176 (YData: Humanities Data Mining)
  • S&DS 177 (YData: Covid-19 Behavioral Impacts)
  • S&DS 178 (YData: Sociogenomics)
  • S&DS 179 (YData: Data Science Applications in Insurance – new course Spring 2024)
  • S&DS 280 (Neural Data Analysis)

We’re open to adding more courses to this list (to suggest a course, email john.emerson@yale.edu). Courses in this category should expose students to how data are gathered and used within a discipline. They should not be introductory statistics or probability courses within that discipline, nor should they be courses that focus on statistical methods for analyzing data that have already been cleaned. They should be courses that teach students about the use of data within the domain, including issues of data collection and handling messy data.  Examples of courses that might be terrific courses but do not satisfy the requirements of the “Data Science in a Discipline Area” category include: BENG 449, BIS 633, CPSC 150, CPSC 477, EENG 439, EP&E 336, FES 611, GLBL 550, PLSC 349, SOCY 133, …

Restrictions, suggestions, and caveats:

  1. The department recommends that most students take a 100-level course (some may take 220), followed by 238 or 240, 230, and one of 361 or 363.
  2. Students considering majoring in Statistics and Data Science should be very careful about which courses they take.  Some courses that count towards the certificate (currently S&DS 240) do NOT count towards the major.
  3. Students may not count courses toward both their major and the certificate.  If a course in the certificate is required by a student’s major, then the student should substitute a different course in the certificate.
  4. S&DS majors may not pursue the Data Science certificate.