Certificate in Data Science

If you have questions about the undergraduate Certificate in Data Science, please check the FAQ we are developing. If you do not find your answer there, you may ask a question using this form.

Description of the certificate:

The certificate in Data Science is available to the Class of 2020 and beyond. It requires 5 course credits:

  1. Probability and Statistical Theory: one of S&DS 238, 240, 241, 242.
  2. Statistical Methodology and Data Analysis: two of S&DS 230, 242, 312, 361, 363.
  3. Computation & Machine Learning: one of S&DS 262, 355, 365.
  4. A credit of data analysis in a discipline area. This can be either:
    1. Two of the half-credit seminars that will accompany S&DS 123.
    2. One of the “Data Science in a Discipline Area” courses approved for the data science certificate.

Students are required to earn at least a B- in each course counted towards the certificate.

No course may be used to fulfill more than one requirement. An introductory course is a prerequisite: one of S&DS 100, 10X, 123, or 220.

The “Data Science in a Discipline Area” courses for the data science certificate are courses that expose students to how data are gathered and used within a discipline outside of S&DS. The courses currently approved for this purpose are:

  • PSYC 235 (Research Methods in Psychology)
  • PSYC 258 (Computational Methods in Human Neuroscience)
  • PLSC 454 (Data Science for Politics and Policy)
  • ANTH 376 (Observing and Measuring Behavior)
  • EVST 362 (Observing Earth from Space)
  • GLBL 191 (Research Design and Survey Analysis)
  • GLBL 195/PLSC 341 (Logic of Randomized Experiments in Political Science)
  • LING 227 (Language and Computation I)
  • LING 229 (Language and Computation II)
  • LING 234 (Quantitative Linguistics using Corpora)
  • LING 380 (Neural Networks and Language)
  • ASTR 356 (Astrostatistics and Data Mining)

The department is planning to expand the list of courses above to more disciplines.

Restrictions, suggestions, and caveats:

  1. The department recommends that most students take a 100-level course, followed by 238, 230 and one of 361 or 363. Students who take 220 should NOT take 230, and should instead take 361 and then another course in Data Analysis (363 or 312).
  2. S&DS 355 does not presently exist. The department plans to split the Machine Learning course, S&DS 365, into two tracks. After it does so, S&DS 355 will have fewer mathematical prerequisites, and so will be accessible to students completing the certificate. S&DS 365 will require Linear Algebra, and will be intended for S&DS majors.
  3. S&DS 240 does not presently exist. The department plans to split the Probability course, S&DS 241, into two tracks. After it does so, S&DS 240 will have fewer mathematical prerequisites, and so will be accessible to students completing the certificate. S&DS 241 requires Multivariate Calculus and Linear Algebra, and will be intended for S&DS majors. S&DS 240 does NOT count towards the major in Statistics & Data Science.
  4. At present, three seminars are being designed to accompany S&DS 123. These are in the fields of Astronomy (S&DS 170), Political Science (S&DS 172), and Text Analysis (S&DS 171). The department expects that more will be created in 2019-2020.
  5. Students may not count courses toward both their major and the certificate. The one exception is when one of the courses in the certificate is required by another major. In this case, the department suggests that the student take both the course in the certificate and a more advanced course in their major. If the course that overlaps with their major is one from requirement 4 above (data analysis in a discipline area), then the student should satisfy requirement 4 with a course outside their major.
  6. S&DS majors may not pursue the Data Science certificate.