The requirements for the major can be found at YCPS Statistics and Data Science.
Please use this S&DS checklist to organize your course selections.
You can receive updates on the major by subscribing to the S&DS undergraduate student mailing list.
A. Core Probability and Statistics: These are essential courses in probability and statistics. Every major should take at least two of these courses, and should probably take more. Students completing the B.S. must take S&DS 242.
- S&DS 238 (Probability and Statistics)
- S&DS 241 (Probability Theory)
- S&DS 242 (Theory of Statistics)
- S&DS 312 (Linear Models)
- S&DS 351 (Stochastic Processes)
B. Computational Skills: Every student in Data Science should be able to compute with data. Although computing is not the main purpose of some of these courses, students who have taken at least two of them should be capable of digesting and processing data. Other courses also require a lot of programming, but these are essential.
- S&DS 220 (Intro Statistics, Intensive) or S&DS 230 (Data Exploration and Analysis)
- CPSC 100 or 112, or ENAS 130. Substitution of CPSC 201 or 223 permitted.
- S&DS 425 (Statistical Case Studies)
- S&DS 262 (Computational Tools for Data Science)
C. Methods of Data Science: These courses teach fundamental methods for dealing with data. They range from the practical to the theoretical. Every major must take at least two of these courses.
- S&DS 312 (Linear Models)
- S&DS 361 (Data Analysis)
- S&DS 363 (Multivariate Statistics for Social Sciences)
- S&DS 365 (Applied Data Mining and Machine Learning)
- CPSC 477 (Natural Language Processing)
- CPSC 663 (Deep Learning Theory and Applications)
- S&DS 430 (Optimization Techniques)
- S&DS 468 (Nonparametric Learning)
- EENG 400 (Dynamic and Discrete Optimization)
D. Mathematical Foundations and Theory: All students in the major must know linear algebra. Students who have learned linear algebra through other courses (such as MATH 230/231) may substitute another course from this category. Students pursuing the B.S. must take at least two courses from this list. Students who wish to pursue graduate school should take many of them.
- CPSC 365 (Algorithms) or CPSC 366 (Intensive Algorithms)
- CPSC 469 (Randomized Algorithms)
- MATH 222 (Linear Algebra with Applications)
- MATH 225 (Linear Algebra and Matrix Theory)
- MATH 244 (Discrete Mathematics)
- MATH 260 (Basic Analysis in Function Spaces)
- MATH 300 (Topics in Analysis) or MATH 301 (Introduction to Analysis)
- S&DS 364 (Information Theory)
- S&DS 400 (Advanced Probability)
- S&DS 410 (Statistical Inference)
- S&DS 411 (Selected Topics in Statistical Decision Theory)
- S&DS 669 (Statistical Learning Theory)
E. Efficient Computation and Big Data: These courses are for students who want to do serious programming or implement large-scale analyses. None are required for the major. Students who wish to work in the software industry should take at least one of these.
- CPSC 223 (Data Structures and Programming Techniques)
- CPSC 323 (Introduction to Systems Programming and Computer Organization)
- CPSC 437 (Introduction to Databases)
- CPSC 424 (Parallel Programming Techniques)
F. Data Science in Context: Students are encouraged to take courses that involve the study of data in application areas. These courses will teach students how these data are obtained, how reliable they are, how they are used, and the types of inferences that can be made from them. These course selections should be approved by the DUS. Examples of such courses include:
- PSYC 235 (Research Methods in Psychology)
- PSYC 258 (Computational Methods in Human Neuroscience)
- PLSC 454 (Data Science for Politics and Policy)
- ANTH 376 (Observing and Measuring Behavior)
- EVST 362 (Observing Earth from Space)
- GLBL 191 (Research Design and Survey Analysis)
- GLBL 195/PLSC 341 (Logic of Randomized Experiments in Political Science)
- LING 229 (Language and Computation II)
- LING 234 (Quantitative Linguistics using Corpora)
- LING 380 (Neural Networks and Language)
G. Methods in Application Areas: These are methods courses in application areas. They help expose students to the cultures of fields that explore data. These course selections should be approved by the DUS. Examples of such courses include:
- S&DS 352/MB&B 452 (Biological Data Science, Mining and Modeling)
- EENG 445 (Biomedical Image Processing and Analysis)
- ENAS 962 (Theoretical Challenges in Network Science)
- ECON 136 (Econometrics)
- ECON 420 (Applied Microeconometrics)
- CPSC 453 (Machine Learning for Biology)
- CPSC 470 (Artificial Intelligence)
- CPSC 475 (Computational Vision & Biological Perception)
- LING 227/PSYC 327 (Language and Computation I)