S&DS Major FAQ

Frequently Asked Questions are below. These aren’t currently hyperlinked, but if you see a question you’d like to see the answer to, you can CTRL+F or CMD+F and search for the question, or simply scroll down to the section it is in. Or, you may prefer to use this Google Doc, which has an outline with hyperlinks.

 

Front matter

Who is the S&DS major for?

What’s the difference between statistics and data science?

What’s the difference between data science and computer science? Is machine learning considered statistics, data science, or computer science?

Logistics

Is there an undergraduate student mailing list?

How do I declare S&DS as my major? How do I switch my major to S&DS?

How do I double major with S&DS?

Can a S&DS Major also receive the Certificate in Data Science?

Major requirements

Which courses do I need to take to receive a degree in S&DS?

Is it too late to pursue a degree in S&DS?

How do I know whether a B.A. or B.S. is right for me?

How do I know whether a B.A. or certificate in data science is right for me?

How can I keep track of my major requirements?

What courses can be taken Cr/D/F?

How do I waive a course requirement?

How do I earn distinction in the major?

Choosing courses

I am new to the major. What courses should I start with?

I took S&DS 10x and am now thinking of majoring in S&DS. What should I take next?

I took S&DS 123 and am now thinking of majoring in S&DS. What should I take next?

Do I have the right background to take a course? Can I take a course if I haven’t taken the listed prerequisites? Can I take a prerequisite course concurrently?

How do I choose electives for the major?

I am interested in machine learning. What courses should I consider?

I am interested in natural language processing. What courses should I consider?

I am interested in causal inference and randomized experiments. What courses should I take?

I am interested in psychology and statistics. What courses should I take?

I am interested in economic research and modeling. What courses should I take?

I am interested in politics and policy. What courses should I take?

I am interested in data visualization. What courses should I take?

Where can I find a complete list of courses that have been approved for Category F (Data Science in Context) and Category G (Methods in Application Areas)?

Specific courses in the department

What is the difference between Math 222, 225, 226, and 230/231?

Can S&DS 220 and 230 both be taken and counted towards the major?

What’s the difference between S&DS 238 and 241? Are both sufficient for 242 and 351?

What’s the difference between S&DS 240 and 238/241?

Are S&DS 240 and S&DS 355 accepted for the major?

What is the senior case studies course (S&DS 425) like?

Combined B.S./M.A. in S&DS

Approximately how many students earn a B.S./M.A. in S&DS each year?

Where can I learn more about the combined B.S./M.A. in S&DS?

If a course has both an undergraduate and a graduate number associated with it, which number should I enroll in?

How many courses are required for the B.S./M.A.?

What courses are acceptable for the M.A. Degree, and when should I take them?

Programming

How much programming will I learn?

Will I learn more R or Python in the major?

Are there any programming prerequisites to the major?

Advising

How do I speak with the DUS?

I am a sophomore looking for a college advisor in S&DS. How do I find one?

Where can I speak to upperclassmen peers about their experiences in the major?

Getting involved with the department

What are the responsibilities of undergraduate learning assistants (ULAs)?

How do I become a ULA?

I think the S&DS department should teach a course on dark magic, hire a professor from Hogwarts, or buy an inflatable unicorn for Cross Campus. Where do I go?

What does the departmental student advisory committee (DSAC) do?

How do I join the DSAC?

Where can I find research opportunities in statistics & data science?

S&DS 491/492 Senior Essay

Advisor

Project topic

Data Science Project Match

Proposal

Data

Poster and Report

491/492 FAQs

What are the options for the S&DS senior requirement?

How do I choose a senior project (S&DS 491-492)?

Who do I choose to advise my senior project?

What are the deliverables and requirements for the senior project?

When should I complete my senior project?

Is there a difference in the senior project I have to complete for the B.S. and B.A.?

 

Front matter

Who is the S&DS major for?

Anyone! No, seriously.

The S&DS major is designed to be extremely flexible, in part to reflect the broad scope of the field and the ever-expanding skill set that falls under the realm of data science. Courses in the S&DS major span a wide array of subdisciplines and subtopics, and students in our department span a wide array of interests and backgrounds. No matter where your interests lie, the tools that come with analyzing, interpreting, and presenting empirical data will be useful.

What’s the difference between statistics and data science?

This is a good question, and it’s one that nobody knows the answer to (seriously, ask 5 people what data science is and you won’t get a consistent answer). Broadly speaking, “statistics” more typically refers to the question of inference: how can we estimate an uncertain quantity, and how uncertain are we? Data science, by contrast, more typically refers to the entire process of working with data; in some ways, it is applied statistics. Data science is a newer term, but the techniques that comprise data science are not necessarily newer.

Data science can include all of the following, depending on who you ask:

  • Data collection (including web scraping)

  • Data analysis & analytics

  • Data manipulation & wrangling

  • Data visualization

  • Data engineering

  • Statistical inference and modeling

  • Machine learning and artificial intelligence

  • Randomized experiments and causal inference

  • Natural language processing

  • Spatial statistics and GIS

Much of modern machine learning and artificial intelligence builds on techniques that have traditionally fallen under the realm of statistics: the underlying math of logistic regression (one of the most useful foundational machine learning models) existed for decades under the umbrella of “statistics” before it was repurposed under what many now consider “data science.”

What’s the difference between data science and computer science? Is machine learning considered statistics, data science, or computer science?

Like the previous question, there’s no good answer to this question either. After all, modern “machine learning” draws heavily from foundations in statistical theory, yet industry roles that are centered on machine learning might fall under either the “data scientist” or the “software engineer” job title. Yale’s key gateway machine learning course is offered by the S&DS department (S&DS 365), while some (perhaps even most) schools offer machine learning under the Computer Science department. No matter what department it is offered under, every solid machine learning course will emphasize statistical theory.

It’s probably most fair to say that statistics, data science, and computer science borrow heavily, and symbiotically, from each other. Machine learning methods rely on statistical theory in concept, but training large machine learning models relies on methods of efficient computing, where research traditionally falls in the realm of computer science. And certain subsets of data science, including data visualization and data engineering, rely on techniques that are sometimes taught in computer science courses (and have traditionally been claimed by the field of computer science as research topics).

Logistics

Is there an undergraduate student mailing list?

Yes, you can receive updates on the major by subscribing to the S&DS undergraduate student mailing list. The mailing list includes weekly DUS updates, announcements for new courses, and job and research opportunities. If you are somewhat seriously considering the S&DS major, even if you are not officially declared, we recommend you sign up, as it can be a useful source of information.

Note that after you sign up you’ll be asked to confirm; if you don’t confirm, you won’t be added to the mailing list.

How do I declare S&DS as my major? How do I switch my major to S&DS?

See this page for info on declaring a major. 

How do I double major with S&DS?

You should first talk to your residential college dean. Once you’ve done so, work with your DUS to develop a proposed course schedule using this checklist.

You will also need to complete the Petition to Complete the Requirements of Two Majors online form. See the bottom of this page. You’ll be asked to list the courses you intend to use to fulfill the requirements of both majors, but note that these choices are not binding.

If you plan to double major, be sure to speak to the DUSes of both S&DS and your second major to ensure that you will be on track to complete both sets of requirements. Remember that only two courses can overlap between the majors. Also, note that the senior requirements cannot be double-counted. You must take S&DS 491 or 492 and fulfill the senior requirement for your other major.

Can a S&DS Major also receive the Certificate in Data Science?

No. The major requirements are more in-depth than those of the certificate and expand on the skills and knowledge you would gain in the certificate.

Major requirements

Which courses do I need to take to receive a degree in S&DS?

The Yale College Program of Studies, the Major at a Glance page, and the S&DS checklist should always serve as the most authoritative sources of information about requirements for the major. If you have questions about the structure and requirements of the major, you should consult one of these resources, or email your questions to the DUS.

We offer two degrees in Statistics & Data Science: a Bachelor of Arts and a Bachelor of Science. The key difference is that the B.S. requires three more courses: the B.S. requires 14 courses, while the B.A. requires 11 courses. Note that both degrees require multivariable calculus as a prerequisite, which can be fulfilled through Math 120 or ENAS 151, or be waived by the DUS.

The B.A. in S&DS requires 11 term courses:

  • Linear algebra (either Math 222, 225, or 226)

  • 2 courses from Category A (S&DS 241 is highly recommended but not required)

  • 2 courses from Category B

  • 2 courses from Category C

  • 3 electives from Categories A-G

  • Senior requirement (S&DS 491 or 492)

The B.S. in S&DS requires 3 additional term courses, for a total of 14 term courses:

  • 1 additional course from Category D

  • 2 additional courses from Categories A-E

  • One of your B.A. Category A courses must be S&DS 242 and one of your B.A. Category C courses must be S&DS 365.
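As a quick sanity check on the counts in the two lists above, here is a minimal Python sketch that tallies them. The dictionary labels and the function name are illustrative only, not official terminology, and the sketch assumes each listed requirement corresponds to whole term courses exactly as stated above.

```python
# Course counts per requirement, transcribed from the lists above.
# The labels are illustrative shorthand, not official category names.
BA_REQUIREMENTS = {
    "Linear algebra (Math 222, 225, or 226)": 1,
    "Category A": 2,
    "Category B": 2,
    "Category C": 2,
    "Electives from Categories A-G": 3,
    "Senior requirement (S&DS 491 or 492)": 1,
}

# Extra courses the B.S. adds on top of the B.A. requirements.
BS_ADDITIONAL = {
    "Additional course from Category D": 1,
    "Additional courses from Categories A-E": 2,
}


def total_courses(*requirement_groups):
    """Sum the course counts across one or more requirement dictionaries."""
    return sum(sum(group.values()) for group in requirement_groups)


ba_total = total_courses(BA_REQUIREMENTS)
bs_total = total_courses(BA_REQUIREMENTS, BS_ADDITIONAL)

print(f"B.A. total: {ba_total} term courses")
print(f"B.S. total: {bs_total} term courses")
```

The totals come out to 11 for the B.A. and 14 for the B.S., matching the figures stated above. The tally does not capture the B.S. constraint that S&DS 242 and S&DS 365 must be among the Category A and Category C courses; that still needs to be checked by hand or with the DUS.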

Introductory courses numbered in the 100s, including S&DS 100-109 and S&DS 123, do not count towards the major (with the exception of S&DS 150, Data Science Ethics, which may count on a case-by-case basis, conditional on DUS approval).

Is it too late to pursue a degree in S&DS?

Generally, the answer is no.

If you are currently a first-year or a sophomore (in either semester), you have more than enough time to complete all of the requirements of the major. You should take S&DS 241 (Probability Theory), which is offered in the fall, as soon as possible, because it’s a prerequisite for many other courses. If you’re interested in pursuing the B.S. or plan to take high-level courses, you should also plan to take S&DS 242 (Theory of Statistics) as soon as possible. Many students take the 241/242 sequence in their sophomore year, although some students do so in their junior year.

If you’re a junior, you should be fine as long as you are taking S&DS 241 no later than junior fall, if you haven’t already done so as part of your requirements for another major. 

The S&DS degree is extremely flexible, and you can use courses from many different departments — including Mathematics, Computer Science, and Economics — to fulfill your elective requirements. If you come from one of these disciplines, you may find that you have already met some or many of the requirements of the S&DS degree. We encourage you to use the S&DS major checklist to map out which requirements you’ve already satisfied and which courses you would still need.

If you are in the middle of your junior year and still want to switch to S&DS, it will be difficult to do so without having taken S&DS 241. But depending on your mathematical background (perhaps you’ve gained a good amount of math and statistics from another field), it may be worth speaking to the DUS about your situation, and whether paths might still be open. In any case, the certificate is an option.

How do I know whether a B.A. or B.S. is right for me?

Aim for the B.S. and if you realize that you don’t have the room to complete the B.S. requirements, or there are other non-S&DS courses you would rather take as you finish up Yale, you can always choose to pursue the B.A. instead. Double majors can decide based on the course requirements of their other major and other considerations. 

How do I know whether a B.A. or certificate in data science is right for me?

Generally speaking, pursuing the B.A. will give you a stronger foundation in the mathematical grounding of statistics and data science. The major will require that you take linear algebra, for instance, and you will gain deeper exposure to some concepts since you will be taking S&DS 241 rather than potentially 240, which can only be used for the certificate. You will also gain more applied data science skills just by virtue of taking more courses.

If you are already majoring in a heavily quantitative field, it may not make a huge difference whether you add on a B.A. in S&DS or a certificate in data science. Fields like Economics, Psychology, Computer Science, Mathematics, and some engineering fields would likely give you plenty of exposure to applied situations. Adding the certificate would give you an opportunity to develop some of your data science skills with more depth and complement your primary field of study without substantially burdening your course schedule. But also note that if you’re majoring in a quantitatively adjacent field and choose to pursue the B.A., you’ll likely be able to double-count 2 courses across your majors, so the 11 required courses for the B.A. become 9 in practice, which is only 4 credits more than the 5 required credits for the certificate (since you’re not allowed to double-count certificate courses).

If you are planning on majoring in a completely unrelated subject, such as English or History, and particularly if you have several math requirements to backfill, you may find that even completing the certificate requires you to fill in additional courses beyond the 5 required credits of the certificate. Depending on how comfortable you are with mathematics, you may or may not find the additional course credits for the B.A. worth it, or you might not have the space in your schedule.

There’s no one clear answer, and you should speak to both the DUS and the director of the certificate program to discuss your specific situation. However, if you are at all considering the major, we recommend that you avoid the certificate-only courses (such as 240) and take the major-focused versions (i.e. 241) to ensure you can count them for the major, should you choose to pursue the major.

How can I keep track of my major requirements?

We encourage you to use the S&DS major checklist to keep track of your major requirements. We also recommend that, before finalizing your course schedule each semester, you email your checklist to the DUS to ensure that your course selections meet the requirements of the major.

If you have declared the S&DS major, you can view Degree Audit within Yale Hub to see your completed requirements for the major. Note that some electives may not immediately show up in Degree Audit, as they must be manually added by the DUS.

What courses can be taken Cr/D/F?

No courses may be taken Cr/D/F.

How do I waive a course requirement?

You need to provide the DUS with evidence demonstrating your proficiency in the course material.  For example, if you took a similar course elsewhere, then provide information such as the syllabus, textbook used, exams, problem sets, transcript, etc. In addition, you should list any advanced courses you’ve taken at Yale that depend crucially on the course material.

For courses taken outside of Yale, there are two options for getting credit:

  • Get Yale credit for the course (this is done through your College Dean). Then, if the DUS has approved your course for the major, your total course count goes down by one.

  • Don’t get Yale credit. If the DUS has approved the course, the appropriate category requirement will be waived, but you’ll have to take an extra elective to maintain the total course count.

How do I earn distinction in the major?

You must earn a grade of A or A- in three-quarters of the credits you take in the S&DS department, including your senior essay S&DS 491 or 492. Note that the denominator of this calculation includes both (i) courses you take to fulfill the requirements of the S&DS major, including those without a S&DS course code, and (ii) courses you take beyond the 11 (or 14) required courses of the S&DS major that happen to carry a S&DS designation. Marks of “W” for Withdrawal are not included in either the numerator or denominator of this calculation.

In practice, if you complete the B.A. and take no other courses in the S&DS major, you will need to earn an A or A- in 9 out of 11 of those courses. If you complete the B.S. and take no other courses in the S&DS major, you will need to earn an A or A- in 11 out of 14 of those courses. (The Math 120 prerequisite isn’t factored into this calculation.)
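The figures above follow from rounding three-quarters of the course count up to the next whole course. A minimal sketch of that arithmetic is below. It assumes every counted course is worth one credit, that "three-quarters" means at least 75%, and that any W marks have already been excluded from the count. The function name is hypothetical, and the DUS remains the authority on how distinction is actually computed.

```python
import math


def min_top_grades_for_distinction(courses_counted):
    """Smallest number of A/A- grades that is at least three-quarters of the courses counted.

    Assumes one credit per course and that withdrawals (W) are already excluded.
    """
    return math.ceil(3 * courses_counted / 4)


# B.A. with no extra S&DS courses: 11 courses counted.
print(min_top_grades_for_distinction(11))

# B.S. with no extra S&DS courses: 14 courses counted.
print(min_top_grades_for_distinction(14))
```

Under these assumptions the function gives 9 for 11 courses and 11 for 14 courses, in line with the numbers quoted above. If you take additional courses that carry an S&DS designation, pass the larger count instead.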

 

Choosing courses

I am new to the major. What courses should I start with?

If you’re reasonably sure that you plan to major in S&DS, there are a few categories of courses you should look at taking:

  • Math courses. The S&DS major requires a course in linear algebra (Math 222/225/226), so depending on your mathematical background, you might just need to take linear algebra, or you might need to take the entire calculus sequence starting from Math 112/115/120. Either way, you should knock out your math sequence as quickly as you can.

  • S&DS 220. If you’re likely to major in S&DS, we generally recommend that you take S&DS 220 as your introductory course, since 100-level intro courses don’t count for the major. S&DS 220 requires no background in statistics or programming, and while it’s time-intensive, it will equip you with a strong set of tools in statistics and programming for the rest of the major. We recommend taking S&DS 220 even if you have prior statistics exposure from AP Statistics because 220 covers a broader range of topics and in much more depth than AP Statistics.

  • S&DS 241 (or 238). Once you’ve taken multivariate calculus (Math 120), you should try to take S&DS 241 as soon as you can. Typically most students take this in their sophomore year. S&DS 241 serves as a prerequisite for many other courses, so taking it early will ensure you have maximum flexibility later on. It also lets you take S&DS 242 if that’s something you plan on doing, whether because you want to, because it is a requirement for the B.S., or because it is a prerequisite for many other courses. S&DS 241 is the more traditional option, although if S&DS 238 sounds more like your cup of tea, that’s a good alternative as well. See the FAQ below for more details on the difference between S&DS 238 and S&DS 241.

If you plan on majoring in S&DS, it’s also a viable path to take S&DS 100-109/123, followed by 220 or 230, but remember that 100-level courses don’t count toward the major.

I took S&DS 10x and am now thinking of majoring in S&DS. What should I take next?

Take S&DS 220 if you want to learn more of the statistical theory in greater depth, and learn how to run statistical simulations. Take S&DS 230 if you want to get more applied data analysis experience.

If you’re now thinking of majoring in S&DS, you should also consider getting your math requirements out of the way (you’ll need to take all the way up to linear algebra), and you should take S&DS 241 as soon as you can.

I took S&DS 123 and am now thinking of majoring in S&DS. What should I take next?

Take S&DS 230. If you’re now thinking of majoring in S&DS, you should also consider getting your math requirements out of the way (you’ll need to take all the way up to linear algebra), and you should take S&DS 241 as soon as you can.

Do I have the right background to take a course? Can I take a course if I haven’t taken the listed prerequisites? Can I take a prerequisite course concurrently?

Generally, you should contact the instructor and ask if you have a suitable background for the course, whether or not the prerequisites are strict, and whether prerequisites can be taken concurrently.

How do I choose electives for the major?

Once you’ve taken core courses like S&DS 220/230, 241 + 242, and linear algebra, you’re ready to start branching out into different areas of the major. 300-level courses like 312 (Linear Models), 315/317 (causal inference), 351 (Stochastic Processes), 361/363 (more advanced data analysis), 364 (Information Theory), and 365 (intermediate machine learning) will give you exposure to different areas of the major, and help you hone your interests. 

Note that 265 (if you don’t have programming experience, see below) and 365 are particularly useful to take, since they give you the foundational knowledge in machine learning that you would need for many projects, internships, and research opportunities. Machine learning isn’t required of all majors, but many students find the content to be valuable.

Note: Electives must be approved by the DUS. It is recommended that you email your checklist to the DUS (ideally before you enroll in the elective) to ensure that your course selection will count towards the major.

You have a lot of freedom to choose electives in the S&DS major, particularly since the field is so broad, which means that you should put some thought into making the courses you take for your S&DS major somewhat coherent and cohesive. (We promise, you’ll feel like you’ve gotten more out of your major that way.) Once you’re done with core courses for the major, you should try to pick a few things that you’re interested in within the field of statistics and data science, and try to align your studies along that path. You should, of course, give yourself room to pivot if you realize your interests change, and you should also allow yourself to explore and pick courses from several of these categories, but trying to design a cohesive sequence of courses will give you more opportunities to build on previous courses and find connections between material you encounter in different courses.

The next few questions give you examples of courses you can take within different subareas, but by no means are these the only areas that you can explore within the S&DS major! These are just a few ideas to get you started and guide you as you start exploring electives for the major. You might want to talk to professors of courses you’ve taken and peers (specifically peers in their junior or senior year) to get ideas for good courses to take.

I am interested in machine learning. What courses should I consider?

Start out with S&DS 265 if you don’t have programming experience and follow it up with S&DS 365. If you do have programming experience, start with 365. Then, consider courses like CPSC 452, CPSC 453, CPSC 464, and CPSC 477. It’s also worth searching “machine learning” in Yale Course Search to see which specific courses are being taught each semester, as the selection changes frequently. Some of the courses just mentioned above may no longer be offered. 

If you plan on focusing on machine learning and taking courses in the Computer Science department, note that you should also plan to take up to CPSC 223, which is a common prerequisite for higher-level computer science courses. Some courses may require up to CPSC 365.

I am interested in natural language processing. What courses should I consider?

Start out with S&DS 265 if you don’t have programming experience and follow it up with S&DS 365. If you do have programming experience, start with 365. Then, consider courses like CPSC 477, CPSC 488, LING 229, LING 234, and LING 380. Look out for courses taught by Robert Frank in the Linguistics department. John Lafferty in S&DS is also a leading expert in text processing, although in recent semesters he has typically been teaching the machine learning courses.

In preparation for advanced computer science courses, you may also wish to prepare to meet any computer science prerequisites, which are often set at CPSC 223.

I am interested in causal inference and randomized experiments. What courses should I take?

You could start with S&DS 317. S&DS 616 and 617 are other possibilities, as is PLSC 341 (The Logic of Randomized Experiments). S&DS-affiliated professors who are experts in causal inference and randomized experiments include Joshua Kalla, Alex Coppock, P Aronow, and Jas Sekhon, among many others. Econometrics courses in the Economics department, such as ECON 136 and ECON 419/420, can also be a good option. Check if and when these courses are offered, since offerings have differed based on which faculty are in the department.

I am interested in psychology and statistics. What courses should I take?

The Psychology and S&DS majors are almost the perfect marriage. Understanding research in psychology requires a solid foundation of statistical methods, while the expanse of psychology makes it a great field to apply the tools of statistics and data science. The Psychology major specifically requires a class in statistics, usually PSYC 200, but it is possible to substitute that requirement with a S&DS class that teaches ANOVA (consult with the DUS of Psychology to confirm eligibility).

Psychology courses that are data analysis heavy include research methods classes (e.g. PSYC 235, 238, 258, 438) and multivariate statistics classes (e.g. S&DS 363, PSYC 518). Statistics courses focused on causal inference and experimental design are also important to psychology; see the previous question for relevant course recommendations. Other S&DS courses to give you more knowledge on the application of statistics include 262, 317, 361, and 363.

I am interested in economic research and modeling. What courses should I take?

The introductory econometrics sequence (ECON 117/123) should give you a strong foundation in the skills you’ll need to conduct economic research. ECON 136 can be helpful for solidifying several more advanced economic techniques, such as panel data, difference-in-difference estimation, and regression discontinuity. Depending on your interests, there is generally an array of courses offered in the Economics department that have a strong empirical focus. ECON 301, 417, 418, 419, 420, 438, and 439 might be a few courses to look at, among others.

I am interested in politics and policy. What courses should I take?

A few options to consider: S&DS 315 taught by Josh Kalla, PLSC 454 taught by Frederik Savje, and PLSC 341 taught by Alex Coppock. These professors do good applied work at the intersection of politics, policy, and statistical methodology, so they may be useful resources to reach out to. You should check course listings for the upcoming semester, as these courses change frequently and any given course number may only be offered once every couple of years.

In addition, you may wish to consult the above advice on causal inference and randomized experiments, as a lot of modern research in public policy is centered around testing causal questions using empirical data from field experiments or past policy implementation. Finally, you might also consider seeking research opportunities with professors in the political science department to assist with applied research questions they’re working on.

I am interested in data visualization. What courses should I take?

CPSC 446 and PLSC 349 (if it’s being offered) are two good courses with a very different focus from each other. S&DS 674 could be useful to take if you are specifically interested in visualizing spatial data and generating maps. You should also take the time to learn ggplot2 well; it is a valuable R package for creating complex charts and visualizations.

Where can I find a complete list of courses that have been approved for Category F (Data Science in Context) and Category G (Methods in Application Areas)?

These course selections should be approved by the DUS at the start of each semester. You can consult the Major at a Glance page for examples of courses that are often approved for Categories F and G, depending on what other courses are part of your proposed schedule. Remember that electives should build on your S&DS skills and should provide both breadth and depth of topics. Courses with no S&DS prerequisites are often not approved as electives.

Specific courses in the department

What is the difference between Math 222, 225, 226, and 230/231?

According to the Math department webpage, regarding 222/225/226,  “All three courses cover linear algebra. Math 222 focuses more on computational techniques and applications, while 225 and 226 emphasize mathematical proofs and a more conceptual approach.  Math 225 (linear algebra) or 226 (intensive linear algebra) is recommended for students who wish to take further proof-based mathematics courses.” Starting in Fall 2021, Math 225 will include an explicit introduction to proof-writing, and will be designed to be accessible as a first-semester math course with no prior proof-writing experience. Math 226, new for Fall 2021, is a more intensive version of Math 225 that will not include an explicit introduction to proof-writing, and will explore topics in more depth. It is designed for students who already have prior exposure to proofs.

For most S&DS students interested in applied data science work, taking Math 222 will generally suffice. Taking Math 225 may provide a helpful introduction to proof-writing, which sometimes appears in higher-level, more theoretical courses in statistics and data science (as well as in courses such as CPSC 366, intensive algorithms, if that is a direction you are considering pursuing).

 

Can S&DS 220 and 230 both be taken and counted towards the major?

You can take both if you’d like, but only one of these courses can count towards the major. There may be an alternative to taking both that would make sense for you. Contact your DUS to discuss. 

What’s the difference between S&DS 238 and 241? Are both sufficient for 242 and 351?

S&DS 238 and 241 overlap substantially but are quite different, and choosing one over the other involves tradeoffs. They both formally require multivariate calculus as the prerequisite (either Math 118 or Math 120, or a previous multivariate calculus course).

If you are planning to take further statistics courses such as 242 and 351, then the more standard choice would be 241, and I’d say you can’t go wrong with this choice. S&DS 241 has been taught for a very long time, and the courses 242 and 351 were designed to follow 241.

S&DS 238 was added to our course offerings more recently. Our original question motivating the development of 238 was: for a student who is thinking of taking just one course in the whole area of probability/statistics/data analysis, hoping to learn as much as possible in one semester, what would we teach them? It has turned out that many S&DS 238 students go on to take more statistics (including declaring a statistics major), but that was the original concept.

If you are considering S&DS 241, you should definitely not worry about not having taken a statistics course before. S&DS 241 is no less suitable than 238 for people with no prior experience with statistics: both courses assume no prior experience with statistics, only some basic familiarity with the tools of multivariate calculus.

What are those tradeoffs? S&DS 241 focuses on probability theory and tends to emphasize mathematical developments more, while S&DS 238 includes a substantial amount of statistics and computing together with some math. That is, typically (of course it varies with different instructors) S&DS 241 feels more like a math class, and S&DS 238 mixes in statistical inference (from a Bayesian viewpoint, which is a bit unusual for a course at this level), computing, and some data analysis. You can expect to get more time, practice, and depth with probability theory in 241 than in 238. S&DS 238 includes topics that overlap (though from a somewhat different point of view and perhaps for different purposes) with 242 and 351, such as using likelihood for statistical inference (also covered in 242) and Markov chains (also covered in 351). As a result, students who come out of S&DS 238 and then take 242 or 351 may find that they are already comfortable with some concepts that others in those classes are seeing for the first time, which is an advantage. On the other hand, they may find that their command of probability theory is taxed more than that of students coming out of 241, which is a disadvantage.

In 238 you would get enough probability theory to be prepared for 242 and 351. In this regard, how well an individual student “got” their class (238 or 241) probably matters more than which class they took, although the median student in 241 probably has a more solid command of probability theory than the median student in 238. Students from 238 probably have some additional useful perspectives and insights (as well as skills in computing and simulation) that could help them understand and appreciate some of what they are about to learn in 242 and 351. The hope is that if they feel a need to strengthen any particular aspect of probability theory while taking 242 and 351, they can do some review or a bit of extra reading, perhaps in a 241 textbook.

What’s the difference between S&DS 240 and 238/241?

S&DS 240 was added to our course offerings much more recently in tandem with the recent addition of the certificate in data science. S&DS 240 is a less mathematically rigorous treatment of the topics typically taught in a probability theory course like 241, and as such it requires only Math 115 (or AP Calculus BC), rather than Math 118/120 (multivariate calculus). You will learn many of the same broad skills and topics, and in fact any of S&DS 238, 240, or 241 will cover the background needed for S&DS 242 (Theory of Statistics).

The key difference is that 240 cannot be used to fulfill the requirements of the major, since the course is designed for those pursuing the certificate, while 238/241 can be used towards the major. If you are on the fence between pursuing the certificate and the major, you should try to take 241 if your schedule allows (and assuming you meet the prerequisite), since it will give you the flexibility to pursue the major if you decide you want to take more courses in S&DS.

Are S&DS 240 and S&DS 355 accepted for the major?

No. S&DS 240 and 355 are only accepted for the certificate.

What is the senior case studies course (S&DS 425) like?

The case studies course is a guided but independent data analysis practicum. That is, the professor helps you further develop your data analysis skills in a variety of applied contexts, filling in gaps that may have been left by earlier data analysis courses. There’s an emphasis on methods of choosing data, acquiring data, assessing data quality, and the issues posed by working with large, and potentially messy, data sets. In other words, while previous courses may have helped you develop your skills in specific areas, such as statistical inference or data visualization, S&DS 425 will expose you to practical considerations across the entire data lifecycle. There’s also an emphasis on developing strong coding habits to ensure that your R scripts are readable and reproducible, both by yourself and by others who may need to use your code. Unlike in previous courses, the S&DS 425 professor will serve in more of an advisory role rather than planning specific knowledge areas you must master. A good portion of the course will center on independently figuring out how different packages work together, comparing different methodologies to accomplish a certain task (for which there might not be one “best” answer), and learning how to interpret results in ambiguous contexts.

S&DS 425 is a great course, and past students have said that every student should take this course at some point in their S&DS career. It will make you a better statistician, data analyst, and R programmer. Enrollment is limited and the course has an application process that has typically involved submitting a transcript and statement of interest in the course. Unfortunately, in recent years, due to the rapid growth of the undergraduate S&DS population, the department has only (and just barely) had enough capacity to accommodate one section of undergraduate S&DS majors per semester, with no room for students not in the major.

Conditions may change, but you should speak to the DUS to determine whether the case studies course might be right for you.

Combined B.S./M.A. in S&DS

Approximately how many students earn a B.S./M.A. in S&DS each year?

Typically about 2-3 students are accepted each year, although this can fluctuate from year to year.

Where can I learn more about the combined B.S./M.A. in S&DS?

You should make yourself aware of the pertinent deadlines in the Yale College Program of Studies, then reach out to the DUS for more information as soon as possible. Note that there are deadlines beginning in your fifth term of enrollment at Yale, but you should start planning your courses well in advance, particularly given that many courses in the S&DS department have cross-listed undergraduate and graduate course codes. Here is a PDF with more information, along with an S&DS Major checklist and an M.A. checklist that you can use to start planning your schedule.

If a course has both an undergraduate and a graduate number associated with it, which number should I enroll in?

Prior to formal admission into the program, you should always enroll in the graduate number. Following admission, you should speak to the DUS to plan which courses you will use to fulfill the requirements of the B.S./M.A.

How many courses are required for the B.S./M.A.?

In short: you must take 20 courses in the S&DS department, 8 of which are taken at the graduate level.

In depth: per YCPS, the M.A. portion of your degree generally requires eight or more courses at the graduate level in addition to the standard requirements of the undergraduate degree. If you are pursuing the B.S./M.A., you can expect to take 14 credits in fulfillment of the requirements of the B.S. in S&DS and 8 additional credits to earn your M.A. Then, since you can use two courses at the graduate level for both your undergraduate and graduate degrees, that means you must take a total of 20 courses in the S&DS department, 8 of which are taken at the graduate level.
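The arithmetic above can be checked with a short sketch. The figures are the ones quoted in this answer, and the variable and function names are purely illustrative (they are not official terms), so confirm the current numbers against YCPS and with the DUS.

```python
# Figures quoted in the answer above; names are illustrative assumptions.
BS_CREDITS = 14      # credits toward the B.S. in S&DS
MA_CREDITS = 8       # graduate-level credits toward the M.A.
DOUBLE_COUNTED = 2   # graduate-level courses counted toward both degrees


def total_courses(bs: int, ma: int, shared: int) -> int:
    """Count distinct courses when `shared` courses satisfy both degrees."""
    return bs + ma - shared


total = total_courses(BS_CREDITS, MA_CREDITS, DOUBLE_COUNTED)
print(total)  # 20
```

The two double-counted courses are subtracted once because they would otherwise appear in both the B.S. and the M.A. tallies.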

What courses are acceptable for the M.A. Degree, and when should I take them?

The 8 courses that are taken to complete the M.A. must cover the following four topic areas: probability, theory of statistics, data analysis, and computing/algorithms. These courses must all be at the graduate level. For any course that you would like to consider for the M.A., please consult the DUS to see if it would be appropriate and eligible for the degree. 

Note that these 8 courses cannot be entirely concentrated within your last year at Yale, and you must take at least six courses outside of the major within your last two years at Yale.

Programming

How much programming will I learn?

Computing is a core component of the S&DS major, and over the course of a B.A. or B.S., you will gain plenty of experience in using a programming language to aid you in your data analysis. You’ll find computing woven throughout courses in the major, and the hope is that at some point in the major, you will have some idea of what to do if someone throws you a large, million-row dataset and asks you to come up with an interesting, actionable conclusion from the data.

At the same time, it’s important to realize that programming does not automatically solve your problems in statistics and data science. Any modeling you do is only as good as the assumptions that you make, which is why it’s really important to learn what those assumptions are and how to assess whether those assumptions are reasonable.

Will I learn more R or Python in the major?

Just a few years ago, most students moving through the major would come out with a strong foundation in R, but most or all of them likely wouldn’t have touched Python at all. That’s changing, now that courses like S&DS 262 (Computational Tools for Data Science) and the machine learning courses like S&DS 265 and 365 are being taught in Python.

It still depends a lot on the specific courses you take to fulfill the requirements of the major. Generally, if your coursework focuses more heavily on data analysis, statistical inference, econometrics, and causal inference — and if you take many courses in adjacent departments like Economics and Political Science — you will find yourself more exposed to R, since these are disciplines that rely much more heavily on R in applied research. However, if your coursework has a stronger emphasis on machine learning, and if you perhaps find yourself taking courses in the Computer Science department, there’s a good chance you will end up with a strong foundation in Python.

Python and R are each good for different things. There’s no one answer to the question of whether R or Python is better, and it’s more important that you have a solid grasp of the basics of both so that you can build on that foundation for whatever project you’re working on. The two languages also have a lot in common: once you know one, picking up the other is much easier, so don’t worry that you won’t get hired for a job just because you know one language but not the other.

Are there any programming prerequisites to the major?

There are no formal prerequisites. Some students come in having taken one or a few computer science courses, but this is generally not needed. If you are interested in eventually taking data science-focused courses in the Computer Science department (such as CPSC 477 Natural Language Processing or CPSC 470 Artificial Intelligence), you may wish to begin taking the computer science sequence (CPSC 100, 201, 223) as early as you can to ensure that you meet relevant prerequisites. Otherwise, you will learn the programming you need through courses in the S&DS department.

Advising

How do I speak with the DUS?

At the beginning of each semester (and during the pre-registration period), the DUS will typically host DUS office hours. These will be announced weekly through the S&DS major email list.

At other times during the semester, you can email the DUS at sds-dus@yale.edu to schedule a time to chat. We are happy to speak with students.

I am a sophomore looking for a college advisor in S&DS. How do I find one?

Any S&DS faculty member can be a sophomore advisor. We suggest you review faculty members’ websites and identify a few who are working on things of interest to you. Then, contact the faculty member and see if they are willing to advise you. (Note that once you are a declared major, the DUS automatically becomes your advisor.)

Where can I speak to upperclassmen peers about their experiences in the major?

The DSAC is a good place to start: any of the students on DSAC would be more than willing to speak with you about their experiences in the major, course recommendations, job search advice, and any other questions you might have. See the next section for the current list of DSAC members.

 

Getting involved with the department

What are the responsibilities of undergraduate learning assistants (ULAs)?

ULAs are Yale’s version of undergraduate teaching assistants. In S&DS courses, they primarily:

  • Grade quantitative homework assignments and provide feedback to students

  • Hold office hours to answer students’ questions about course content

  • Discuss student feedback and experiences in the course

The role typically takes 5-10 hours per week, with an average of 7.5. You generally don’t need to have taken a course to be a ULA for it, as long as you’ve taken higher-level courses or have attained comparable skills in similar courses. For a formal job description, see Yale’s page here (scroll down to “ULA Responsibilities”).

How do I become a ULA?

The department typically sends out ULA applications for a given semester at the end of the previous semester. The best way to ensure you receive notification of applications is by making sure you’re subscribed to the undergraduate email list.

I think the S&DS department should teach a course on dark magic, hire a professor from Hogwarts, or buy an inflatable unicorn for Cross Campus. Where do I go?

Great ideas! Email the DUS at sds-dus@yale.edu to relay your feedback. If you’d like to help make these changes happen, you should also join the DSAC, which organizes projects and initiatives to improve the student experience in the department.

What does the departmental student advisory committee (DSAC) do?

Broadly speaking, the DSAC makes the S&DS student experience better, smoother, and more welcoming. Activities vary from year to year, but some initiatives have included:

  • Organizing bluebooking events for students to ask questions about the major

  • Organizing social events for students to hang out and get to know each other

  • Developing resources for students, including the writing of this FAQ

  • Organizing sweatshirt and swag orders for the department

  • Informally advising the DUS on the major by relaying student feedback

The DSAC is also a useful resource for undergraduates seeking advice about the major.

How do I join the DSAC?

Email the DUS at sds-dus@yale.edu.  We’d love to have you!

Where can I find research opportunities in statistics & data science?

These listings of faculty and affiliates of S&DS, as well as members of the Institute for Foundations of Data Science (FDS), are a good place to start. See also the S&DS 491/492 Senior Essay webpage for a list of previous FDS Project Match events, which will give you an idea of some of the data science research that FDS members are doing at Yale.

You may wish to reach out to S&DS professors and ask about research opportunities for undergraduates. You might also reach out to professors in adjacent departments such as Computer Science, Political Science, Economics, Global Affairs, Public Health, Biostatistics, and Management (and many others), given that much of modern social science research relies on statistical methodology. In particular, several professors hold joint appointments in Statistics & Data Science and Political Science and may be conducting interesting applied research, an area where there are often more opportunities for undergraduates to get involved.