4910/4920 Senior Essay

S&DS 4910/4920 Senior Thesis is an opportunity to apply what you have learned in your studies to an independent research project, under the mentorship of an advisor, on a topic of mutual interest. You will gain experience formulating a problem with well-defined questions that you would like to answer, developing a plan of attack, executing that plan, and interpreting and communicating your findings.

You will be responsible for submitting the following:

  • Project proposal with timeline. A short description of your problem, the methods you will use, the data you’ll use (if applicable), a set of deliverables, and a rough timeline of project milestones.
  • Poster. A poster summarizing your project that will be presented at a poster session at the end of the academic year.
  • Report. A written report describing your work.

We invite you to read further to learn more. You may also check out our FAQs.

Attention

Students currently in S&DS 4910/4920 should use the Canvas page for announcements, updates, submissions, etc. for the current semester.

Advising

Choosing an Advisor

When choosing an advisor, make sure they are available to meet with you at least once a week (online or in person) and that they respond to emails in a timely manner. An advisor who doesn’t meet very often is one of the biggest sources of frustration for students, so it is important that you ask whether they will be able to meet regularly.

Students who have a project in mind can shop it around to try to find an advisor for that project. Students who don’t have a project in mind should identify faculty working in areas of interest to them. Advisors can be from S&DS or any other department at Yale. Typically, they must be a Yale faculty member with a Ph.D. If you want to work with someone who doesn’t fit that description (e.g. a post-doc), please talk with me. I strongly urge you to contact me a week or two before the semester to discuss possible projects and advisors if you haven’t determined your project already.

Project topic

Projects can be applied, computational, or theoretical, and can address any topic that is relevant to statistics and data science, broadly defined.

Since data and statistical reasoning are everywhere, there is a very wide range of topics that are possible. We have had students do projects related to history, sports, finance, medicine, political science, environment, mathematics, energy, music, chemistry, media, economics, and more. And that was just from one semester. We have assembled a list of project abstracts from past students which can be used as a resource as you consider your own project.

The priority is your own academic and professional development. Choose a project where you will learn, develop skills, and gain experience that you can use after graduation, provided that it is related to statistics and data science, broadly defined. Also, choose a topic that interests you. Chances are you will enjoy the project more and be more motivated to work on it if the subject matter is interesting to you.

If you are currently working in a lab, you could develop a project closely related to your work there. Note that you cannot get both payment and course credit for the same project. So, if you are getting paid by the lab, then your senior project should be clearly distinct from the project that you are getting paid for. This is usually not a problem, as it is typically easy to develop a project that is strongly related to, but clearly distinct from, your other project.

If you did a capstone project in S&DS 4250, and you want to continue working on that project for your senior project, you are free to do so. You can also use the same data set if you’d like. But your work for your thesis will need to be clearly distinct from the work you did on your capstone project. This could mean you try to answer a completely different question, you answer the same question but using different approaches, or you make clear improvements to the approaches you used for your capstone project. Please feel free to reach out to me to discuss your project beforehand as well. If you have an advisor, they can help you develop an appropriate plan of action for your thesis as well.

Data Science Project Match

Near the beginning of the fall semester in August and near the end of the fall semester in November, there will be Data Science Project Match events organized by the Yale Institute for Foundations of Data Science. About 10 faculty will each give a 5-minute presentation on a potential project with a data science component that they are interested in working on. Afterwards, students and faculty can hang out to talk about projects of mutual interest. This could be a good way to get project ideas and/or an advisor if you don’t currently have anything in mind.

Links and documents from previous Data Science Project Match events are available and can give you an idea of what will be presented. Each includes a list of the faculty members who presented and a link to the abstracts from all of the project talks.

Past Data Science Project Match Events

We have assembled examples of presentations from past projects for your reference.

Proposal

After getting pre-approval, you need to submit a project proposal. The project proposal should be 2-3 pages long and include an overview of the problem area, your specific tasks along with a timeline if possible, and your expected deliverables for the project. The proposal should contain most of the following:

  • A short description of the problem area and why it is important and interesting
  • A problem statement
  • A discussion of the methods, strategy, techniques, etc. that you’ll use
  • If data is involved, whether you need to collect it, clean it, etc.
  • A list of deliverables for the project
  • A rough timeline of project milestones

A template will be provided on Canvas.

Data

If data is involved in your project, be sure that it is already available or that you can obtain it quickly. Acquiring data is often the biggest time delay and largest source of frustration for students. This is a big reason that finding an advisor and project earlier than the official deadlines is highly recommended.

Poster and Report

At the end of the semester, you will submit a poster for the poster session, and a final report. Before these deadlines, you will submit a draft of your report and poster to Canvas and to your advisor for feedback. Please make sure your poster is 36x48 and in PDF format. Please see the assignment description for more details.

What goes into the poster and report is up to your advisor. In case it helps, there is a Report Template on Canvas that I have given to students in the S&DS 4250 Statistical Case Studies class. I should note, though, that this template:

  • Assumes the project focuses on hands-on analysis of data, which may not be what your project is
  • Is just a guide even for those projects, and may not be appropriate for every project since every project is different
  • Was written by me, not your advisor, who may have different ideas about what the paper should look like

This may give you some ideas about what kinds of things you might include in the paper, but you’ll definitely want to discuss this with your advisor. The template is not meant to be prescriptive.

Outstanding Thesis Award

The S&DS Outstanding Thesis Award is given annually to S&DS seniors who have completed the most outstanding senior projects in statistics and data science. Students are nominated by their advisors before the poster session so that the award committee can be sure to visit their posters and talk with them during the session. Outstanding Thesis Awards are presented during the S&DS Commencement luncheon in the afternoon after graduation.