Data science students get a sampling of potential research areas

January 18, 2019

It was standing room only for the latest “Project Pitch” event at the Department of Statistics and Data Science, in which faculty members outlined research projects that could use a helping hand from a young data scientist. The topics came from across the university, with projects involving linguistics, public health, neuroscience, economics, education, climate change, biology, and the behavior of owl monkeys.

All Ph.D. students in statistics and data science are required to spend a semester working on a project with faculty from outside the department. In addition, many seniors majoring in statistics and data science conduct senior research projects with faculty members from other departments. Conversely, there are faculty members throughout Yale who need data science expertise for ongoing projects.

“This works to everyone’s benefit,” said acting department chair Daniel Spielman, Sterling Professor of Computer Science, and professor of statistics and data science. “By the time our undergraduates are doing senior research projects or our Ph.D. students are doing their practical project, they are well educated in modern methods of statistics and data science, and so can be very helpful. But in order to function well as data scientists our students need to learn about the realities of the data and problems faced by faculty in what data scientists call ‘domain areas.’ It also takes them a while to learn how to communicate effectively with faculty in other departments, so these projects form an important part of their education.”

Spielman hosted the Jan. 14 pitch event. Ten faculty members presented potential research projects and fielded questions from students.
Damon Clark, associate professor of molecular, cellular, and developmental biology, talked about a project seeking to understand how the brain detects and interprets visual motion; linguistics professor Claire Bowern outlined a project involving 4,000 audio samples of North American dialects; and Margaret Corley, a postdoc working with anthropologist Eduardo Fernandez-Duque, described a project that will compare demographic, behavioral, and genetic data of owl monkeys in Argentina.

Organizers of the event said the presentations, as well as the questions, illustrated the wide range of applications for data science, and why such analytical techniques are needed.

For example, Emily Erikson, an associate professor of sociology, sought a data scientist who will help analyze 2,000 documents relating to changes in economic theory in the 17th century.

“What is the difference between having an algorithm review 2,000 documents, as opposed to an expert reviewing 20 of them?” one student asked Erikson.

“Because reviewing 20 doesn’t give you the full picture,” Erikson explained. “That’s already been done.”

Spielman said that beyond the idea of matching students with research projects to meet academic requirements, the event helps build stronger bonds between the Department of Statistics and Data Science and the rest of campus. It also comes at a time when the university has named data science as a top science investment priority for the future.

“It is very difficult to know which faculty members’ research interests might be related, but if we put many faculty in a room and get each to say a little about what they are working on, they will discover the connections themselves,” Spielman said.

Courtesy of YaleNews, January 17, 2019