Approaching Exoplanet Discovery with Statistics

June 22, 2020

Research group led by Debra Fischer (Dept. of Astronomy) and Jessi Cisewski-Kehe (Dept. of Statistics & Data Science) at Yale University 

Since the discovery of a planet orbiting the Sun-like star 51 Pegasi in the 1990s (Mayor & Queloz, 1995), astronomers and the general public have taken a growing interest in discovering exoplanets and understanding their nature. After the launch of the Kepler mission in 2009, the number of discovered exoplanets increased rapidly. Over the last decade, many other efforts have been made to collect data that may lead to more exoplanet discoveries, one at a time.

But with so many exoplanets already discovered, why is there still such strong interest in the astronomical community? While this question has various answers, the one that guides the research of a team at Yale University is the following: none of the currently known exoplanets qualifies as an Earth-analog orbiting a Sun-analog. Every exoplanet has a particular mass and average distance from its host star, and the host star also has a particular mass. Together these parameters determine the exoplanet’s orbital period. As the exoplanet orbits the host star, it exerts a (small but non-zero) gravitational force on the star, causing the star to oscillate between moving towards our telescope and away from it over time. This oscillating radial velocity has a maximum value (called the semi-amplitude) that is also determined by the previously mentioned parameters. None of the currently known exoplanets has an orbital period and radial velocity semi-amplitude quite like those of an Earth-like planet orbiting a Sun-like star.
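To make the link between these parameters and the semi-amplitude concrete, here is a minimal sketch using the standard Keplerian expression for the radial velocity semi-amplitude. The function name, the rounded physical constants, and the assumptions of an edge-on, circular orbit are illustrative choices, not something taken from the Yale team's work.

```python
import math

# Rounded physical constants (illustrative values, SI units).
G = 6.674e-11               # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30            # mass of the Sun, kg
M_EARTH = 5.972e24          # mass of the Earth, kg
YEAR_S = 365.25 * 86400.0   # one year, in seconds


def rv_semi_amplitude(period_s, m_planet, m_star, sin_i=1.0, ecc=0.0):
    """Keplerian radial velocity semi-amplitude of the star, in m/s.

    K = (2*pi*G / P)^(1/3) * m_p * sin(i) / (M_star + m_p)^(2/3) / sqrt(1 - e^2)

    sin_i=1.0 assumes the orbit is seen edge-on; ecc=0.0 assumes a circle.
    """
    return (
        (2.0 * math.pi * G / period_s) ** (1.0 / 3.0)
        * m_planet * sin_i
        / (m_star + m_planet) ** (2.0 / 3.0)
        / math.sqrt(1.0 - ecc ** 2)
    )


# An Earth-mass planet on a one-year orbit around a Sun-mass star.
k_earth = rv_semi_amplitude(YEAR_S, M_EARTH, M_SUN)
print(f"Semi-amplitude: {k_earth:.3f} m/s")
```

With these rounded constants the result is roughly 0.09 m/s, which gives a sense of how small the target signal is.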

So why haven’t we yet discovered such a system? Under the assumption that our Solar System isn’t unique, there are likely to be other planetary systems in our Milky Way that resemble what we call home. One possible reason for the lack of Earth-like exoplanets discovered is the magnitude of the signal given by such a system. As a star oscillates due to the gravitational pull of an orbiting exoplanet, the light from that star changes from the perspective of the observer. Formally, the light’s wavelength gets either red-shifted or blue-shifted, depending on whether the star is traveling away from or towards the observer, respectively. This effect is called the Doppler Effect. As an example, imagine your friend driving towards you, continuously holding the car horn down, as you stand on the side of the road. As your friend passes you, you would notice the tone of the car horn become slightly lower. And the faster your friend drives, the more noticeable this Doppler Effect will be. But the maximum speed at which a Sun-like star would be moving due to an Earth-like planet is approximately 0.09 m/s, making the change in the stellar light’s wavelength much harder to detect than a change in the car horn’s sound. In fact, the change in the wavelength of the star’s light is so small that we must rely on mathematical and statistical tools, rather than visual inspection, to detect it.
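The size of this wavelength change can be estimated with the non-relativistic Doppler relation, shift = wavelength × velocity / speed of light. The sketch below is only illustrative: the function name, the 500 nm rest wavelength (green visible light), and the 0.1 m/s example velocity are assumptions chosen to match the order of magnitude discussed above.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def doppler_shift(rest_wavelength_m, radial_velocity_ms):
    """Non-relativistic Doppler shift of a wavelength, in metres.

    Positive velocity means the source is receding (red shift, longer
    wavelength); negative velocity means it is approaching (blue shift).
    """
    return rest_wavelength_m * radial_velocity_ms / C


# Hypothetical example: green light at 500 nm, star receding at 0.1 m/s.
shift = doppler_shift(500e-9, 0.1)
print(f"Wavelength shift: {shift:.3e} m")
```

For these inputs the shift is on the order of 10^-16 m, far smaller than the size of a single atom, which is why it cannot simply be read off a plot of the spectrum.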

While astronomers have developed various computational techniques that are able to detect such a small change, the difficulties do not end there. Stars are not solid bowling balls floating in space, but are instead composed of gas and plasma that are constantly in motion. All currently known techniques used to detect a periodic Doppler Effect in a star’s light easily fall prey to effects in the star’s atmosphere that can mimic the signal of an orbiting planet. A well-known example of such an effect is starspots. This second obstacle can be considered even more significant than the first, because these effects are not well understood. The Doppler Effect, which underlies the first obstacle, has been well studied for more than a century. But no comparably well-defined physical model yet describes all the effects in a star’s atmosphere with enough accuracy.

The research group led by Debra Fischer (Dept. of Astronomy) and Jessi Cisewski-Kehe (Dept. of Statistics & Data Science) at Yale University is working to develop new techniques that can overcome both of these difficulties together. One main focus in pursuing this goal is to use a statistical perspective to simplify the approach to detecting a change in the stellar light’s wavelength. The simpler the model, the easier it is to extend so that it is (at least partially) immune to certain effects in the star’s atmosphere. A postdoc, Bo Ning, and a Ph.D. student, Parker Holzer (both of the Dept. of Statistics & Data Science), are two of the members most actively engaged in developing new techniques. Pulling from the toolbox of data science, the techniques in development make use of many classical approaches such as Bayesian analysis, principal components analysis, and linear regression. But these statistical tools do not come rent-free. Every statistical model makes assumptions about the data, and because the signal of an Earth-like planet in the light of a Sun-like star is so small, nearly every assumption must be validated before it is adopted. Otherwise the statistical approach may be too general and unable to pick out the small signal being targeted.

With the constant stream of high-fidelity data collected by the Yale astronomy team, and continual refinement of statistical techniques by the data science team, the likelihood of achieving the end goal is gradually increasing. Step one is to build a simplified approach to finding a periodic Doppler shift in the light from a star. Another step is to develop a technique to accurately detect when activity in a star’s atmosphere may be influencing the derived motion of the star. And the final step is to unite the first two steps to detect Earth-like exoplanets orbiting Sun-like stars in a way that is (at least partially) immune to effects in the star’s atmosphere. The search for exoplanets continues, and the team at Yale is gradually making headway.
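As a rough illustration of what step one involves, the sketch below searches simulated radial velocity measurements for a periodic signal by fitting a sinusoid with ordinary linear regression at each trial period and keeping the period with the smallest residual. This is a generic textbook approach, not the Yale team's technique. All names and numbers are hypothetical, and the simulated amplitude (1 m/s) is deliberately much larger than an Earth-like signal so that the example stays simple.

```python
import numpy as np

# Simulated data (all values are illustrative assumptions).
rng = np.random.default_rng(0)
true_period = 50.0                                # days
t = np.sort(rng.uniform(0.0, 1000.0, 200))        # irregular observation times, days
rv = 1.0 * np.sin(2 * np.pi * t / true_period + 0.7)  # 1 m/s sinusoidal signal
rv = rv + rng.normal(0.0, 0.3, t.size)            # Gaussian measurement noise, m/s


def sinusoid_rss(t, rv, period):
    """Residual sum of squares of a least-squares sinusoid fit at one period.

    The model rv ~ a*sin(2*pi*t/P) + b*cos(2*pi*t/P) + c is linear in
    (a, b, c), so it can be fitted with ordinary linear regression.
    """
    phase = 2 * np.pi * t / period
    design = np.column_stack([np.sin(phase), np.cos(phase), np.ones_like(t)])
    coef, *_ = np.linalg.lstsq(design, rv, rcond=None)
    resid = rv - design @ coef
    return float(resid @ resid)


# Scan a grid of trial periods and keep the best-fitting one.
trial_periods = np.linspace(10.0, 200.0, 2000)
rss = np.array([sinusoid_rss(t, rv, p) for p in trial_periods])
best_period = trial_periods[np.argmin(rss)]
print(f"Best-fitting period: {best_period:.2f} days")
```

In this idealized setting the search recovers a period close to the simulated one. Real data adds the complications described above, in particular stellar activity that can produce its own periodic signals.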