The Atmospheric Remote-sensing Infrared Exoplanet Large-survey (ARIEL) is the fourth medium-class mission in the European Space Agency’s Cosmic Vision program. Over its four-year mission, ARIEL will study what exoplanets are made of, how they formed, and how they evolve by surveying a diverse sample of about 1000 extrasolar planets simultaneously in visible and infrared wavelengths. It will measure the chemical composition and thermal structure of hundreds of transiting exoplanets, providing new insights into planetary science beyond our solar system.
How I Came Up with the Hybrid Model for the ARIEL Machine Learning Challenge
In 2019, the ARIEL Data Challenge Series was launched to build a global community for exoplanet data solutions. The objective was to use Machine Learning (ML) to remove noise from exoplanet observations caused by starspots and by instrumentation in the simulated data from the ARIEL telescope.
Because I love space, robotics, and machine learning, the challenge caught my interest, and I registered as a solo team. I then worked over several weeks in my free time after school to understand the challenge objective and to learn the science behind exoplanet transit light curves and how the ARIEL telescope gathers data.
The challenge provided over 150,000 simulated observations, with transit light curves in 55 different wavelengths and 300 time-step data points each. In addition, 6 stellar parameters and the planet-star radius ratios were provided. Pre-processing took a lot of time: I had to rearrange the data and divide it into training and testing sets before running different machine learning models to determine the most accurate one.
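The rearranging and splitting step described above can be sketched roughly as follows. This is a minimal illustration on random stand-in data, not the actual challenge files or my original notebook code. The array layout, the 80/20 split, the standardisation step, and the helper name `split_dataset` are all assumptions made for the example; only the numbers 55, 300, and 6 come from the challenge description.

```python
import numpy as np

N_WAVELENGTHS = 55   # wavelengths per observation (from the challenge)
N_TIMESTEPS = 300    # time steps per light curve (from the challenge)
N_PARAMS = 6         # stellar parameters per observation (from the challenge)


def split_dataset(curves, params, targets, test_fraction=0.2, seed=0):
    """Hypothetical helper: shuffle once, then split all arrays the same way."""
    n = curves.shape[0]
    order = np.random.default_rng(seed).permutation(n)
    n_test = int(round(n * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]

    # Rearrange to (observations, time steps, wavelengths), the layout
    # that sequence models usually expect.
    curves = np.transpose(curves, (0, 2, 1))

    # Standardise the stellar parameters with training statistics only,
    # so no information from the test set leaks into training.
    mean = params[train_idx].mean(axis=0)
    std = params[train_idx].std(axis=0) + 1e-8
    params = (params - mean) / std

    train = (curves[train_idx], params[train_idx], targets[train_idx])
    test = (curves[test_idx], params[test_idx], targets[test_idx])
    return train, test


# Synthetic stand-in data (random numbers, NOT the ARIEL simulations).
rng = np.random.default_rng(42)
n_obs = 100
curves = rng.normal(1.0, 0.01, size=(n_obs, N_WAVELENGTHS, N_TIMESTEPS))
params = rng.normal(size=(n_obs, N_PARAMS))
targets = rng.uniform(0.01, 0.2, size=(n_obs, N_WAVELENGTHS))

train, test = split_dataset(curves, params, targets)
print(train[0].shape, test[0].shape)
```

Shuffling with a single index array keeps each light curve aligned with its stellar parameters and its target radius ratios, which is easy to get wrong when the three arrays are split separately.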
The model I finally came up with is a hybrid machine learning model. It uses a Long Short-Term Memory (LSTM) network, a form of Recurrent Neural Network (RNN), to handle the time-series (sequential) data, namely the transit light curves. It uses a Feed-Forward Neural Network to handle the numerical data, such as the mass, radius, temperature, period, and magnitude of the stars in the ARIEL simulated dataset. I then applied a Concatenate layer to merge the outputs of the two networks before passing them through Dense layers to get the output. This hybrid model provides a higher level of accuracy and outperforms an LSTM-only model.
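To make the two-branch structure concrete, here is a minimal sketch of such a hybrid model using the Keras functional API. It is an illustration under stated assumptions rather than the exact network I trained: the layer sizes, activations, optimizer, loss, the choice of 55 outputs (one radius ratio per wavelength), and the function name `build_hybrid_model` are all picked for the example.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_hybrid_model(n_timesteps=300, n_wavelengths=55, n_params=6,
                       lstm_units=64, dense_units=32):
    """Hypothetical builder; all layer sizes are illustrative assumptions."""
    # Branch 1: an LSTM reads the light curves one time step at a time.
    curve_in = keras.Input(shape=(n_timesteps, n_wavelengths),
                           name="light_curves")
    x = layers.LSTM(lstm_units)(curve_in)

    # Branch 2: a feed-forward layer handles the numerical stellar data.
    param_in = keras.Input(shape=(n_params,), name="stellar_params")
    y = layers.Dense(dense_units, activation="relu")(param_in)

    # Merge both branches, then pass the result through Dense layers.
    merged = layers.Concatenate()([x, y])
    z = layers.Dense(128, activation="relu")(merged)
    # Assumed output: one planet-star radius ratio per wavelength.
    out = layers.Dense(n_wavelengths, name="radius_ratios")(z)

    model = keras.Model(inputs=[curve_in, param_in], outputs=out)
    model.compile(optimizer="adam", loss="mse")
    return model


model = build_hybrid_model()
model.summary()
```

Training would then pass both inputs as a list, for example `model.fit([curves, params], targets)`, with `curves` shaped (observations, 300, 55) and `params` shaped (observations, 6).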
Short Video of my Machine Learning Model
I have made a four-minute video that gives an overview of the challenge objective, datasets and pre-processing, methodology, the machine learning model applied and the results obtained. The video can be viewed at https://www.youtube.com/watch?v=y2ZWrmPqF-E
To expand the community of people interested in using Machine Learning on Space Data, I created a free online tutorial using Jupyter Notebook on Applying Hybrid Machine Learning to the ARIEL Simulated dataset.
It takes less than three hours to complete, and several participants have already completed it. On 13 October 2019, I conducted a three-hour training workshop at the MIT Media Lab for participants of the Global BioSummit, using the online module to train them in applying machine learning to the ARIEL dataset.
Gold Medal, Canada Wide Science Fair 2021. NASA SpaceApps Global 2020. Gold Medalist – IRIC North American Science Fair 2020. BMT Global Home STEM Challenge 2020. Micro:bit Challenge North America Runners Up 2020. NASA SpaceApps Toronto 2019, 2018, 2017, 2014. Imagining the Skies Award 2019. Jesse Ketchum Astronomy Award 2018. Hon. Mention at 2019 NASA Planetary Defense Conference. Emerald Code Grand Prize 2018. Canadian Space Apps 2017.