I am 12 years old. Three years ago I started programming in Python, and since last year I have been interested in machine learning. I have been taking free online courses, attending Meetup group meetings, and going to lectures on Artificial Intelligence. I am fascinated by the topic and have started writing several programs of my own.
I am also interested in space and robotics and wanted to combine machine learning with real-time robotics. So I designed a Make A Robot Smile (MARS) robot, which scans the faces around it and shows a happy face when the person in front of it is happy, and a sad face when the person is sad. I worked alongside my sister Arushi (9 years old), who likes robotics and programs in Arduino, to build the robot out of household objects.
There were four parts to my project:
- Face Detection
- Facial Expression Detection
- Facial Expression Evaluation and Analysis
- Robot Movements
Face Detection
At the start, the camera takes 10 images per second of its surroundings. The program scans each image for features that could represent a face – like a partially round oval with two smaller circles towards the top (possibly eyes) and a curved line towards the bottom (possibly a mouth). To do this, I used a face-detection filter from the Open Computer Vision (OpenCV) library in my Python program.
Facial Expression Detection
When the program finds a face in one of the images, it takes 20 more images over a 4-second period to evaluate the emotions displayed on the face. It takes multiple images instead of a single snapshot so that it can make a more accurate prediction of the emotion being displayed.
For each of the 20 images, it crops the person’s face, reduces the image to 48 x 48 pixels and converts it to grayscale (for faster processing). It then uses the machine learning algorithm I wrote in Python with TensorFlow and Keras to read the emotion on the person’s face: Happy, Sad, Angry, Surprised or Neutral.
Facial Expression Evaluation and Analysis
I used a Convolutional Neural Network (CNN), inspired by Sefik Ilkin Serengil’s work on Facial Expression Recognition with Keras (http://sefiks.com/2018/01/01/facial-expression-recognition-with-keras/). I trained it on 30 thousand labeled images of faces displaying different emotions. Through these images, the algorithm learned how to extract the important features of a face, like the eyes and the mouth, give them appropriate weight and calculate the emotion on the face. For testing, I used a separate database of 2000 images. When I tested my model on these pictures, it achieved an accuracy of 67%.
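A CNN of this kind can be sketched in Keras like this. The exact layers in my real model follow Serengil's tutorial; the layer sizes below are simplified placeholders, not the original architecture.

```python
# A simplified sketch of a CNN for 48 x 48 grayscale faces with 5 emotion
# classes (Happy, Sad, Angry, Surprised, Neutral).
from tensorflow.keras import layers, models

def build_model(num_emotions=5):
    model = models.Sequential([
        layers.Input(shape=(48, 48, 1)),          # one grayscale channel
        layers.Conv2D(32, 3, activation="relu"),  # learn small features like edges
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),  # combine them into larger features
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_emotions, activation="softmax"),  # one score per emotion
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

The final softmax layer gives one probability per emotion, and the program picks the emotion with the highest score.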
But this was not enough for me. I decided to enlarge my training database using data augmentation, which means making copies of each image in the database and slightly altering them, for example by slightly rotating them, shifting them, or even slightly blurring them. Even though the new images show the same people, the Neural Network treats them as new training examples. In my case, I made 4 more copies of each image in my training database, each randomly rotated by an angle between -25 and +25 degrees. After the data augmentation process, I had a training database of 150 thousand images. I retrained my CNN with the new database and achieved an improved testing accuracy of 86%, 19 percentage points better than before!
After calculating the emotion in each of the 20 images, my program takes the mode (the most frequent) of all the emotions found and prints it out. It then assigns the evaluated emotion a number. This number is sent to the Arduino microcontroller board over a serial bridge, also written in Python.
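The mode-and-send step can be sketched like this. The emotion-to-number codes and the serial port name below are my assumptions, and the serial part assumes the pyserial library.

```python
# A sketch of the final step: take the mode (most frequent value) of the 20
# predicted emotions and send its number code to the Arduino.
from collections import Counter

# Assumed emotion codes; the real program may use different numbers.
EMOTION_CODES = {"Happy": 1, "Sad": 2, "Angry": 3, "Surprised": 4, "Neutral": 5}

def most_common_emotion(predictions):
    """Return the emotion that appears most often in the list of predictions."""
    return Counter(predictions).most_common(1)[0][0]

def send_to_arduino(emotion, port="/dev/ttyACM0"):
    """Send the emotion's number code over the serial bridge (needs pyserial)."""
    import serial  # imported here so the mode logic works without pyserial
    with serial.Serial(port, 9600, timeout=1) as link:
        link.write(bytes([EMOTION_CODES[emotion]]))
```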
The robot was created from household items, including leftover Ikea shelves and broken toys, and is powered by an Arduino and 5 servos for the different movements. When the Arduino receives the number corresponding to a Happy emotion, it turns the robot’s face to a smiling face and the robot puts both its arms in the air.
If it receives the number corresponding to a Sad or an Angry face, the robot turns its face to a sad face and points one of its arms forward. When it receives the number corresponding to a Surprised face, it puts one of its arms upwards. Finally, if it receives the number corresponding to a Neutral face, it goes into the default position: both of its arms are down, and the robot’s face is sad.
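The robot's behaviour boils down to a small number-to-movement table. The real logic lives in the Arduino program my sister wrote; this Python table is just an illustration of it, using the same assumed emotion codes as above (the face for Surprised is not changed, since only the arm moves).

```python
# An illustration of the number-to-movement table the Arduino follows.
# Codes: 1=Happy, 2=Sad, 3=Angry, 4=Surprised, 5=Neutral (assumed numbering).
ACTIONS = {
    1: ("smiling face", "both arms up"),    # Happy
    2: ("sad face", "one arm forward"),     # Sad
    3: ("sad face", "one arm forward"),     # Angry
    4: (None, "one arm up"),                # Surprised (face unchanged)
    5: ("sad face", "both arms down"),      # Neutral: the default position
}

def robot_action(code):
    """Look up the face and arm movement for an emotion code."""
    return ACTIONS.get(code, ACTIONS[5])    # unknown codes fall back to the default
```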
I plan to improve the robot so it can detect other facial expressions, such as scared and disgusted. My bot was able to collect some data during deployment, which I will use to improve my training and testing databases.