Wolfram High School
Summer Research Program

Formerly known as the Wolfram High School Summer Camp

Bentley University, Boston, MA June 25–July 13, 2024

Alumni

Shruti Panse

Class of 2019

Project: Detecting Malaria-Infected Blood Cells Using Machine Learning

Goal

Malaria is a life-threatening disease caused by parasites that are transmitted to people through the bites of infected female Anopheles mosquitoes. In 2017, there were an estimated 219 million cases of malaria in 87 countries. The goal of this project is to use machine learning to detect malaria from images of red blood cells. A neural network will be trained to learn the differences between malaria-infected and uninfected red blood cells and to classify each cell image into one of those two classes. A dataset of infected and uninfected red blood cell images, sourced from Kaggle, will be used for training. Given an input image, the program will output a diagnosis of "infected" or "uninfected".

Summary of Results

Overall, the classifier reached an accuracy of about 97% across infected and uninfected cells. The neural network trained on non-augmented data had an error of 0.329% on the training batches and 7.53% on the validation set. With augmented data, the training error was 3.72% and the validation error was 5.84%, so augmentation narrowed the gap between training and validation error. As displayed in the confusion matrix, the network correctly predicted true for 3336 examples and predicted false for 114 examples whose actual class was true. It correctly predicted false for 3344 examples and predicted true for 95 examples whose actual class was false. The errors are therefore split fairly evenly between the two classes, and the network did not favor one class over the other.
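The roughly 97% figure can be checked directly from the four confusion-matrix counts reported above. The following is a minimal sketch in Python, not the project's own code; the function name confusion_metrics is hypothetical, and it assumes that "true" is treated as the positive class (the write-up does not say whether "true" means infected).

```python
def confusion_metrics(tp, fn, tn, fp):
    """Compute summary metrics from binary confusion-matrix counts.

    tp: actual true,  predicted true
    fn: actual true,  predicted false
    tn: actual false, predicted false
    fp: actual false, predicted true
    """
    total = tp + fn + tn + fp
    return {
        # Fraction of all examples that were classified correctly.
        "accuracy": (tp + tn) / total,
        # Fraction of actual-true examples that were predicted true.
        "sensitivity": tp / (tp + fn),
        # Fraction of actual-false examples that were predicted false.
        "specificity": tn / (tn + fp),
    }


# Counts taken from the confusion matrix described in the text.
# Assumption: "true" is the positive class.
metrics = confusion_metrics(tp=3336, fn=114, tn=3344, fp=95)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Sensitivity and specificity both come out close to the overall accuracy, which is the numerical form of the observation that the network did not favor one class over the other.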

Future Work

To improve this project in the future, I could generate more augmented datasets to further train and improve the neural net. Furthermore, I could use cell images from different datasets to reduce overfitting and increase accuracy. Lastly, I could implement a function that pinpoints the malaria parasite within a blood cell by finding the edges of the cell and then locating the infected regions using image partitioning and color detection.
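As an illustration of what generating additional augmented data could look like, here is a minimal Python sketch that produces flipped and rotated copies of an image. It is an assumption for illustration only: the write-up does not state which augmentations were used, the function names are hypothetical, and the image is represented as a plain nested list of pixel values rather than a real image object.

```python
def flip_horizontal(image):
    """Mirror the image left to right (reverse each row)."""
    return [list(reversed(row)) for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom (reverse the row order)."""
    return [list(row) for row in reversed(image)]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*reversed(image))]


def augment(image):
    """Return the original image plus three transformed copies.

    The label of a cell image (infected or uninfected) does not depend
    on its orientation, so every copy keeps the original label.
    """
    return [
        [list(row) for row in image],  # unchanged copy
        flip_horizontal(image),
        flip_vertical(image),
        rotate_90_clockwise(image),
    ]


# A tiny 2x2 stand-in for a cell image; real images would be much larger.
tiny = [[1, 2],
        [3, 4]]

variants = augment(tiny)
print(len(variants), "images generated from 1 original")
```

Each training image yields four examples here, which is one simple way to enlarge a dataset without collecting new cell images.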