

Wolfram High School
Summer Research Program

Formerly known as the Wolfram High School Summer Camp

Bentley University, Boston, MA June 25–July 13, 2024

Alumni

Daniel Shin

Class of 2019

Bio

Daniel Shin will be a junior at Chadwick International in Songdo, South Korea, during the 2019–2020 school year. At Chadwick International, he is involved with a VEX robotics team, the programming club, the economics research club, and varsity badminton. Outside of school, Daniel spends time researching computer architecture, reading comics, and browsing 9gag.

Project: Character Shape Analysis

Goal

The goal of this project is to create a program that recognizes words in an image. The most common AI-assisted image processing project to date is handwritten digit recognition using the MNIST dataset, a collection of handwritten digit samples normalized to a uniform size. This project extends that work: it evaluates handwritten characters (from the EMNIST dataset), then progresses to recognizing whole words and evaluating "possible" words that can be derived from the writing.

Summary of Results

During this project, algorithms were designed to recognize words in an image. However, the results were not as satisfying as expected. The accuracy of the trained neural network was quite high, but other issues remained. Using character-based relationships, the program successfully prevents unlikely consonant pairs, such as "Q" following "M"; however, the vowel substituted in place of the rejected consonant was not always accurate. Other approaches were also tried, such as adding weights. Only after weights were introduced to decrease the influence of the character association map did the combination of the two probability values yield confident results.
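
The weighting idea described above can be sketched roughly as follows. This is only an illustration in Python, not the project's actual code: the function name `combine_scores`, the log-linear mixing rule, and all probability values are assumptions chosen to show how a weight can tone down the character association map.

```python
import math

def combine_scores(classifier_probs, transition_probs, prev_char,
                   weight=0.3, floor=1e-6):
    """Pick a character from two probability sources (hypothetical helper).

    classifier_probs: dict char -> probability from the image classifier
    transition_probs: dict (prev, cur) -> probability that cur follows prev
    weight: exponent applied to the association term; 0 ignores it entirely
    floor: small value substituted for missing or zero probabilities
    """
    scores = {}
    for ch, p_img in classifier_probs.items():
        p_trans = transition_probs.get((prev_char, ch), floor)
        # Log-linear mix: log p_img + weight * log p_trans
        scores[ch] = (math.log(max(p_img, floor))
                      + weight * math.log(max(p_trans, floor)))
    best = max(scores, key=scores.get)
    return best, scores

# Illustrative numbers only: the classifier slightly prefers "Q" after "M",
# while the association map strongly prefers a vowel.
classifier_probs = {"Q": 0.5, "O": 0.4, "A": 0.1}
transition_probs = {("M", "Q"): 0.001, ("M", "O"): 0.1, ("M", "A"): 0.6}

for w in (0.0, 0.3, 1.0):
    best, _ = combine_scores(classifier_probs, transition_probs, "M", weight=w)
    print(w, best)
```

With these made-up values, a weight of 0 keeps the classifier's "Q", a weight of 1 lets the association map override the image evidence and pick "A", and a weight of 0.3 settles on "O", which both sources find plausible.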

Future Work

There are various approaches to implementing OCR. In future work, neural networks that take whole words and sentences as input could lead to a more practical and easy-to-use product. Furthermore, building on concepts from this project, word mapping (Markov chains) might make it possible to detect wrong characters or words in a sentence, with image processing neural networks used to compensate and replace certain characters.
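
As a rough illustration of the word-mapping idea, the sketch below builds a first-order Markov chain over words and flags words that never followed their predecessor in the training text. It is written in Python under stated assumptions: the helper names `build_word_chain` and `flag_unlikely_words` and the tiny corpus are hypothetical, and a real system would need a much larger corpus and smoothing.

```python
from collections import defaultdict

def build_word_chain(sentences):
    """Count how often each word follows another (first-order Markov chain)."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, cur in zip(words, words[1:]):
            counts[prev][cur] += 1
    return counts

def flag_unlikely_words(sentence, chain):
    """Return (index, word, candidates) for words never seen after their predecessor."""
    words = sentence.lower().split()
    flagged = []
    for i, (prev, cur) in enumerate(zip(words, words[1:]), start=1):
        if prev in chain and cur not in chain[prev]:
            # Candidates are ordered from most to least frequent follower
            candidates = sorted(chain[prev], key=chain[prev].get, reverse=True)
            flagged.append((i, cur, candidates))
    return flagged

# Tiny made-up corpus, for illustration only
corpus = ["the cat sat on the mat", "the cat ate the fish"]
chain = build_word_chain(corpus)
print(flag_unlikely_words("the cot sat", chain))
```

A flagged position could then be handed back to the character-level recognizer, which would rescore that word's characters against the suggested candidates.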