Wolfram Summer School

Alumni

Wangping Ren

Science and Technology

Class of 2019

Bio

Wangping Ren joined the Wolfram Summer School in 2019. He obtained his Bachelor of Science degree in Mathematics and Physics from the University of Wisconsin-Madison in 2019. His research interests are in experimental condensed matter physics, a field in which he plans to pursue a Ph.D. Besides analyzing experimental data, he loves applying computational methods to understand and create physical models.

Computational Essay: A 3D Virtualization of Atom Interferometry

Project: Extract Data From Discrete Plot Images Using Semantic Segmentation (Pixel Classification)

Goal

In this project, we present an automated data extraction system for discrete plot images. The system takes a semantic segmentation approach: a neural network model called Dilated ResNet-22 classifies each image pixel as belonging to one part of the plot. Using this segmentation, we were able to extract the numerical values of scatter plot data in Cartesian coordinates from plots with randomized axes, colors and axis ranges.
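To make the idea of pixel classification concrete, here is a minimal sketch in Python. It is not the project's implementation: it assumes a segmentation network has already produced one score per class for every pixel, and it only shows the final step of picking the highest-scoring class per pixel. The function name `class_map`, the three-class setup and the toy scores are all hypothetical.

```python
def class_map(scores):
    """Turn per-pixel class scores into a per-pixel class index (argmax).

    scores[r][c] is a list with one score per class for the pixel at
    row r, column c. The result has the same height and width.
    """
    return [
        [max(range(len(pixel)), key=lambda k: pixel[k]) for pixel in row]
        for row in scores
    ]


# Toy 2x2 "image" with three hypothetical classes per pixel.
toy_scores = [
    [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]],
    [[0.2, 0.2, 0.6], [0.7, 0.2, 0.1]],
]
labels = class_map(toy_scores)
print(labels)
```

The resulting grid of class indices is what the later image processing steps operate on.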

Summary of Results

The Dilated ResNet-22 neural network produced a pixel segmentation that classifies each pixel as background, axes, axis labels or data points. The segmentation makes it easy to distinguish these four parts of an input image; by applying image processing techniques to the segmented image, we obtained a set of numerical data from the original input image.
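One way the image processing step could work is sketched below in Python. This is an illustration under stated assumptions, not the project's actual method: it assumes the segmentation is available as a grid of class indices, groups data-point pixels into connected blobs, takes each blob's centroid, and converts pixel positions to data values by linear interpolation between two known axis reference positions. The class encoding, the function names `point_centroids` and `pixel_to_data`, and the calibration values are hypothetical.

```python
from collections import deque

# Hypothetical class indices; the encoding is an assumption.
BACKGROUND, AXES, AXIS_LABELS, DATA_POINTS = 0, 1, 2, 3


def point_centroids(mask, target=DATA_POINTS):
    """Return (col, row) centroids of 4-connected blobs of `target` pixels."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    centroids = []
    for r in range(height):
        for c in range(width):
            if mask[r][c] != target or seen[r][c]:
                continue
            # Breadth-first flood fill collects the pixels of one blob.
            queue = deque([(r, c)])
            seen[r][c] = True
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] == target and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            cy = sum(p[0] for p in blob) / len(blob)
            cx = sum(p[1] for p in blob) / len(blob)
            centroids.append((cx, cy))
    return centroids


def pixel_to_data(pixel, ref_a, ref_b):
    """Linearly map a pixel coordinate to a data value.

    ref_a and ref_b are (pixel_position, data_value) pairs for two known
    reference positions (for example, tick marks) on the same axis.
    """
    (p0, v0), (p1, v1) = ref_a, ref_b
    return v0 + (pixel - p0) * (v1 - v0) / (p1 - p0)


# Toy 6x6 segmentation: two data-point blobs, an axis line, two label pixels.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 3, 3, 0, 0, 0],
    [0, 3, 3, 0, 0, 0],
    [0, 0, 0, 0, 3, 0],
    [1, 1, 1, 1, 1, 1],
    [2, 0, 0, 0, 0, 2],
]

# Assumed calibration: column 0 is x = 0 and column 5 is x = 10;
# row 4 is y = 0 and row 0 is y = 8 (image rows grow downward).
points = [
    (pixel_to_data(cx, (0, 0.0), (5, 10.0)),
     pixel_to_data(cy, (4, 0.0), (0, 8.0)))
    for cx, cy in point_centroids(mask)
]
print(points)
```

In this sketch the calibration pairs are entered by hand; reading them automatically from the axis labels would require recognizing the label text.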

Future Work

Although the training data for the Dilated ResNet-22 neural network has randomized data points, axis ranges and data shapes, it is not diverse enough. In future work, we plan to include plots with different styles, rotations, label styles and so forth. We also plan to implement a second neural network that performs optical character recognition (OCR) on the numeric labels along the plot axes, which would increase the system’s level of automation. Furthermore, we aim to generalize this approach to more diverse types of discrete plots, which may include styles such as “Marketing”, “Detailed” and “Classic”.