Wolfram Summer School

Corwin Kerr

Science and Technology

Class of 2019

Bio

As an undergraduate at North Carolina State University, Corwin Kerr developed sensors to improve a continuous microfluidic reactor for the study of colloidal semiconductor nanocrystals and published in Lab on a Chip. He will embark on a PhD in chemical engineering at the University of Michigan in the fall of 2019 with an interest in modeling nanoscale systems. He has played flute for over ten years, most recently in the pit orchestra for musicals at university. He enjoys traveling, going on walks, taking in symphony performances, and exploring connections between disparate disciplines.

Computational Essay: Exploring Food Data

Project: Convert Chemical Line Structure Diagrams to Graph Representation

Goal

Efficiently finding information in archives of chemistry manuscripts or patents requires the ability to recognize the printed diagrams of chemical structures. The project explores the translation of a chemical diagram to its chemical graph representation in the Wolfram Language, which can instantiate a molecule as a list of atoms and a list of labeled bonds.
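To make the target representation concrete, here is a minimal sketch of a molecule stored as a list of atoms plus a list of labeled bonds. It is written in Python purely for illustration and is not the project's Wolfram Language code; the names `Bond`, `MoleculeGraph`, and `neighbors` are hypothetical, and atoms are indexed from 0 here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bond:
    """A labeled bond between two atom indices (0-based in this sketch)."""
    a: int
    b: int
    order: str = "Single"


@dataclass
class MoleculeGraph:
    """A molecule as a list of element symbols and a list of labeled bonds."""
    atoms: list
    bonds: list

    def neighbors(self, i):
        """Return the sorted indices of atoms bonded to atom i."""
        out = []
        for bond in self.bonds:
            if bond.a == i:
                out.append(bond.b)
            elif bond.b == i:
                out.append(bond.a)
        return sorted(out)


# Ethanol heavy-atom skeleton: C-C-O (hydrogens left implicit).
ethanol = MoleculeGraph(
    atoms=["C", "C", "O"],
    bonds=[Bond(0, 1, "Single"), Bond(1, 2, "Single")],
)
print(ethanol.neighbors(1))  # [0, 2]
```

Keeping bond labels on the edges rather than on the atoms means a double or triple bond changes only one entry in the bond list.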

Summary of Results

I derived the graph representation of molecules in certain cases. I used line detection and filtering to find the key lines in an image of a chemical structure, then grouped parallel, nearby lines to remove duplicates. I identified likely atom positions (vertices) and built the edge list by tracing from each vertex along its adjacent lines to the vertices at their other ends. I collected the approximately 18 functions needed for handling and grouping lines and vertices in a package, ChemicalLines.
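The grouping and tracing steps can be sketched roughly as follows. This is an illustrative Python approximation, not the ChemicalLines package: the function names, the greedy grouping strategy, and the tolerance values (`angle_tol`, `dist_tol`, `merge_radius`) are all assumptions chosen for the example, and the input is assumed to be line segments already detected in the image.

```python
import math


def angle(seg):
    """Undirected orientation of a segment, in radians within [0, pi)."""
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1) % math.pi


def midpoint(seg):
    (x1, y1), (x2, y2) = seg
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def are_duplicates(a, b, angle_tol=0.1, dist_tol=5.0):
    """Treat two segments as duplicates if nearly parallel and close together."""
    d = abs(angle(a) - angle(b))
    d = min(d, math.pi - d)  # orientation wraps around at pi
    return d < angle_tol and math.dist(midpoint(a), midpoint(b)) < dist_tol


def dedupe_segments(segments, **tol):
    """Greedily keep the first segment of every group of duplicates."""
    kept = []
    for seg in segments:
        if not any(are_duplicates(seg, k, **tol) for k in kept):
            kept.append(seg)
    return kept


def build_graph(segments, merge_radius=5.0):
    """Merge nearby endpoints into vertices and return (vertices, edges)."""
    vertices = []
    edges = set()

    def vertex_index(p):
        # Reuse an existing vertex if the endpoint falls within merge_radius.
        for i, v in enumerate(vertices):
            if math.dist(p, v) < merge_radius:
                return i
        vertices.append(p)
        return len(vertices) - 1

    for p, q in segments:
        i, j = vertex_index(p), vertex_index(q)
        if i != j:
            edges.add((min(i, j), max(i, j)))
    return vertices, sorted(edges)


# A three-vertex zigzag detected twice, as often happens with thick strokes.
segments = [
    ((0, 0), (10, 5)),
    ((10, 5), (20, 0)),
    ((0.5, 0.3), (10.2, 5.1)),  # near-duplicate of the first segment
    ((20, 0.2), (10.1, 5)),     # near-duplicate of the second, reversed
]
unique = dedupe_segments(segments)
vertices, edges = build_graph(unique)
print(len(unique), edges)  # 2 [(0, 1), (1, 2)]
```

Note that a midpoint-distance test like this one would also merge the two strokes of a closely drawn double bond, so a fuller treatment would need to record bond order before discarding the second line.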

Future Work

Certain parameters need to be generalized so that the method handles molecules printed at different sizes. To associate text-recognized atoms with vertices, the function MorphologicalComponents could be explored for feature extraction, with the previously identified lines used to filter out the non-text components. The locations of the recognized text could then be mapped to the locations of the identified vertices. Finally, the code could be deployed to the Wolfram Cloud so that anyone can upload a structure for processing.
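One way the proposed mapping from recognized text to vertices might work is a nearest-vertex assignment with a distance cutoff. The Python sketch below is only an illustration of that idea: the function name `label_vertices`, the `max_dist` threshold, and the input format (element symbol plus the center of its text region) are assumptions, and the text recognition step itself is not shown. Unlabeled vertices default to carbon, following the usual convention for line structures.

```python
import math


def label_vertices(vertices, text_boxes, max_dist=8.0):
    """Assign each recognized symbol to its nearest vertex within max_dist.

    vertices:   list of (x, y) vertex positions
    text_boxes: list of (symbol, (x, y)) pairs, one per recognized label
    """
    labels = ["C"] * len(vertices)  # unlabeled vertices are carbon by convention
    if not vertices:
        return labels
    for symbol, center in text_boxes:
        nearest = min(
            range(len(vertices)),
            key=lambda k: math.dist(vertices[k], center),
        )
        # Ignore text that is too far from any vertex (e.g. a caption).
        if math.dist(vertices[nearest], center) <= max_dist:
            labels[nearest] = symbol
    return labels


vertex_positions = [(0, 0), (10, 5), (20, 0)]
recognized = [("O", (21, 1)), ("N", (100, 100))]  # the "N" is far from the drawing
print(label_vertices(vertex_positions, recognized))  # ['C', 'C', 'O']
```

A fixed cutoff such as `max_dist` is exactly the kind of parameter that would need to scale with the printed size of the molecule, for example as a fraction of the median bond length.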