
Wolfram Summer School

Alumni

Constantine Korikov

Science and Technology

Class of 2019

Bio

Constantine is a researcher with a background in physics and information technology. He holds a PhD in theoretical physics, specialising in the Casimir effect. At Intel Labs, he did research on signal processing, algorithms and digital hardware design. He then worked at Syntacore on the development of a RISC-V processor. Constantine enjoys taking part in competitions and hackathons; as part of a team, he won the Wolfram Research Hackathon 2018 (Russia).

Computational Essay: Is the instantaneous Internet real?

Project: Code Embeddings for Wolfram Language

Goal

The goal of this project is to implement a system that shows the similarity between built-in functions of the Wolfram Language based on the contexts in which they are used, with illustrations to support the results. Optionally, the same solution could be used to generate names for expressions. Code embeddings, which map code into a vector space, are a natural tool for this purpose.
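As a rough illustration of similarity based on usage context, the sketch below builds simple count-based context vectors and compares them with cosine similarity. This is not the method used in the project, which relied on neural networks. The toy corpus, the window size, and the helper names `context_vectors` and `cosine` are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: each entry is the sequence of built-in function
# heads found in one code sample (illustrative only, not project data).
corpus = [
    ["Plot", "Sin", "Range"],
    ["Plot", "Cos", "Range"],
    ["Map", "Sin", "Range"],
    ["Map", "Cos", "Range"],
    ["StringJoin", "ToString", "StringLength"],
    ["StringJoin", "ToUpperCase", "StringLength"],
]

def context_vectors(samples, window=2):
    """For each symbol, count the symbols seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in samples:
        for i, tok in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[tok][tokens[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

vectors = context_vectors(corpus)
# Sin and Cos appear in the same contexts, so their vectors point the same way.
sim_sin_cos = cosine(vectors["Sin"], vectors["Cos"])
# Sin and StringJoin share no context symbols in this toy corpus.
sim_sin_str = cosine(vectors["Sin"], vectors["StringJoin"])
print(sim_sin_cos, sim_sin_str)
```

In this toy setting, functions used in interchangeable positions end up close together, while functions from unrelated code stay apart. A learned embedding aims to capture the same effect with dense, low-dimensional vectors.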

Summary of Results

In this project, we explored different forms of embeddings for the Wolfram Language. To do so, we gathered a large number of Wolfram Language code samples, cleaned them and made them interpretable, trained a couple of classical neural network architectures, experimented with a state-of-the-art method for source code embeddings, and produced 2D images of the embedding vector space.
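The cleaning step is not described in detail, so the following is only a guess at what a very simple preprocessing pass might look like. It strips string literals and comments from a Wolfram Language snippet and then extracts capitalized symbol names used as function heads. The regular expressions are heuristics chosen for this example: they do not handle nested comments or operator forms such as `f @ x`, and the function name `extract_heads` is hypothetical.

```python
import re

# Heuristic patterns (assumptions for this sketch, not a full parser).
STRING_RE = re.compile(r'"(?:\\.|[^"\\])*"')       # double-quoted string literals
COMMENT_RE = re.compile(r"\(\*.*?\*\)", re.S)      # non-nested (* ... *) comments
# Capitalized symbol directly followed by a single "[" (so "[[" Part access is skipped).
HEAD_RE = re.compile(r"[A-Z][A-Za-z0-9]*(?=\[(?!\[))")

def extract_heads(code):
    """Return capitalized function heads in order of appearance."""
    code = STRING_RE.sub("", code)    # drop strings so their contents are not matched
    code = COMMENT_RE.sub("", code)   # drop comments for the same reason
    return HEAD_RE.findall(code)

heads_simple = extract_heads("Plot[Sin[x], {x, 0, Pi}]")
heads_noisy = extract_heads('Map[StringLength, {"Plot[x]"}] (* Sin[x] *)')
print(heads_simple, heads_noisy)
```

Sequences of heads like these could serve as the token stream for an embedding model. A more faithful pipeline would parse the code into its symbolic expression tree rather than rely on regular expressions.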

Future Work

  • Find the optimal neural network architecture and parameters for building vector representations of Wolfram Language code
  • Evaluate other state-of-the-art approaches to source code embeddings [1]
  • Use the symbolic structure of the Wolfram Language to represent programs in a form designed specifically for machine learning

[1] Z. Chen and M. Monperrus, “A Literature Study of Embeddings on Source Code,” arxiv.org/abs/1904.03061.