Wolfram Summer School

Max Sakharov

Class of 2014

Bio

Max is currently working on his PhD at Bauman MSTU. His research is devoted to the development of a population-based optimization method for distributed computing systems. He graduated from BMSTU in 2013 as a CAD/CAE/PLM engineer and has studied various programming languages, including C/C++ and C#. For his master’s thesis, he developed and implemented a sequential memetic optimization algorithm based on the method of mind evolutionary computation. He currently lectures on nonlinear continuous optimization and works with his students on their term projects. Max also works as an application and sales engineer for Wolfram products at Softline Group, where he has implemented parallel processing algorithms for several customers using Mathematica’s built-in parallelization tools and CUDALink.

Max is interested in many different things, from basketball to quantum mechanics. He enjoys learning different foreign languages, reading, cooking tasty food and, as it may be clear by this point, playing basketball.

Project: Functionality for automated distributed optimization in machine learning problems

The main goal of this project is to develop a function, presumably for internal use (at least in its simplest version), for automated optimization of the various classes of objective functions that arise in machine learning problems. Different problems (for example, feedforward neural networks, linear and logistic regression, and so on) lead to different objective functions, and in each case one group of optimization methods can be more efficient than another. In such a situation it is natural to devise a heuristic, adaptive strategy that governs various local optimization methods and applies the most efficient one depending on the problem at hand, the search domain, the complexity and availability of derivatives, and so on.

Computing objective functions or their gradients may also be prohibitively expensive. For instance, when training a recursive neural network with a huge number of parameters, which is common in, say, natural language processing, a single iteration can take hours. In this context it would also be interesting to implement this sort of training in parallel, so that optimization is performed on loosely coupled computing nodes, where each node owns only a limited amount of data.
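To make the idea of an adaptive strategy concrete, here is a minimal sketch in the Wolfram Language, assuming a hypothetical dispatcher (the name adaptiveMinimize and the two particular methods are illustrative choices made here, not the project's actual implementation): if a usable symbolic gradient of the objective can be obtained, a gradient-based local method is applied; otherwise the sketch falls back to a derivative-free, population-based global method.

(* Minimal sketch of an adaptive method dispatcher; adaptiveMinimize is a
   hypothetical name, and the two methods are only illustrative choices. *)
adaptiveMinimize[f_, vars_List, x0_List] := Module[{grad},
  grad = D[f, {vars}];  (* attempt a symbolic gradient *)
  If[FreeQ[grad, Derivative],
   (* gradient available: local quasi-Newton search from the starting point *)
   FindMinimum[f, Transpose[{vars, x0}], Method -> "QuasiNewton"],
   (* no usable gradient: derivative-free, population-based global search *)
   NMinimize[f, vars, Method -> "DifferentialEvolution"]]]

(* Example: a smooth objective is routed to the gradient-based branch *)
adaptiveMinimize[(x - 2)^2 + (y + 1)^2, {x, y}, {0., 0.}]

A distributed version could, for instance, evaluate the objective for different members of a population on different kernels with ParallelMap, so that each node works only with its own share of the data.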

Favorite Outer Totalistic r=1, k=2 2D Cellular Automaton

Rule 86492
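For reference, this rule can be explored directly in the Wolfram Language; the sketch below (the single-cell seed and the 50-step count are arbitrary choices made here) evolves the 2D outer totalistic rule 86492 with k = 2 and r = 1 and plots the resulting configuration.

(* Evolve rule 86492 (2D outer totalistic, k = 2, range 1) from a single
   live cell on a zero background and plot the final step. *)
ArrayPlot[
 Last@CellularAutomaton[
   {86492, {2, {{2, 2, 2}, {2, 1, 2}, {2, 2, 2}}}, {1, 1}},
   {{{1}}, 0}, 50]]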