Hikari Sorensen has just finished her freshman year at Harvard College, where she studies mathematics and computer science. She really likes thinking about the structures underlying real phenomena, and especially loves abstracting things, considering things in their abstract form, which includes sentences in bios in which she talks about stuff she likes. She likes meta stuff, and also meta-meta stuff, and also meta-meta-meta stuff…
When her head’s not way up in the clouds (which, frankly, is not very often), she likes to do machine learning, or sing songs, or catch Pokémon, or (with nonzero probability) do all three at the same time. She also likes writing and talking philosophy, she supposes, although with regard to the latter, beware if you want to engage her, because she’s been known to get pretty heated in debate about these kinds of things. She’s most recently been enjoying how traditionally first-person expressions in writing sound when done in third person.
Project: Learning Weights in LeNet
Goal of the project:
Examine the change in weights in LeNet over the training process.
Summary of work:
LeNet was trained on MNIST and CIFAR-10, with weights and gradients saved at checkpoints along the training process. The weights and gradients were then visualized as images, upon which image difference and entropy analysis was conducted. The convolved images after training were also found, creating a visualization of the feature set upon which the weights were acting.
Results and future work:
- MNIST is a sufficiently simple dataset whose filters undergo very little change from randomness; most of the learning is in the linear layers. That is, logistic regression is sufficient to achieve high accuracy on MNIST digit classification, and the convolutional layers serve primarily as a way to compress image information efficiently.
- Training on CIFAR-10 results in filter learning. In this case, the random initializations of the filter weights have small magnitudes, which increase over training. This can be interpreted as maximizing image entropy of filters over the training process.
- Train sets of neural networks with different combinations of frozen layers, using different datasets and different network architectures.
- Create dynamic visualizations that update alongside network training, so that weight updates can be seen in real time.
- Try this kind of visualization for RNNs.