Assignment 7: Neural Networks and Reverse-Mode Automatic Differentiation

Assignment Description

In this assignment, you will implement a minimal library for reverse-mode automatic differentiation and use it to implement and train a neural network to recognize digits in (a subset of) the MNIST-digits dataset.

Answer the questions below in a plain-text file (answers.txt), a Markdown file (answers.md), or a PDF (answers.pdf). I will not accept Microsoft Word, OS X Pages, or OpenOffice documents. (I prefer Markdown, since I can read it directly in your repository on GitHub.)

In addition, submit whatever code you use to answer the questions below.

Implementation

Finish the implementation of reverse-mode autodiff in autodiff.py, and write the helper functions relu() and softmax().
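
To make the goal concrete, here is a minimal sketch of a scalar reverse-mode autodiff node together with relu() and softmax() written against it. The class name Var, its methods, and the exact signatures are assumptions for illustration only; adapt them to whatever structure the starter code in autodiff.py actually uses.

```python
import math

class Var:
    """Scalar node in a computation graph. Each node stores its forward value
    and, for every parent it was computed from, the local partial derivative."""

    def __init__(self, value, parents=()):
        self.value = value          # result of the forward computation
        self.grad = 0.0             # dL/d(this node), filled in by backward()
        self._parents = parents     # sequence of (parent Var, local derivative)

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __sub__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value - other.value, [(self, 1.0), (other, -1.0)])

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def __truediv__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value / other.value,
                   [(self, 1.0 / other.value),
                    (other, -self.value / other.value ** 2)])

    def exp(self):
        e = math.exp(self.value)
        return Var(e, [(self, e)])

    def log(self):
        return Var(math.log(self.value), [(self, 1.0 / self.value)])

    def backward(self):
        # Visit nodes in reverse topological order and apply the chain rule.
        order, seen = [], set()

        def visit(v):
            if id(v) not in seen:
                seen.add(id(v))
                for parent, _ in v._parents:
                    visit(parent)
                order.append(v)

        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            for parent, local in v._parents:
                parent.grad += v.grad * local


def relu(x):
    # ReLU: max(0, x); derivative is 1 where x > 0 and 0 elsewhere.
    return Var(max(0.0, x.value), [(x, 1.0 if x.value > 0 else 0.0)])


def softmax(xs):
    # Numerically stable softmax over a list of Vars: shift by the max before exp.
    m = max(v.value for v in xs)
    exps = [(x - m).exp() for x in xs]
    total = exps[0]
    for e in exps[1:]:
        total = total + e
    return [e / total for e in exps]
```

As a quick sanity check, the gradient of a softmax cross-entropy loss with respect to the logits should come out to p minus the one-hot vector for the true class:

```python
logits = [Var(2.0), Var(0.5), Var(-1.0)]
probs = softmax(logits)
loss = Var(0.0) - probs[0].log()            # cross-entropy with true class 0
loss.backward()
print([round(v.grad, 4) for v in logits])   # approximately [p0 - 1, p1, p2]
```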

Implement a fully-connected multi-layer neural network (with ReLU nonlinearities) to classify the mnist-digits dataset. Use the multiclass cross-entropy loss to train your neural network. Train the network using simple stochastic gradient descent with a mini-batch size of 1. Experiment with at least three different neural network architectures and at least two different numbers of layers. You will also need to experiment with the learning rate to find a good value.
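
As a point of reference for the shapes and gradient formulas involved, here is a small NumPy sketch of one SGD step (mini-batch size 1) for a hypothetical 784 → 128 → 10 network with a ReLU hidden layer and softmax cross-entropy loss. The layer sizes and learning rate are placeholders, and in your actual solution the backward-pass gradients should come from your autodiff library rather than the hand-derived expressions shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical architecture: 784 input pixels, 128 hidden units, 10 classes.
n_in, n_hidden, n_out = 784, 128, 10
W1 = rng.normal(0.0, 0.01, (n_hidden, n_in)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.01, (n_out, n_hidden)); b2 = np.zeros(n_out)
lr = 0.01   # learning rate: only a starting guess; tune it as the assignment asks

def sgd_step(x, y):
    """One SGD update on a single example x (shape (784,)) with label y (int)."""
    global W1, b1, W2, b2
    # Forward pass.
    z1 = W1 @ x + b1
    h = np.maximum(z1, 0.0)                     # ReLU
    z2 = W2 @ h + b2
    p = np.exp(z2 - z2.max()); p /= p.sum()     # numerically stable softmax
    loss = -np.log(p[y])                        # multiclass cross-entropy
    # Backward pass (the gradients your autodiff library should reproduce).
    dz2 = p.copy(); dz2[y] -= 1.0               # dloss/dz2 = p - onehot(y)
    dW2, db2 = np.outer(dz2, h), dz2
    dz1 = (W2.T @ dz2) * (z1 > 0)               # backprop through ReLU
    dW1, db1 = np.outer(dz1, x), dz1
    # Parameter update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
    return loss

# Call shape only: a random "image" and an arbitrary label.
print(sgd_step(rng.random(n_in), 3))
```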

During training, monitor the misclassification rate on the validation dataset and keep the model that performs best on it over the epochs you run. (You can do this selection manually.)
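
One way to organize this is a small wrapper that trains for a fixed number of epochs and snapshots the parameters whenever the validation misclassification rate improves. The sketch below assumes you supply your own train_epoch, predict, and params objects; those names are placeholders, not part of the starter code.

```python
import copy

def train_with_model_selection(train_epoch, predict, params, X_val, y_val, n_epochs=30):
    """Run SGD for n_epochs and keep the parameters with the lowest
    misclassification rate on the validation set seen so far."""
    best_err, best_params = float("inf"), copy.deepcopy(params)
    for epoch in range(n_epochs):
        train_epoch()                          # one full SGD pass over the training set
        n_wrong = sum(predict(x) != y for x, y in zip(X_val, y_val))
        err = n_wrong / len(y_val)
        print(f"epoch {epoch}: validation misclassification rate = {err:.3f}")
        if err < best_err:
            best_err, best_params = err, copy.deepcopy(params)   # snapshot the best model
    return best_params, best_err
```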

Questions

Hints

Data

The dataset for this assignment comes from LeCun, Cortes, and Burges.