CSCI 497P/597P: Computer Vision, Winter 2019
Project 5: ConvNets

Brief

  • Assigned: Tuesday, March 5, 2019
  • Due: Sunday, March 10, 2019 (9:59pm) (submit via Canvas)
  • Teams: This assignment should be done in teams of 2 students.

Synopsis

In this project, we will be visualizing and manipulating AlexNet [1].

For this project, we are using PyTorch, an open-source deep learning library that supports efficient implementation of convolutional neural networks (CNNs). Other similar libraries include Torch, Theano, Caffe, and TensorFlow.

Some parts of this assignment were adapted from or inspired by a Stanford cs231n assignment. The parts that are similar have been modified and ported to PyTorch. Thanks are due to the assignment's original creators from Stanford, as well as Noah Snavely, Kavita Bala, and various TAs who have further developed and refined this assignment.

The assignment is contained in an IPython Notebook; see below.

[1] Krizhevsky et al., "ImageNet Classification with Deep Convolutional Neural Networks", NIPS 2012.

Colab

Google Colab (Colaboratory) is a useful tool for machine learning that runs entirely in the cloud and is therefore easy to set up. It is very similar to the Jupyter notebook environment. Notebooks are stored on the user's Google Drive and, like Docs, Sheets, and other files on Drive, can be shared among users for collaboration (you'll need to share your notebooks, since you'll be working in a team of 2).

TODO

To be done in teams of 2.

There are many pieces to the assignment, but each piece is just a few lines of code. You should expect to write fewer than 10 lines of code for each TODO.

  1. Visualize AlexNet structure (TODO 1).
  2. Classify Dogs vs Food (TODO 2, 3).
  3. Visualize class saliency (TODO 4).
  4. Fool AlexNet into making wrong predictions (TODO 5).
  5. Visualize a learned class (TODO 6).
  6. (597P Only) Invert AlexNet features (TODO 7a, 7b).
  7. (597P Only) Devise and train your own neural network architecture to crush the MNIST dataset.

Tests: to verify the correctness of your solutions, you can run the tests at the very end of the notebook.

Detailed instructions for each TODO are given in the notebooks.

Setup

  1. There will be two Colab notebooks, Alexnet and MNIST Challenge. 497P students need only the Alexnet notebook; 597P students, or others interested in extra credit, will need both.
  2. Download (you may need to Right Click/Save As) the appropriate notebook(s) using these links:
  3. Upload the requisite notebook(s) to Google Drive (WWU has Google Drive, so you can log in with your WWU credentials).
  4. Double-clicking the notebook in Google Drive should give you an option to open it in Colab.
  5. Alternatively, you can open Colab directly and upload the notebook via:
    File -> Upload notebook...
  6. If you haven't used Colab or Jupyter Notebooks before, first read the Colaboratory welcome guide.
  7. You will find the rest of the instructions in the notebook. As already stated, Colab requires almost no setup, so there is no need to install PyTorch locally.

MNIST Challenge

597P students must design and train their own neural network for the MNIST dataset. You will be given an example network, and your aim should be to improve on its accuracy while staying under the specified parameter limit. Look at the MNIST notebook for skeleton code and more instructions.
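When planning an architecture against a parameter limit, it can help to count learnable parameters by hand before training anything. The sketch below does this for a hypothetical small network; the layer sizes and the `PARAM_LIMIT` value are illustrative assumptions, not the notebook's example network or its actual limit.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Learnable parameters in a 2D convolution with a square kernel."""
    weights = out_channels * in_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


def linear_params(in_features, out_features, bias=True):
    """Learnable parameters in a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


# Hypothetical network for 1x28x28 MNIST digits (NOT the notebook's example).
# Assumes each 3x3 convolution uses padding 1 and is followed by 2x2 pooling,
# so the spatial size goes 28 -> 14 -> 7 before the final linear layer.
layer_counts = {
    "conv1 (1 -> 8, 3x3)": conv2d_params(1, 8, 3),
    "conv2 (8 -> 16, 3x3)": conv2d_params(8, 16, 3),
    "fc (16*7*7 -> 10)": linear_params(16 * 7 * 7, 10),
}
total_params = sum(layer_counts.values())

PARAM_LIMIT = 50_000  # placeholder value; use the limit stated in the notebook
within_limit = total_params <= PARAM_LIMIT

for name, count in layer_counts.items():
    print(f"{name}: {count}")
print(f"total: {total_params}, within limit: {within_limit}")
```

For an actual PyTorch model, `sum(p.numel() for p in model.parameters())` gives the corresponding total, which is a quick way to check a hand count like this one.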

What to hand in?

  1. Execute your completed notebook, then download it as an .ipynb file:
    File -> Download .ipynb
  2. Upload the completed notebook(s) to Canvas.

What should my images look like?

This section contains images to illustrate what kinds of qualitative results we expect. If your images do not match these perfectly, do not panic. If your code passes the tests at the bottom of the notebook, it is considered correct. In many cases, better images can be achieved by simply training for more iterations.

Saliency: we expect that pixels related to the class have a higher value. Left: Input image. Right: saliency.
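One common way to build such a map, sketched here as an assumption rather than as the notebook's required method, is to take the gradient of the class score with respect to every input pixel and keep the largest absolute value across color channels. The toy example below uses a hypothetical hand-written `class_score` function and a finite-difference gradient so that it runs without any libraries; in PyTorch the same gradient would normally come from autograd.

```python
def class_score(image):
    """Hypothetical stand-in for a network's class score.

    `image` is indexed as image[channel][row][col]. The score depends only
    on the top-left pixel, so only that pixel should come out salient.
    """
    return 2.0 * image[0][0][0] - 3.0 * image[1][0][0] + 0.5 * image[2][0][0]


def numerical_gradient(score_fn, image, eps=1e-4):
    """Central-difference estimate of d(score)/d(pixel) for every entry."""
    grad = [[[0.0 for _ in row] for row in channel] for channel in image]
    for c, channel in enumerate(image):
        for y, row in enumerate(channel):
            for x, value in enumerate(row):
                row[x] = value + eps
                plus = score_fn(image)
                row[x] = value - eps
                minus = score_fn(image)
                row[x] = value  # restore the original pixel
                grad[c][y][x] = (plus - minus) / (2 * eps)
    return grad


def saliency_map(grad):
    """Collapse channels: largest absolute gradient at each pixel."""
    height, width = len(grad[0]), len(grad[0][0])
    return [[max(abs(channel[y][x]) for channel in grad) for x in range(width)]
            for y in range(height)]


image = [[[0.5, 0.5], [0.5, 0.5]] for _ in range(3)]  # 3 channels, 2x2 pixels
saliency = saliency_map(numerical_gradient(class_score, image))
```

Here the top-left entry of `saliency` is close to 3 (the largest absolute coefficient in `class_score`) and every other entry is zero, which mirrors the expectation that pixels related to the class receive higher values.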

Fooling image

These images look nearly identical, and yet AlexNet classifies each image in the middle as "snail". If you look really closely, you can notice some tiny visual differences. The right image shows the difference magnified by 5x (with 0 re-centered at gray).
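The magnified difference image can be sketched as follows, assuming pixel values in [0, 1]: scale the per-pixel difference by 5, add 0.5 so that "no change" maps to mid-gray, and clamp to the displayable range. The clamping step and the function name `magnified_difference` are assumptions made for this illustration; the notebook may produce its figure differently.

```python
def magnified_difference(original, fooled, scale=5.0):
    """Map scale * (fooled - original) into [0, 1] with zero shown as mid-gray.

    Both images are nested lists of floats in [0, 1] with identical shape
    (rows of pixel values for a single channel).
    """
    def clamp(value):
        return min(1.0, max(0.0, value))

    return [[clamp(0.5 + scale * (f - o)) for o, f in zip(row_o, row_f)]
            for row_o, row_f in zip(original, fooled)]


# Tiny single-channel example with hypothetical pixel values.
original = [[0.20, 0.40], [0.60, 0.80]]
fooled = [[0.20, 0.42], [0.58, 1.00]]
diff_vis = magnified_difference(original, fooled)
```

Unchanged pixels come out as exactly 0.5 (gray), small increases appear slightly brighter, small decreases slightly darker, and large changes saturate at 0 or 1.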

Class visualization

AlexNet classifies each of these images as belonging to its respective class with 100% confidence. If you run the optimization for longer or adjust the hyperparameters, you may see a more salient result.

Many classes don't give very good results; here we show some of the better classes.

strawberry, throne, mushroom
tarantula, flamingo, king penguin
goblet, sax, llama
cloak, moped, indigo bunting
bulbul, squirrel monkey, cock

Feature inversion (597P Only)

Note that we could probably obtain higher quality reconstructions if we ran the optimization for longer, or added a better regularizer. To keep things simple, your images only need to be mostly converged.

original, conv1, conv2
conv3, conv4, conv5
fc6, fc7, fc8