CS 497P / 597P: Computer Vision, Fall 2020
Project 4: ConvNets

Brief

  • Assigned: Tuesday, November 17, 2020
  • Due: Monday, November 30, 2020 (9:59pm) (submit via Canvas)
  • Teams: This assignment may be done individually or in teams of 2.

Synopsis

In this project, we will be visualizing and manipulating AlexNet [1].

For this project, we are using PyTorch, an open-source deep learning library that allows an efficient implementation of CNNs. Other similar libraries include Torch, Caffe, and TensorFlow.

Some parts of this assignment were adapted from or inspired by a Stanford cs231n assignment. The parts that are similar have been modified heavily and ported to PyTorch. Thanks are due to the assignment's original creators from Stanford, as well as Noah Snavely, Kavita Bala, and various TAs who have further developed and refined this assignment.

The assignment is contained in an IPython Notebook; see below.

[1] Krizhevsky et al., "ImageNet Classification with Deep Convolutional Neural Networks", NIPS 2012.

Colab

Google Colab (Colaboratory) is a useful tool for machine learning that runs entirely in the cloud and is therefore easy to set up. It is very similar to the Jupyter notebook environment. Notebooks are stored in the user's Google Drive and, like docs, sheets, and other files on Drive, can be shared among users for collaboration (you'll need to share your notebooks if you are working in a team of 2).

Setup

  1. You can find the Colab notebooks for this assignment in this Google Drive folder.
  2. There will be two Colab notebooks, Main and MNIST Challenge. The MNIST challenge notebook is for 597P students or those undergrads looking to complete extra credit.
  3. For each notebook you intend to work on, create a copy in your own Google Drive by right-clicking on the notebook file and selecting "Make a Copy".
  4. If you're working with a partner:
    • Share the notebook with your partner using Google Drive's sharing settings.
    • Choose and join a group on Canvas from the P4 Groups.
  5. Double-clicking the notebook in Google Drive should give you an option to open it in Colab.
  6. Alternatively, you can download the notebook and upload it directly to Colab:
    File -> Upload notebook...
  7. If you haven't used Colab or Jupyter Notebooks before, first read the Colaboratory welcome guide. In particular, it's useful to know about features in the Runtime menu like "Run all", "Run before", and so on, to save you time on executing various notebook cells.
  8. You will find the rest of the instructions in the notebook. Colab requires almost no setup, so there is no need to install PyTorch locally.

TODO

There are several tasks in this assignment, but each piece is just a few lines of code. You should expect to write fewer than 10 lines of code for each TODO.

  1. Visualize AlexNet structure (TODO 1).
  2. Classify Dogs vs Food (TODO 2, 3).
  3. Visualize class saliency (TODO 4).
  4. Fool AlexNet into making wrong predictions (TODO 5).
  5. Visualize a learned class (TODO 6).
  6. (597P Only) Implement a neural net and train it on the MNIST dataset.

Tests: to verify the correctness of your solutions, you can run the tests at the very end of the notebook.

Detailed instructions for each task are given in the notebooks.

MNIST Challenge

597P students must design and train their own neural network to classify handwritten digits in the MNIST dataset. You will be given an example network, and your aim should be to improve its accuracy while staying under the specified parameter limit. See the MNIST notebook for skeleton code and further instructions.
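To check a design against a parameter limit, it helps to count trainable parameters before training. The following is a minimal sketch, not the notebook's example network: the architecture, the `count_parameters` helper, and the `PARAMETER_LIMIT` value are all hypothetical, and the real limit is the one stated in the MNIST notebook.

```python
import torch.nn as nn

# Hypothetical small network for 28x28 grayscale digits.
# This is NOT the example network from the MNIST notebook.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),  # 1*8*3*3 weights + 8 biases = 80
    nn.ReLU(),
    nn.MaxPool2d(2),                            # 28x28 -> 14x14, no parameters
    nn.Flatten(),
    nn.Linear(8 * 14 * 14, 10),                 # 1568*10 weights + 10 biases = 15690
)


def count_parameters(model):
    """Count the trainable parameters of a PyTorch module."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


PARAMETER_LIMIT = 20_000  # hypothetical value; use the limit given in the notebook
num_params = count_parameters(model)
print(num_params, "parameters; under limit:", num_params <= PARAMETER_LIMIT)
```

Most of the parameters here sit in the final linear layer, so pooling more aggressively before flattening is usually the cheapest way to free up budget for extra convolutional layers.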

Submission

  1. Execute all cells in your completed notebook file and download it as ipynb:
    File -> Download .ipynb
  2. Upload the completed notebook to Canvas.
  3. If you are doing the MNIST Challenge, submit that notebook as well.

Rubric

Each correctly implemented TODO is worth 5 points, for a total of 30 points for 497P and 35 points for 597P. Correctness will be checked using the tests at the bottom of the notebook.

What should my images look like?

This section contains images to illustrate what kinds of qualitative results we expect. Your results may not be identical. However, your solution should (a) pass the tests at the bottom of the notebook and (b) make sense when you run the given part of the notebook. For example, the fooling images should be classified incorrectly with high confidence and look nearly indistinguishable from the original, and your flamingo class visualization should have noticeable flamingo-like structure.

Saliency: we expect that pixels related to the class have a higher value. Left: Input image. Right: saliency.
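One common way to obtain such a map is to take the gradient of the class score with respect to the input pixels, then take the absolute value and the maximum over color channels. The sketch below assumes that formulation and uses a tiny hypothetical stand-in network rather than AlexNet; `compute_saliency` is an illustrative name, not the notebook's function.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in classifier: a tiny conv net, NOT AlexNet.
model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 10),
)
model.eval()


def compute_saliency(model, image, target_class):
    """Return an (H, W) map of |d score / d pixel|, maxed over color channels."""
    image = image.clone().requires_grad_(True)  # track gradients w.r.t. pixels
    scores = model(image.unsqueeze(0))          # add batch dim -> (1, num_classes)
    scores[0, target_class].backward()          # gradient of the class score only
    return image.grad.abs().max(dim=0).values   # collapse channels -> (H, W)


image = torch.rand(3, 16, 16)                   # random stand-in for an input image
saliency = compute_saliency(model, image, target_class=3)
print(saliency.shape)
```

Pixels with large values in `saliency` are those where a small change would move the class score the most.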

Fooling image

These images look nearly identical, and yet AlexNet will classify each image in the middle as "snail". If you look really closely, you can notice some tiny visual differences. The right image shows the difference magnified by 5x (with 0 re-centered at gray).
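A fooling image of this kind can be produced by gradient ascent on the target class score, stopping as soon as the prediction flips. The following is a sketch under stated assumptions: the model is a hypothetical linear stand-in rather than AlexNet, and `make_fooling_image`, `step_size`, and `max_steps` are illustrative choices, not the notebook's interface. The last line shows one way to build the magnified difference image.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in classifier (NOT AlexNet): one linear layer over pixels.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 5))
model.eval()


def make_fooling_image(model, image, target_class, step_size=0.1, max_steps=200):
    """Nudge `image` by gradient ascent until `model` predicts `target_class`."""
    fooling = image.clone().requires_grad_(True)
    for _ in range(max_steps):
        scores = model(fooling.unsqueeze(0))
        if scores.argmax(dim=1).item() == target_class:
            break                                      # the model is fooled; stop
        scores[0, target_class].backward()
        with torch.no_grad():
            grad = fooling.grad
            fooling += step_size * grad / grad.norm()  # normalized ascent step
            fooling.grad.zero_()
    return fooling.detach()


image = torch.rand(3, 8, 8)                            # random stand-in image
original_class = model(image.unsqueeze(0)).argmax(dim=1).item()
target_class = (original_class + 1) % 5                # any class other than the original
fooling = make_fooling_image(model, image, target_class)

# Difference magnified by 5x, with 0 re-centered at gray (0.5) for display.
difference = (0.5 + 5 * (fooling - image)).clamp(0, 1)
```

Normalizing the gradient keeps every step the same size, which makes the total perturbation easy to bound by `step_size * max_steps`.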

Class visualization

AlexNet classifies each of these images as belonging to its labeled class with 100% confidence. If you run these for longer or adjust the hyperparameters, you may see a more salient result.

Many classes don't give very good results; here we show some of the better classes.

strawberry | throne | mushroom
tarantula | flamingo | king penguin
goblet | sax | llama
cloak | moped | indigo bunting
bulbul | squirrel monkey | cock
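Class visualizations like those described above are typically generated by starting from random noise and running gradient ascent on the class score with some regularization on the image. The sketch below assumes a plain L2 penalty and a hypothetical linear stand-in model rather than AlexNet; `visualize_class` and its hyperparameters are illustrative, and the notebook may use different regularizers.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in classifier (NOT AlexNet).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 5))
model.eval()


def visualize_class(model, target_class, steps=100, learning_rate=0.5, l2_reg=1e-3):
    """Gradient ascent from noise on the class score minus an L2 penalty."""
    image = torch.randn(1, 3, 8, 8, requires_grad=True)  # start from random noise
    for _ in range(steps):
        score = model(image)[0, target_class]
        objective = score - l2_reg * image.pow(2).sum()   # penalize large pixel values
        objective.backward()
        with torch.no_grad():
            image += learning_rate * image.grad           # ascend the objective
            image.grad.zero_()
    return image.detach()


target_class = 2
class_image = visualize_class(model, target_class)
predicted_class = model(class_image).argmax(dim=1).item()
print(predicted_class)
```

Running for more steps or changing `learning_rate` and `l2_reg` trades off how strongly the class score is maximized against how large the pixel values grow.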