CSCI 476/576 Project 4: Neural Radiance Fields

Scott Wehrwein

Winter 2024

Overview

In this assignment, you’ll implement a (slightly simplified) version of the technique described in the influential 2020 paper NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis by Mildenhall et al. Given a collection of images taken from multiple perspectives, along with pose information for those cameras, NeRF is an elegant technique that encodes a 3D volumetric representation of the scene in a neural network.

Dates

Assigned: March 5th, 2024

Deadline: March 11th, 2024 at 10:00pm

You will complete this assignment individually.

Setup

Skeleton Code

Accept the Project 4 assignment on Github Classroom and clone your repository. This repository contains the assignment’s single Jupyter notebook, which contains detailed instructions as well as skeleton code and TODOs for you to complete.

Hardware

This project requires the use of a GPU to train NeRF models using the Pytorch framework. The recommended approach is to use one of the lab machines in CF 420, CF 162, or CF 164, which have NVIDIA GeForce RTX-3060 GPUs with 12GB of video RAM.

There are alternative approaches that should work, but for which I cannot guarantee support.

You can confirm that you’re running on a machine with an NVIDIA GPU - and make sure that nobody else is using it heavily - by running nvidia-smi. You should see something like the following:

$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
|  0%   36C    P8     6W / 170W |     78MiB / 12288MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     14191      G   /usr/lib/xorg/Xorg                 56MiB |
|    0   N/A  N/A     14239      G   /usr/bin/gnome-shell               17MiB |
+-----------------------------------------------------------------------------+

In this example, the GPU is not in use except by Xorg and the window manager: under Memory-Usage it shows 78MiB / 12288MiB (roughly 0 of 12GB used), and GPU-Util is at 0%.

If you encounter OutOfMemoryErrors, you can restart the kernel from the Kernel menu in Jupyter. You can also watch nvidia-smi to monitor your memory usage and watch with satisfaction as your GPU utilization goes to 100% during training.

GPU Etiquette

To avoid holding resources unnecessarily, please don’t leave a Jupyter server running idle for long periods of time, especially if it’s running a notebook (such as this project’s) with a lot of video memory allocated. After a session, shut down the Jupyter server and restart it the next time you come back to work on the notebook.

Software

This project has similar software requirements to the prior projects, but also requires a few additional packages. The necessary Python requirements are given in requirements.txt, included in your repo. To set up an environment specifically for p4, go to your repo directory and run:

python3 -m venv p4env
source p4env/bin/activate
pip install -r requirements.txt

It may take a few minutes to install everything. If you want to augment an existing environment instead, activate that environment and run pip install -r requirements.txt there.

To run the notebook - as with the lecture notebooks - you’ll need to run a Jupyter server. If you’re sitting at the machine you’re running on, run the following in the repo directory (with the virtualenv from above activated):

jupyter lab

If you’re running remotely, then ssh to the machine you want to run on, then run the following in the repo directory (with the virtualenv from above activated):

jupyter lab --no-browser --port=PPPP

with your choice of 4-digit port number replacing PPPP.

Then, on your local machine, set up an SSH tunnel to map the remote machine’s server to a port on your local machine:

ssh -NL PPPP:localhost:PPPP remotehost

This command should just run and appear to hang - that means the tunnel’s open.

Finally, open a browser on your local machine and go to the link shown when you started up the server; it should look something like this:

http://localhost:PPPP/lab?token=a32821f9c655502c8c84535443f85cad2fca9fedecddb087

More detailed instructions for running Jupyter remotely can be found in the 476-lecture github repo’s readme.

Data

The input and test data for this project will be downloaded (and cached) by the notebook as needed. I recommend running the notebook from the same place when possible (or moving all the downloads along with it) to avoid having to re-download files. Do not commit the data files to your Github repository.

Tasks - Overview

There are several pieces we ask you to implement in this assignment:

  1. TODO 1: Fit a single 2D image using an MLP, with and without positional encoding
  2. TODO 2: Compute the origin and direction of rays through all pixels of an image
  3. TODO 3: Sample 3D points along the camera rays
  4. Implemented for you: Compute the compositing weights of samples along each camera ray
  5. TODO 4: Render an image with NeRF

Detailed instructions for each TODO are given in the notebook.
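For TODO 1, the positional encoding from the NeRF paper maps each input coordinate p to sines and cosines at geometrically increasing frequencies. Here is a minimal numpy sketch; the function name and the exact frequency convention (2^k * pi) are assumptions, and the notebook may use a slightly different variant:

```python
import numpy as np

def positional_encoding(x, num_freqs):
    """Encode each coordinate p as (sin(2^k * pi * p), cos(2^k * pi * p))
    for k = 0, ..., num_freqs - 1. Input x has shape (..., D); the output
    has shape (..., D * 2 * num_freqs)."""
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi  # (num_freqs,)
    angles = x[..., None] * freqs                  # (..., D, num_freqs)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(*x.shape[:-1], -1)
```

Higher frequency counts let the MLP represent finer detail, which is exactly what the 2D image-fitting experiment in TODO 1 is designed to show.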
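For TODO 2, rays can be generated by unprojecting each pixel through a pinhole camera model and rotating the result into world space. A hedged sketch assuming the convention used by the original NeRF code (camera looks down -z, y up, c2w is a 4x4 camera-to-world matrix); the function name and pixel-center handling here are illustrative, and the notebook’s conventions may differ:

```python
import numpy as np

def get_rays(H, W, focal, c2w):
    """Return ray origins and directions for every pixel of an H x W image."""
    j, i = np.meshgrid(np.arange(H, dtype=np.float64),
                       np.arange(W, dtype=np.float64), indexing="ij")
    # Camera-space direction through each pixel (looking down -z, y up).
    dirs = np.stack([(i - 0.5 * W) / focal,
                     -(j - 0.5 * H) / focal,
                     -np.ones_like(i)], axis=-1)  # (H, W, 3)
    # Rotate into world space; all rays share the camera center as origin.
    rays_d = dirs @ c2w[:3, :3].T
    rays_o = np.broadcast_to(c2w[:3, 3], rays_d.shape)
    return rays_o, rays_d
```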
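For TODO 3, a simple strategy is to take evenly spaced depths between the near and far planes, optionally with stratified jitter so training sees different depths each iteration. A sketch under those assumptions (the function name and signature are made up for illustration; the notebook may jitter per ray rather than sharing one set of depths):

```python
import numpy as np

def sample_along_rays(rays_o, rays_d, near, far, n_samples, rng=None):
    """Sample 3D points at depths t in [near, far] along each ray o + t*d."""
    t = np.linspace(near, far, n_samples)
    if rng is not None:
        # Stratified sampling: jitter each depth within its bin.
        t = t + rng.uniform(0.0, (far - near) / n_samples, size=n_samples)
    # Points along each ray, shape (..., n_samples, 3).
    pts = rays_o[..., None, :] + t[:, None] * rays_d[..., None, :]
    return pts, t
```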
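The compositing step (implemented for you) uses the standard volume-rendering weights w_i = T_i * (1 - exp(-sigma_i * delta_i)), where T_i is the transmittance accumulated over earlier samples. A numpy sketch of that formula, for reference only; the notebook’s implementation may differ in details:

```python
import numpy as np

def compositing_weights(sigma, deltas):
    """w_i = T_i * (1 - exp(-sigma_i * delta_i)), where the transmittance
    T_i = exp(-sum_{j<i} sigma_j * delta_j) accumulates over earlier samples."""
    alpha = 1.0 - np.exp(-sigma * deltas)  # per-segment opacity
    # Transmittance: cumulative product of (1 - alpha) over earlier samples.
    trans = np.cumprod(np.concatenate(
        [np.ones_like(alpha[..., :1]), 1.0 - alpha[..., :-1]], axis=-1), axis=-1)
    return trans * alpha
```

The rendered color of a ray is then the weighted sum of the per-sample colors, and the weights also yield a depth map when applied to the sample depths.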

Testing

To help you verify the correctness of your solutions, we provide tests at the end of each TODO block, as well as qualitative and quantitative evaluation at the end of the notebook.

Expected Outputs

2D Image Fitting

With no positional encoding:

With 3 frequencies:

With 6 frequencies:

NeRF Training

The optimization for this scene takes around 1000 to 3000 iterations to converge on a single GPU (roughly 10 to 30 minutes). We use peak signal-to-noise ratio (PSNR) to measure the similarity between the prediction and the target image; we expect a converged model to reach a PSNR above 20 and produce a reasonable depth map. Here’s an output after 1000 iterations of training with default settings:
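PSNR follows directly from the mean squared error between the two images. A small sketch, assuming images scaled to [0, 1]:

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB, for images with values in [0, max_val]."""
    mse = np.mean((pred - target) ** 2)
    return 20.0 * np.log10(max_val) - 10.0 * np.log10(mse)
```

For example, two [0, 1] images differing by 0.1 everywhere have MSE 0.01 and thus a PSNR of 20 dB, right at the convergence threshold above.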

Here’s the resulting 360 video:

Submission

Execute your completed notebook in Jupyter and commit the resulting version of your .ipynb file to your github repository; make sure the cell outputs are contained in the notebook. Also commit the 360 video output (lego_spiral_001000_rgb.mp4) to your repository, and push to Github by the deadline.

Finally, fill out the P4 Survey on Canvas.

Rubric

Points are awarded for correctness and efficiency, and deducted for issues with clarity or submission mechanics.

Clarity: deductions may be made for poor coding style. Please see the syllabus for general coding guidelines.

Acknowledgements

This assignment is based on an assignment from Noah Snavely; thanks to Noah and numerous underappreciated TAs.