CSCI 497P/597P Project 3: Stereo

In this assignment, you’ll implement the two-view plane sweep stereo algorithm. Given two calibrated images of the same scene taken from different viewpoints, your task is to recover a rough depth map.

Dates

Teamwork

You may work on this assignment individually or in groups of two. If you would like to work in a pair, you need to complete the following steps by the end of the day on Friday, November 6th:

Setup

Skeleton Code

In the Project 3 assignment on Canvas, you will find a GitHub Classroom invitation link. Click this link to accept the Project 3 assignment invitation and create your personal repository for this project. Your repository already contains skeleton code.

Software

This project has similar software requirements to the prior projects. Two additional Python packages are used: nose and imageio.

Because these are pure Python packages, you can install them in a virtual environment. Here’s my suggested approach:

# create a virtual environment in the 497_p3_env dir; reuse system-installed packages:
wehrwes@linux-07:~/497/venvtest$ python3 -m venv 497_p3_env --system-site-packages

# activate the virtual environment:
wehrwes@linux-07:~/497/venvtest$ source 497_p3_env/bin/activate

# notice that my shell prompt now has (497_p3_env) prepended to it; install packages:
(497_p3_env) wehrwes@linux-07:~/497/venvtest$ pip install nose imageio
Collecting nose
  Using cached https://files.pythonhosted.org/packages/15/d8/dd071918c040f50fa1cf80da16423af51ff8ce4a0f2399b7bf8de45ac3d9/nose-1.3.7-py3-none-any.whl
Collecting imageio
  Using cached https://files.pythonhosted.org/packages/4c/2b/9dd19644f871b10f7e32eb2dbd6b45149c350b4d5f2893e091b882e03ab7/imageio-2.8.0-py3-none-any.whl
Requirement already satisfied: numpy in /usr/lib/python3/dist-packages (from imageio)
Requirement already satisfied: pillow in /usr/lib/python3/dist-packages (from imageio)
Installing collected packages: nose, imageio
Successfully installed imageio-2.8.0 nose-1.3.7
# all set up! go ahead and work on your project

# when done working, deactivate the environment to go back to real life:
(497_p3_env) wehrwes@linux-07:~/497/venvtest$ deactivate
wehrwes@linux-07:~/497/venvtest$

To view the results of turning your depth maps into 3D meshes, you can install a mesh viewer such as Meshlab. I recommend installing this locally and copying meshes to your home machine even if you’re developing and running your code in the lab environment.

Data

The input data is too large to include in the GitHub repository. Go into the data directory and run download.sh to download the required datasets (tentacle is hosted locally, while the remaining datasets are downloaded from the Middlebury Stereo page). You can also uncomment other lines to download and try out additional datasets if you’d like.

Alternatively, you can download these datasets in a web browser and extract them into the input directory (for tentacle) or the data directory (for all others). Here’s the direct link to the listing of Middlebury dataset zip files: https://vision.middlebury.edu/stereo/data/scenes2014/zip/.

Preview

where dataset is one of

('tentacle', 'Adirondack', 'Backpack',  'Bicycle1', 'Cable', 'Classroom1', 'Couch', 'Flowers', 'Jadeplant',  'Mask', 'Motorcycle', 'Piano', 'Pipes', 'Playroom', 'Playtable',  'Recycle', 'Shelves', 'Shopvac', 'Sticks', 'Storage', 'Sword1',  'Sword2', 'Umbrella', 'Vintage')

Keep in mind that, except for tentacle and Flowers, you’ll need to modify data/download.sh to download any other datasets before running your code on them.

the output will be in output/tentacle_{ncc.png,ncc.gif,depth.npy,projected.gif}.

The outputs of the plane-sweep stereo for the tentacle dataset should look like this:

The first animated gif is tentacle_projected.gif, which shows each rendering of the scene as a planar proxy is swept away from the camera.
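For intuition about what each frame of the sweep represents, here is a minimal sketch of the homography induced by a fronto-parallel plane at a given depth. It assumes the convention X2 = R X1 + t for mapping points from the first camera's frame to the second's; the names plane_homography, K1, K2, R, and t are illustrative and are not taken from the skeleton code, which may compute its warps differently.

```python
import numpy as np

def plane_homography(K1, K2, R, t, depth):
    """Homography mapping view-1 pixels to view-2 pixels for the plane Z = depth.

    Assumes (hypothetical convention) that a point X1 in camera-1 coordinates
    maps to camera-2 coordinates as X2 = R @ X1 + t.
    """
    n = np.array([0.0, 0.0, 1.0])  # plane normal, facing along the optical axis
    return K2 @ (R + np.outer(t, n) / depth) @ np.linalg.inv(K1)

# Made-up calibration: identical intrinsics, second camera shifted sideways.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([-0.1, 0.0, 0.0])
depth = 2.0
H = plane_homography(K, K, R, t, depth)

# Sanity check: a 3D point on the plane projects consistently through H.
X1 = np.array([0.3, -0.2, depth])
x1 = K @ X1
x1 = x1 / x1[2]                      # pixel in view 1 (homogeneous)
x2_direct = K @ (R @ X1 + t)
x2_direct = x2_direct / x2_direct[2]  # pixel in view 2 via direct projection
x2_via_H = H @ x1
x2_via_H = x2_via_H / x2_via_H[2]     # pixel in view 2 via the homography
```

Points that really lie on the plane land in the same place either way; points off the plane do not, which is why the matching score peaks at the correct depth.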

For this project, we use the Normalized Cross Correlation (NCC) measure for matching scores. The second animated gif is tentacle_ncc.gif, which shows slices of the NCC cost volume where each frame corresponds to a single depth. White is high NCC and black is low NCC.
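As a rough illustration of what NCC measures, the sketch below scores two grayscale patches by subtracting each patch's mean, scaling to unit length, and taking a dot product. This is a simplified version under my own assumptions (single channel, a small eps threshold for flat patches); the function specifications in student.py are the authority on the exact conventions you need to follow.

```python
import numpy as np

def normalize_patch(patch, eps=1e-6):
    """Flatten, subtract the mean, and scale to unit length.

    Near-constant patches (norm below eps) are mapped to all zeros so they
    never produce a spurious high score. The eps value is an assumption.
    """
    v = patch.astype(np.float64).ravel()
    v = v - v.mean()
    norm = np.linalg.norm(v)
    if norm < eps:
        return np.zeros_like(v)
    return v / norm

def ncc(patch_a, patch_b):
    """Normalized cross correlation of two same-sized patches, in [-1, 1]."""
    return float(np.dot(normalize_patch(patch_a), normalize_patch(patch_b)))

rng = np.random.default_rng(0)
a = rng.random((5, 5))
score_same = ncc(a, 3.0 * a + 2.0)      # same texture under gain and offset
score_inverted = ncc(a, -a)             # contrast-inverted texture
score_flat = ncc(a, np.ones((5, 5)))    # textureless patch
```

Because of the mean subtraction and normalization, the score is unaffected by brightness offsets and positive gain changes between the two views.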

The last image shows the correct depth output tentacle_ncc.png for the tentacle dataset, which is computed from the argmax depth according to the NCC cost volume. White is near and black is far.
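The argmax step can be sketched on a toy cost volume. The array layout here (depth layers along the first axis) and the variable names are assumptions made for illustration, not necessarily what plane_sweep_stereo.py uses.

```python
import numpy as np

# Hypothetical sweep: 3 depth layers over a 2x2 image.
depths = np.array([1.0, 2.0, 4.0])
ncc_volume = np.array([
    [[0.9, 0.1], [0.2, 0.3]],   # NCC scores at depth 1.0
    [[0.5, 0.8], [0.1, 0.3]],   # NCC scores at depth 2.0
    [[0.1, 0.2], [0.7, 0.4]],   # NCC scores at depth 4.0
])

# For each pixel, pick the layer with the highest NCC, then look up its depth.
best_layer = np.argmax(ncc_volume, axis=0)
depth_map = depths[best_layer]
```

Here depth_map comes out as [[1.0, 2.0], [4.0, 4.0]]: each pixel takes the depth of the layer where its NCC score is largest.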

Tasks

Most of the code you will implement is in student.py, with the exception of the last task, which is to complete the main body of the plane sweep loop in plane_sweep_stereo.py. It’s recommended that you start by taking a look through the well-commented plane_sweep_stereo.py to get an idea of where these functions fit in. The functions to be implemented have detailed specifications - see those for details of what you need to do.

Testing

You are provided with some test cases in tests.py. Feel free to run these with python tests.py to help you with debugging. There are unit tests for all the functions you write, but not for the main program. You can, however, check that your output on tentacle matches the results shown above.

If the code is running slowly while you’re debugging, you can speed things up by downsampling the datasets further, or computing fewer depth layers. In dataset.py, modify:

to change the downsampling factor applied to the Middlebury datasets. The output image will be of dimensions (height / 2^stereo_downscale_factor, width / 2^stereo_downscale_factor).
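To get a feel for how quickly the image shrinks, here is a small worked example of the formula above. The helper name downscaled_shape and the input size are made up, and rounding down with integer division is my assumption about how non-divisible sizes are handled.

```python
def downscaled_shape(height, width, stereo_downscale_factor):
    """Image shape after halving each dimension stereo_downscale_factor times.

    Integer (floor) division is an assumption; the dataset code may round
    differently when the size is not evenly divisible.
    """
    scale = 2 ** stereo_downscale_factor
    return height // scale, width // scale

# A hypothetical 1920x2880 input at a few downscale factors.
shape_f2 = downscaled_shape(1920, 2880, 2)   # (480, 720)
shape_f3 = downscaled_shape(1920, 2880, 3)   # (240, 360)
shape_f4 = downscaled_shape(1920, 2880, 4)   # (120, 180)
```

Each increment of the factor cuts the pixel count by a factor of four, so the per-layer work drops accordingly.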

Efficiency

We’ve configured the tentacle dataset such that it takes about 0.5-100 seconds to compute depending on your implementation. Because we’re using OpenCV to compute homographies and warp images, the main bottleneck will likely be preprocess_ncc. Some tips:

Mesh Reconstruction

There are no tasks for you to complete for this part, but the machinery is there and you’re welcome to try it out. Once you’ve computed depth for a scene, you can create a 3D model out of the depth map as follows:

You can open up output/<dataset>_depth.ply in Meshlab to see your own result for any of the datasets. Here’s the mesh from my tentacle result:

You may need to fiddle with settings to get the colors to show - try un-toggling the triangley-cylinder button two buttons right from the wireframe cube at the top of the screen.
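If you’re curious how a depth map turns into 3D geometry, the core idea is back-projection: each pixel is pushed out along its viewing ray by its depth. The sketch below is a generic illustration, not the code that ships with the project; it assumes the depth value is the Z coordinate in the camera frame, and the function name backproject and the intrinsics matrix K are made up for the example.

```python
import numpy as np

def backproject(depth, K):
    """Turn an (h, w) depth map into an (h, w, 3) array of camera-frame points.

    Assumes depth stores the Z coordinate of each pixel's 3D point.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    rays = pixels @ np.linalg.inv(K).T   # K^-1 applied to every pixel
    return rays * depth[..., None]       # scale each ray so that Z == depth

# Made-up intrinsics: focal length 100 px, principal point at pixel (1, 1).
K = np.array([[100.0, 0.0, 1.0],
              [0.0, 100.0, 1.0],
              [0.0, 0.0, 1.0]])
depth = np.full((3, 3), 2.0)
points = backproject(depth, K)
```

A mesh can then be formed by connecting the points of neighboring pixels into triangles.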

597P Extensions / 497P Extra Credit

597P students must complete at least one of the following extensions. 497P students may complete one or more of these for extra credit.

Submission

Rubric

Points are awarded for correctness and efficiency, and deducted for issues with clarity or submission mechanics.

Correctness (35 points)
Unit tests (35 points)	Correctness as determined by `tests.py` (`score = ceil(n_passed*1.5)`)
Stereo output (10 points)	Output on `tentacle` and `Flowers`
Efficiency (4 points)
4 points	`python plane_sweep_stereo.py tentacle` runs in under 30 seconds
P3 Survey (1 point)	P3 Survey is filled out (by both team members, if applicable)
Extensions (597P) (10 points)	At least one extension is implemented, analyzed, and documented thoroughly.

Clarity: Deductions for poor coding style may be made. Please see the syllabus for general coding guidelines. Up to two points may be deducted for each of the following:

Acknowledgements

This assignment is based on versions developed and refined by Kavita Bala, Noah Snavely, and countless underappreciated TAs.
