Lecture 11¶
Announcements¶
- Midterm is graded
- Will be returned at the end of class
- (very) approximate distribution $\sim \mathcal{U}_{[30,50]}$ out of 60 across both 476 and 576
- Exam wrapper due Sunday night (extra credit, but please do it!).
- I can issue refunds if:
- my wording led to a misinterpretation of the problem
- my solution is wrong or arguable
- P1 graded, artifact votes are in
- G1 graded
- Artifacts
- Homework self-reflection (required): Canvas survey - reflect on homework progress and give yourself a grade as of the midterm.
Goals¶
- Be able to explain why panorama input images need to be taken from the same position
- Understand the mathematics of perspective image formation under a pinhole camera model, and associated terminology:
- Center of projection, projection plane, focal length, optical axis
- Know what is meant by world coordinates, camera coordinates, and image coordinates.
- Understand the relationship between depth and disparity in a simple rectified stereo camera pair
# boilerplate setup
%load_ext autoreload
%autoreload 2
%matplotlib inline
import os
import sys
src_path = os.path.abspath("../src")
if src_path not in sys.path:
    sys.path.insert(0, src_path)
# Library imports
import numpy as np
import imageio.v3 as imageio
import matplotlib.pyplot as plt
import skimage as skim
import cv2
# codebase imports
import util
import filtering
import features
import geometry
Stitching 360 Panoramas¶
Can we make a 360 panorama with the tools we have?
A useful perspective: a homography is a 3x3 linear transformation applied to image points in homogeneous coordinates; the transformed points are then projected back onto a single plane.
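To make the "linear transformation, then project back onto a plane" view concrete, here is a minimal sketch of applying a homography to 2D points. The function name `apply_homography` and the example matrix are illustrative assumptions, not part of the course codebase.

```python
import numpy as np

def apply_homography(H, pts):
    """Apply a 3x3 homography H to an (N, 2) array of image points."""
    # Lift to homogeneous coordinates: (x, y) -> (x, y, 1)
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    # The linear part: a 3x3 matrix multiply on each homogeneous point
    out = pts_h @ H.T
    # The projection part: divide by w to land back on the w = 1 plane
    return out[:, :2] / out[:, 2:3]

# Hypothetical example: a pure translation written as a homography
H_shift = np.array([[1.0, 0.0, 5.0],
                    [0.0, 1.0, -2.0],
                    [0.0, 0.0, 1.0]])
pts = np.array([[0.0, 0.0], [1.0, 1.0]])
warped = apply_homography(H_shift, pts)
```

Because of the final division, any nonzero scalar multiple of `H_shift` gives the same `warped` points.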
Outline¶
segue from panoramas with image formation in mind
- aside about spherical stitching
panoramas need a common COP - why?
- if not, then position depends on depth
- hang on a minute, can we use that?
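A small numerical sketch of why the center of projection (COP) matters, assuming an ideal pinhole camera with $f = 1$. The helper `project_from` and the specific points, rotation angle, and camera offset are made up for illustration. Two points on the same ray from the original COP stay aligned when the camera only rotates, but separate when the COP moves.

```python
import numpy as np

def project_from(points, R=np.eye(3), c=np.zeros(3), f=1.0):
    """Pinhole-project (N, 3) world points into a camera with
    world-to-camera rotation R and center of projection c."""
    cam = (points - c) @ R.T          # world -> camera coordinates
    return f * cam[:, :2] / cam[:, 2:3]

# Two points on the same ray through the origin, at depths 2 and 10
near = np.array([[0.5, 0.2, 2.0]])
far = 5.0 * near

# Case 1: same COP, camera rotated 10 degrees about the y axis
theta = np.deg2rad(10)
R = np.array([[np.cos(theta), 0.0, -np.sin(theta)],
              [0.0, 1.0, 0.0],
              [np.sin(theta), 0.0, np.cos(theta)]])
same_cop_gap = project_from(near, R=R) - project_from(far, R=R)

# Case 2: COP moved sideways; the two points now land in different places
c = np.array([0.3, 0.0, 0.0])
moved_cop_gap = project_from(near, c=c) - project_from(far, c=c)
```

`same_cop_gap` is zero while `moved_cop_gap` is not: once the COP moves, image position depends on depth, so no single homography can align the two views for a general scene.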
Pinhole camera model
- 476cam mk I: not a camera
- 476cam mk II: pinhole camera
- 476cam mk III: math camera, with terminology
- pinhole = center of projection
- image plane = projection plane
- focal length
- optical axis
- camera coordinates
- image coordinates
- HW1
Where we're headed: the camera matrix - one matrix to project them all!
- 3D world points to 2D pixel coordinates via
- Extrinsics: world to camera
- Projection: 3D to 2D
- Intrinsics: camera to image
- Enables: depth from disparity / stereo; sfm/mvs, slam, ...
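As a preview, the three stages can be composed into a single 3x4 matrix. This is a sketch under assumed conventions: camera coordinates are $X_{cam} = R X_{world} + t$, and the intrinsics `K` below (focal length 500 px, principal point at (320, 240)) are made-up values. The function names are hypothetical.

```python
import numpy as np

def camera_matrix(K, R, t):
    """Compose intrinsics @ projection @ extrinsics into one 3x4 matrix."""
    extrinsics = np.eye(4)                # 4x4: world -> camera
    extrinsics[:3, :3] = R
    extrinsics[:3, 3] = t
    projection = np.hstack([np.eye(3), np.zeros((3, 1))])  # 3x4: drop to 3 coords
    return K @ projection @ extrinsics    # K is 3x3: camera -> image (pixels)

def project_world_point(P, X_world):
    """Map one 3D world point to 2D pixel coordinates with camera matrix P."""
    x = P @ np.append(X_world, 1.0)       # homogeneous pixel coordinates
    return x[:2] / x[2]

# Assumed example intrinsics and a camera sitting at the world origin
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
P = camera_matrix(K, R=np.eye(3), t=np.zeros(3))
pixel = project_world_point(P, np.array([0.2, -0.1, 2.0]))
```

For this point, `pixel` works out to (370, 215): the normalized coordinates (0.1, -0.05) are scaled by 500 and shifted by the principal point.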
Pinhole projection
- f=1 case
- 10th grade way - geometry; HW2
- 15th grade way - linear algebra; HW3
Camera coordinates: P2 where points normalize onto the image plane
- f=$f$ case:
- 10th grade way - geometry; HW4
- 15th grade way - linear algebra; HW5
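The two routes to pinhole projection can be checked against each other numerically. This sketch assumes the projection plane sits in front of the pinhole at $z = f$ (so the image is not flipped) and that points are given in camera coordinates; the function names are illustrative.

```python
import numpy as np

def project_similar_triangles(X, Y, Z, f=1.0):
    """Geometry route: similar triangles give x = f X / Z, y = f Y / Z."""
    return f * X / Z, f * Y / Z

def project_homogeneous(point, f=1.0):
    """Linear algebra route: a 3x4 matrix followed by division by w."""
    M = np.array([[f, 0.0, 0.0, 0.0],
                  [0.0, f, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])
    x, y, w = M @ np.append(point, 1.0)   # (f X, f Y, Z)
    return x / w, y / w

point = np.array([1.0, -2.0, 4.0])
by_geometry = project_similar_triangles(*point, f=2.0)
by_matrix = project_homogeneous(point, f=2.0)
```

Both give (0.5, -1.0) for this point; setting `f=1.0` recovers the $f = 1$ case.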
Depth from disparity: 10th grade way for the simplest case
- HW6-7
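For the simplest case, similar triangles give $Z = f b / d$, where $b$ is the baseline between the two cameras and $d = x_{left} - x_{right}$ is the disparity. A minimal sketch follows, assuming a rectified pair with focal length in pixels and baseline in metres. The numbers are made up, and treating non-positive disparity as infinite depth is a choice made here for illustration.

```python
import numpy as np

def depth_from_disparity(disparity, focal_length, baseline):
    """Depth Z = f * b / d for a rectified stereo pair.

    Non-positive disparities are mapped to infinite depth.
    """
    d = np.asarray(disparity, dtype=float)
    depth = np.full_like(d, np.inf)
    np.divide(focal_length * baseline, d, out=depth, where=d > 0)
    return depth

# Assumed example: f = 700 px, b = 0.1 m
depths = depth_from_disparity([35.0, 70.0, 0.0], focal_length=700.0, baseline=0.1)

# Sanity check by forward projection: a point at X = 0.4, Z = 2.0
f, b, X, Z = 700.0, 0.1, 0.4, 2.0
x_left = f * X / Z
x_right = f * (X - b) / Z
recovered_Z = depth_from_disparity(x_left - x_right, f, b)
```

Doubling the disparity halves the depth, so depth resolution is best for nearby points.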
Rectified stereo:
depth from disparity reduces stereo vision to the correspondence problem
assumed a simple case: this is the rectified case where (assumptions)
correspondence - sounds familiar, but now it's dense. some metrics:
- SSD - sum of squared differences
- CC - cross-correlation: filter the right scanline with the left patch; where product is highest, call it a match; in practice, use NCC instead:
- NCC - normalized cross-correlation: standardize (subtract mean, divide by std) patches before multiplication to add invariance to photometric changes
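A minimal sketch of two of these metrics on a pair of equally sized patches. The small `eps` that guards against division by zero on flat patches is an addition made here, not something from the notes.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: lower means a better match."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.sum((a - b) ** 2)

def ncc(a, b, eps=1e-8):
    """Normalized cross-correlation: higher means a better match (max about 1)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Standardize each patch: subtract its mean, divide by its std
    a = (a - a.mean()) / (a.std() + eps)
    b = (b - b.mean()) / (b.std() + eps)
    return np.mean(a * b)

rng = np.random.default_rng(0)
patch = rng.random((5, 5))
brighter = 2.0 * patch + 10.0   # same content under a gain and bias change

ssd_score = ssd(patch, brighter)
ncc_score = ncc(patch, brighter)
```

`ssd_score` is large even though the two patches show the same content, while `ncc_score` stays close to 1, which is the photometric invariance that standardizing buys.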
The cost volume: given a matching cost c:
for i in rows:
    for j in columns:
        for d in disparities:
            C[i, j, d] = c(img1[i, j], img2[i, j + d])
(note that c will usually look at a patch around img[i,j])
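The triple loop can be sketched in NumPy by looping only over disparities and slicing over rows and columns. This version assumes grayscale images, uses a per-pixel squared difference as the matching cost (no patch aggregation), and keeps the `img2[i, j + d]` indexing convention from the loop above. The name `cost_volume` and the winner-take-all `argmin` at the end are illustrative choices.

```python
import numpy as np

def cost_volume(img1, img2, max_disparity):
    """C[i, j, d] = (img1[i, j] - img2[i, j + d])**2, inf where j + d is out of bounds."""
    img1 = np.asarray(img1, dtype=float)
    img2 = np.asarray(img2, dtype=float)
    h, w = img1.shape
    C = np.full((h, w, max_disparity + 1), np.inf)
    for d in range(max_disparity + 1):
        # Columns j of img1 paired with columns j + d of img2
        diff = img1[:, : w - d] - img2[:, d:]
        C[:, : w - d, d] = diff ** 2
    return C

# Synthetic check: img2 is img1 shifted right by 3 pixels
rng = np.random.default_rng(0)
img1 = rng.random((4, 20))
img2 = np.roll(img1, 3, axis=1)      # img2[i, j + 3] == img1[i, j]

C = cost_volume(img1, img2, max_disparity=5)
disparity = C.argmin(axis=2)         # pick the cheapest disparity per pixel
```

Away from the right border, `disparity` recovers the shift of 3. With real images a single pixel is too ambiguous, which is why the cost is usually aggregated over a patch around each pixel.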