Lecture 11¶

Announcements¶

  • Midterm is graded
    • Will be returned at the end of class
    • (very) approximate distribution $\sim \mathcal{U}_{[30,50]}$ out of 60 across both 476 and 576
    • Exam wrapper due Sunday night (extra credit, but please do it!).
    • I can issue refunds if:
      • my wording led to a misinterpretation of the problem
      • my solution is wrong or arguable
  • P1 graded, artifact votes are in
    • Winning artifacts
  • G1 graded
    • Artifacts
  • Homework self-reflection (required): Canvas survey - reflect on homework progress and give yourself a grade as of the midterm.

Goals¶

  • Be able to explain why panorama input images need to be taken from the same position
  • Understand the mathematics of perspective image formation under a pinhole camera model, and associated terminology:
    • Center of projection, projection plane, focal length, optical axis
  • Know what is meant by world coordinates, camera coordinates, and image coordinates.
  • Understand the relationship between depth and disparity in a simple rectified stereo camera pair
In [ ]:
# boilerplate setup
%load_ext autoreload
%autoreload 2

%matplotlib inline

import os
import sys

src_path = os.path.abspath("../src")
if src_path not in sys.path:
    sys.path.insert(0, src_path)

# Library imports
import numpy as np
import imageio.v3 as imageio
import matplotlib.pyplot as plt
import skimage as skim
import cv2

# codebase imports
import util
import filtering
import features
import geometry

Stitching 360 Panoramas¶

Can we make a 360 panorama with the tools we have?

A useful perspective: a homography is a 3x3 linear transformation applied to the homogeneous coordinates of points in a planar image; the transformed points are then projected back onto a single plane.
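As a minimal sketch of that perspective, assuming points are stored as an (N, 2) NumPy array: the hypothetical helper `apply_homography` below (not part of the course codebase) lifts each point to homogeneous coordinates, multiplies by a 3x3 matrix, and divides by the third coordinate to land back on a single plane. The matrix `H` is made up for illustration.

```python
import numpy as np

def apply_homography(H, pts):
    """Map (N, 2) points through a 3x3 homography H (illustrative helper)."""
    pts = np.asarray(pts, dtype=np.float64)
    # Lift to homogeneous coordinates: (x, y) -> (x, y, 1)
    homog = np.hstack([pts, np.ones((pts.shape[0], 1))])
    # The linear part: one 3x3 matrix multiply per point
    mapped = homog @ H.T
    # Project back onto the plane by dividing out the third coordinate
    return mapped[:, :2] / mapped[:, 2:3]

# A made-up homography whose bottom row makes the divisor depend on x
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.5, 0.0, 1.0]])
pts = np.array([[0.0, 0.0],
                [2.0, 4.0]])
out = apply_homography(H, pts)
print(out)
```

The division in the last step is what makes the overall mapping nonlinear in pixel coordinates even though the 3x3 multiply itself is linear.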

Outline¶

  • segue from panoramas with image formation in mind

    • aside about spherical stitching
  • panoramas need a common COP - why?

    • if not, then position depends on depth
    • hang on a minute, can we use that?
  • Pinhole camera model

    • 476cam mk I: not a camera
    • 476cam mk II: pinhole camera
    • 476cam mk III: math camera, with terminology
      • pinhole = center of projection
      • image plane = projection plane
      • focal length
      • optical axis
      • camera coordinates
      • image coordinates
    • HW1
  • Where we're headed: the camera matrix - one matrix to project them all!

    • 3D world points to 2D pixel coordinates via
      • Extrinsics: world to camera
      • Projection: 3D to 2D
      • Intrinsics: camera to image
    • Enables: depth from disparity / stereo; SfM/MVS, SLAM, ...
  • Pinhole projection

    • f=1 case
      • 10th grade way - geometry; HW2
      • 15th grade way - linear algebra; HW3
  • Camera coordinates: P2 where points normalize onto the image plane

    • f=$f$ case:
      • 10th grade way - geometry; HW4
      • 15th grade way - linear algebra; HW5
  • Depth from disparity: 10th grade way for the simplest case

    • HW6-7
  • Rectified stereo:

    • depth from disparity reduces stereo vision to the correspondence problem

    • assumed a simple case: this is the rectified case where (assumptions)

    • correspondence - sounds familiar, but now it's dense. some metrics:

      • SSD - sum of squared differences
      • CC - cross-correlation: filter the right scanline with the left patch; where product is highest, call it a match; in practice, use NCC instead:
      • NCC - normalized cross-correlation: standardize (subtract mean, divide by std) patches before multiplication to add invariance to photometric changes
    • The cost volume: given a matching cost c:

      for i in rows:
        for j in columns:
          for d in disparities:
            C[i, j, d] = c(img1[i,j], img2[i,j+d])
      

      (note that c will usually look at a patch around img[i,j])
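The cost-volume loop above can be sketched in NumPy as follows. This is a rough illustration under stated assumptions, not the course's implementation: the matching cost is a per-pixel squared difference (summing it over a patch would give SSD), both images are grayscale arrays of the same shape, and out-of-bounds entries are filled with infinity. The function names, the test images, and the values of `focal_px` and `baseline` are all hypothetical. The second helper uses the standard rectified-stereo relation $Z = fB/d$, with $f$ in pixels and $B$ the baseline.

```python
import numpy as np

def squared_diff_cost_volume(img1, img2, num_disparities):
    """C[i, j, d] = (img1[i, j] - img2[i, j + d])**2, inf where j + d is out of bounds."""
    img1 = np.asarray(img1, dtype=np.float64)
    img2 = np.asarray(img2, dtype=np.float64)
    rows, cols = img1.shape
    C = np.full((rows, cols, num_disparities), np.inf)
    for d in range(num_disparities):
        valid = cols - d  # number of columns j for which j + d stays in bounds
        if valid <= 0:
            break
        diff = img1[:, :valid] - img2[:, d:]
        C[:, :valid, d] = diff ** 2
    return C

def depth_from_disparity(disparity, focal_px, baseline):
    """Z = f * B / d for a rectified pair; zero disparity maps to infinite depth."""
    d = np.asarray(disparity, dtype=np.float64)
    depth = np.full(d.shape, np.inf)
    np.divide(focal_px * baseline, d, out=depth, where=d > 0)
    return depth

# Synthetic check: img1 is img2 shifted so that img1[i, j] == img2[i, j + 3]
rng = np.random.default_rng(0)
img2 = rng.random((4, 12))
true_d = 3
img1 = np.zeros_like(img2)
img1[:, :-true_d] = img2[:, true_d:]

C = squared_diff_cost_volume(img1, img2, num_disparities=6)
disparity = np.argmin(C, axis=2)  # winner-take-all over the disparity axis
depth = depth_from_disparity(disparity, focal_px=600.0, baseline=0.1)
```

Taking the argmin along the last axis is the simplest way to read a disparity map out of the volume. Per-pixel costs are noisy on real images, which is why the cost usually looks at a patch.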