Lecture 8¶
Announcements¶
- Midterm exam coming up a week from Thursday!
- Covers material through the end of this week
- P1 is in! P2 will be out Thursday!
Goals¶
- Know what kinds of transformations can be represented using linear and affine transformations.
- Know how to interpret homogeneous points with a third coordinate that is not 1.
- Understand the meaning of homogeneous points at infinity.
- Know the definition of a projective transformation (homography), and gain some geometric intuition for what it represents in 2D.
- Know how to find a least-squares best-fit transformation for:
- translation
- affine
# boilerplate setup
%load_ext autoreload
%autoreload 2
%matplotlib inline
import os
import sys
src_path = os.path.abspath("../src")
if src_path not in sys.path:
    sys.path.insert(0, src_path)
# Library imports
import numpy as np
import imageio.v3 as imageio
import matplotlib.pyplot as plt
import skimage as skim
import cv2
# codebase imports
import util
import filtering
import features
import geometry
Plan¶
- What can we do with linear, rigid, and affine transformations?
- Train tracks example
- We need something else. No messing with the third row. Mess with the third row!
- Homogeneous coordinates redux: normalization and equivalence.
- HW 4
- Homogeneous: interpretation in 2D
- Homogeneous interpretation in faux-3D
- HW5-7 How many DOF does a homography have?
- Break, if not already
- Solving for the best model given noisy correspondences: translations revisited
- /shrug average
- Linear algebra setup: minimize squared residuals
- Affine transformations: massaging into $Ax = b$ form
Context: Panorama Stitching Overview¶
- Detect features - Harris corners
- Describe features - MOPS descriptor
- Match features - SSD + ratio test
- Estimate motion model from correspondences
- Translation (we hand-waved this)
- Affine
- Projective
- We need new math for this! (our first task today)
- Robustness to outliers - RANSAC
- Warp image(s) into common coordinate system and blend
- Inverse warping (so far we've used geometry.warp)
- Blending
Panorama Stitching: Example Cases¶
Question: what geometric relationships can translations model? How about affine?
Homework Problems 1, 3¶
Consider the following geometric properties. The first several questions ask which of them are preserved by a given class of geometric transformations. For each question, give a list of letters corresponding to which properties are preserved (i.e., left unchanged) by the given class of geometric transformations.
Note: Feel free to use this online demo to gain intuition and try things out as you work through these, keeping in mind that affine transformations are only those where the last row remains $\begin{bmatrix}0 & 0 & 1\end{bmatrix}$.
- Which are preserved under translation?
- Which are preserved under affine transformations?
A. Line straightness
B. Line lengths
C. Ratios of lengths along a line
D. Parallelism of lines
E. Angles
F. Locations of points
G. Location of the origin
Claim: Affine transformations can't model a typical panorama sequence well beyond a certain field of view.
Justification: Let's talk about train tracks.
What would you see if you changed your view angle to point straight down towards the tracks?
plt.imshow(imageio.imread("../data/tracks2.png"))
plt.imshow(imageio.imread("../data/atrium3.png"))
Implication: we need a new, more general class of geometric transformations.
Scott: don't mess with the third row!
Mayla: what if we mess with the third row?
Scott: Well, okay. You asked. Online demo.
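To make the exchange above concrete, here is a minimal sketch of what happens to two parallel "rails" under an affine matrix versus a matrix whose third row has been changed. The helper names (apply_h, is_parallel_after) and the specific matrices are illustrative assumptions, not part of the course codebase.

```python
import numpy as np

def apply_h(T, pts):
    """Apply a 3x3 transform to an (N, 2) array of 2D points."""
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # append w = 1
    out = homog @ T.T
    return out[:, :2] / out[:, 2:3]                   # divide by the third coordinate

def is_parallel_after(T, seg_a, seg_b):
    """True if two segments are still parallel after being transformed by T."""
    a0, a1 = apply_h(T, seg_a)
    b0, b1 = apply_h(T, seg_b)
    da, db = a1 - a0, b1 - b0
    cross = da[0] * db[1] - da[1] * db[0]             # 2D cross product of directions
    return bool(np.isclose(cross, 0.0))

# Two parallel rails: vertical segments at x = -1 and x = +1.
left = np.array([[-1.0, 1.0], [-1.0, 5.0]])
right = np.array([[1.0, 1.0], [1.0, 5.0]])

affine = np.array([[1.0, 0.5, 2.0],
                   [0.2, 1.5, -1.0],
                   [0.0, 0.0, 1.0]])      # last row left alone
projective = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 0.3, 1.0]])  # third row "messed with"

print("affine keeps rails parallel:    ", is_parallel_after(affine, left, right))
print("projective keeps rails parallel:", is_parallel_after(projective, left, right))

# The rails' shared direction, written as the point at infinity [0, 1, 0],
# lands on a finite point under the projective matrix: a vanishing point.
v = projective @ np.array([0.0, 1.0, 0.0])
vanishing_point = v[:2] / v[2]
print("vanishing point:", vanishing_point)
```

With these particular matrices, the affine case keeps the rails parallel, while the projective case makes them converge toward the vanishing point, which is the train-tracks effect.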
Whiteboard: what does the third coordinate mean?
- These 3-vectors represent 2D points. Each point now has an equivalence class of 3-vectors equivalent up to scale.
- 2D interpretation - points can move around on a plane but project back to a line
- 3D interpretation - points can move around in 3-space but project back to a plane.
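The "equivalent up to scale" idea can be checked numerically. The sketch below assumes a hypothetical helper named normalize (not a function from the course codebase) that divides a homogeneous 3-vector by its third coordinate.

```python
import numpy as np

def normalize(p):
    """Scale a homogeneous 3-vector so that its third coordinate is 1."""
    p = np.asarray(p, dtype=float)
    if np.isclose(p[2], 0.0):
        # A zero third coordinate is a point at infinity: no finite 2D location.
        raise ValueError("cannot normalize a point at infinity")
    return p / p[2]

# Three different 3-vectors in the same equivalence class.
a = np.array([2.0, 3.0, 1.0])
b = np.array([4.0, 6.0, 2.0])      # 2 * a
c = np.array([-1.0, -1.5, -0.5])   # -0.5 * a

# All three normalize to the same 2D point (2, 3).
print(normalize(a), normalize(b), normalize(c))
```

Any nonzero scalar multiple of a 3-vector normalizes to the same point, so the sign and magnitude of the third coordinate carry no extra information once it is nonzero.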
How many Degrees of Freedom does each of these transformations have?
- Translation?
- Affine?
- Projective (aka Homography)?
Homework Problems 5-6¶
In the next two problems, we'll show that a homography has only 8 degrees of freedom. In other words, for any homography $H$ with a nonzero bottom-right entry $h_{33}$ (written $k$ in the matrix below), the scaled homography $H' = H / h_{33}$ has the same effect on homogeneous coordinates.
- Let $ H = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & k \\ \end{bmatrix}$ and $\mathbf{x} = \begin{bmatrix}x\\ y \\ 1\end{bmatrix}$. Compute $H\mathbf{x}$ and normalize the resulting homogeneous point.
- Compute $\frac{1}{k}H\mathbf{x}$ and normalize the resulting homogeneous point.
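A quick numeric sanity check can accompany the algebra in these two problems (it does not replace it). The matrix and point below are arbitrary values chosen for illustration, with a nonzero bottom-right entry $k$.

```python
import numpy as np

H = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [0.1, 0.2, 2.0]])   # k = H[2, 2] = 2, nonzero
x = np.array([1.0, 2.0, 1.0])     # the 2D point (1, 2) in homogeneous form

p = H @ x                         # apply H
p_scaled = (H / H[2, 2]) @ x      # apply H / k

# Normalize both results by dividing by the third coordinate.
p_norm = p / p[2]
p_scaled_norm = p_scaled / p_scaled[2]

print(p_norm, p_scaled_norm)      # the two normalized points agree
```

Scaling $H$ scales every coordinate of $H\mathbf{x}$ by the same factor, and that factor cancels during normalization. This is why one of the nine entries is redundant.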
Context: Panorama Stitching Overview¶
- Detect features - Harris corners
- Describe features - MOPS descriptor
- Match features - SSD + ratio test
- Estimate motion model from correspondences (next up!)
- Translation (we hand-waved this)
- Affine
- Projective
- We now have new math for this!
- Robustness to outliers - RANSAC
- Warp image(s) into common coordinate system and blend
- Inverse warping (so far we've used geometry.warp)
- Blending
We know how to model a geometric transformation. How do we fit that model given correspondences?
Basic version: Depends on Degrees of Freedom!
- Translation - 2 DOF, requires 1 pair of corresponding points
- Affine - 6 DOF, requires 3 pairs of corresponding points
- Projective (aka Homography) - 8 DOF, requires 4 pairs of corresponding points.
But usually we'll have more than that, and they won't all agree. We hand-waved the translation case by saying "let's average them, that seems sensible!" but it's not as clear what's sensible for fitting affine and projective.
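As a sketch of the "average them" approach for translation, the snippet below builds synthetic noisy correspondences (the true offset, noise level, and variable names are all made up for illustration) and averages the per-match displacements. It also compares the sum of squared residuals at the averaged estimate against the true offset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic matches: p_prime = p + true_t + noise (all values are illustrative).
true_t = np.array([5.0, -2.0])
p = rng.uniform(0, 100, size=(20, 2))
p_prime = p + true_t + rng.normal(scale=0.5, size=p.shape)

# Each match votes for a displacement; the estimate is the mean vote.
t_hat = (p_prime - p).mean(axis=0)

def ssr(t):
    """Sum of squared residuals for a candidate translation t."""
    return float(np.sum((p + t - p_prime) ** 2))

print("estimated translation:", t_hat)
print("SSR at estimate:", ssr(t_hat), " SSR at true offset:", ssr(true_t))
```

The mean displacement never has a larger sum of squared residuals than any other candidate, including the true offset, because the mean is the minimizer of a sum of squared differences.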
Problem Statement¶
Given a set of (imperfect) feature matches, how do I find the optimal transformation that relates the two images?
What's optimal? Our definition: minimize the sum of squared residuals: $$ \min_T \sum_i||(T\mathbf{p}_i - \mathbf{p}_i')||^2 $$
Our optimistic hope (which will be mostly substantiated): this ends up being linear and can be solved with linear least squares.
In other words, we're looking to turn the minimization of the sum of squared residuals $$ \min_T \sum_i||(T\mathbf{p}_i - \mathbf{p}_i')||^2 $$ into a minimization of a linear least squares system $$ \min_\mathbf{x} ||A\mathbf{x} - \mathbf{b}||^2 $$
Solution approach for all three models (translation, affine, projective):
- Write down the residuals.
- Massage into $||A\mathbf{x} - \mathbf{b}||$ form.
- Solve
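The three steps above can be sketched for the affine case as follows. This is one possible arrangement under stated assumptions, not necessarily the one used in the course's geometry module. The function name fit_affine is hypothetical, and the unknown vector is assumed to be ordered as the six entries of the top two rows of the matrix. Each correspondence $(x, y) \to (x', y')$ contributes two rows to $A$: $[x\ y\ 1\ 0\ 0\ 0]$ with right-hand side $x'$, and $[0\ 0\ 0\ x\ y\ 1]$ with right-hand side $y'$.

```python
import numpy as np

def fit_affine(p, p_prime):
    """Least-squares affine fit mapping p (N, 2) onto p_prime (N, 2), N >= 3."""
    n = len(p)
    A = np.zeros((2 * n, 6))
    b = np.zeros(2 * n)
    A[0::2, 0:2] = p          # x' rows: [x y 1 0 0 0]
    A[0::2, 2] = 1.0
    A[1::2, 3:5] = p          # y' rows: [0 0 0 x y 1]
    A[1::2, 5] = 1.0
    b[0::2] = p_prime[:, 0]
    b[1::2] = p_prime[:, 1]
    # Solve min ||A x - b||^2 for the six unknown parameters.
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    T = np.eye(3)
    T[:2, :] = params.reshape(2, 3)   # last row stays [0 0 1]
    return T

# Usage on noise-free synthetic data (values are illustrative).
rng = np.random.default_rng(1)
T_true = np.array([[1.1, 0.2, 3.0],
                   [-0.1, 0.9, -4.0],
                   [0.0, 0.0, 1.0]])
p = rng.uniform(0, 100, size=(10, 2))
p_prime = (np.hstack([p, np.ones((10, 1))]) @ T_true.T)[:, :2]

T_hat = fit_affine(p, p_prime)
print(np.round(T_hat, 6))
```

With exact correspondences the fit recovers the generating matrix up to floating-point error. With noisy matches the same code returns the parameters that minimize the sum of squared residuals.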