Lecture 7¶
Announcements¶
- Last two faculty candidate talks! Brad McCoy (Computational Topology)
- Thursday 2/1 4pm CF 105 Research Talk: An Invitation to Computational Topology
- Friday 2/2 4pm CF 316 Teaching Demo: Dynamic Programming and Edit Distance
- Week 3 Survey - themes
- Notebook presentation is mostly working for folks
- TR class schedule is neutral to good
- One comment I think is relevant: there are no quizzes or anything; how do you know how you're doing?
- Reminder: if you submit homework problems late or resubmit, you need to email me if you want me to look at it; please include a changelog (or separate out the problems you want regraded)
Goals¶
- Know the how and why of the MOPS feature descriptor
- Know how and why to match features using:
- The SSD metric
- The ratio test
- Be able to implement a barebones translational image alignment pipeline.
# boilerplate setup
%load_ext autoreload
%autoreload 2
%matplotlib inline
import os
import sys
src_path = os.path.abspath("../src")
if src_path not in sys.path:
    sys.path.insert(0, src_path)
# Library imports
import numpy as np
import imageio.v3 as imageio
import matplotlib.pyplot as plt
import skimage as skim
import cv2
# codebase imports
import util
import filtering
import features
import geometry
Plan¶
- Finish up MOPS description
- Feature matching: SSD, ratio distance
- Implement a barebones translational image alignment pipeline
Panorama Stitching Overview¶
- Detect features
- Describe features
- Match features
- Estimate motion model
- Warp image(s) into common coordinate system and blend
Finish up MOPS implementation¶
- 576: fill in matrices and do intensity standardization
- 476: do intensity standardization
Feature Matching¶
So you have a pile of feature descriptors across 2 images - how do we compare them?
Simplest metric choice: SSD = sum of squared differences (we used this in the Harris patch error metric)
$$ SSD(f, g) = \sum_{i=1}^d (f_i - g_i)^2$$
# implement features.ssd
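Here's a minimal sketch of what features.ssd might look like, assuming it takes two 1D descriptor vectors (the actual signature and conventions in our codebase may differ):

# sketch of features.ssd, assuming two 1D descriptor vectors as input
def ssd(f, g):
    """Sum of squared differences between two feature descriptors."""
    diff = np.asarray(f, dtype=np.float32) - np.asarray(g, dtype=np.float32)
    return np.sum(diff ** 2)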
Okay, we can compare 2 features; how do we find matches?
Simplest answer: brute force!
Homework Problems 1-2¶
F1 = np.array([
[0, 1, 4, 3],
[1, 0, 4, 1]], dtype=np.float32)
F2 = np.array([
[2, 5, 1],
[1, 5, 2]], dtype=np.float32)
You can now do these with code or by hand:
Create a table with 4 rows and 3 columns in which the $(i,j)$th cell contains the SSD distance between feature $i$ in F1 and feature $j$ in F2.
For each feature in F1, give the index of the closest feature match in F2 using the SSD metric.
Note: The homework problem asks for 1-indexed indices, so don't forget to add 1 if you're coding this.
# pairwise SSD distances between every feature in F1 and every feature in F2
d, n1 = F1.shape
n2 = F2.shape[1]
distances = np.zeros((n1, n2))
for i in range(n1):
    for j in range(n2):
        distances[i, j] = features.ssd(F1[:, i], F2[:, j])
print(distances)
# closest F2 feature for each F1 feature (1-indexed, as the homework asks)
print(np.argmin(distances, axis=1) + 1)
[[ 4. 41.  2.]
 [ 2. 41.  4.]
 [13.  2. 13.]
 [ 1. 20.  5.]]
[3 1 2 1]
Optimized brute force: scipy.spatial.distance.cdist vectorizes the brute force for you; we're using the sqeuclidean distance.
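For example, with our d-by-n feature matrices (cdist wants one feature per row, so transpose first):

from scipy.spatial.distance import cdist

# vectorized pairwise SSD; cdist expects one feature per row, so transpose the d x n matrices
distances_vec = cdist(F1.T, F2.T, metric="sqeuclidean")
print(distances_vec)  # same values as the loop-based table above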
You can get fancy by using spatial data structures like kd-trees, etc
- This is basically fast nearest neighbor search; so hot right now
- Fast approximate nearest neighbor search is also a big thing, but may not be great for us (you'll see why soon)
Algorithm to get a list of feature correspondences:¶
- Algorithm:
- foreach feature in 1
- Take the closest feature in 2
Problem: the closest thing may not be very close
Solution: threshold!
- Algorithm, take 2:
- foreach feature in 1
- Take the closest feature in 2
- add them to a list of correspondences if their distance is less than threshold
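Here's a rough sketch of that thresholded version, assuming we already have a distances matrix like the one computed above (the real features.get_matches may be organized differently, and the threshold value here is arbitrary):

# sketch: nearest-neighbor matching with a distance threshold
# distances[i, j] = SSD between feature i in image 1 and feature j in image 2
def match_with_threshold(distances, threshold):
    matches = []
    for i in range(distances.shape[0]):
        j = np.argmin(distances[i])           # closest feature in image 2
        if distances[i, j] < threshold:       # keep only sufficiently close matches
            matches.append((i, j, distances[i, j]))
    return matches

print(match_with_threshold(distances, threshold=5.0))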
The Fence Strikes Back¶
Insight: good matches are ones where the closest thing isn't just barely the closest thing.
Idea: look at the second closest match. Specifically, if $g_1$ and $g_2$ are the closest and second closest matches in image 2 to $f$ in image 1, then
$$ d_{\mathrm{ratio}} = \frac{SSD(f, g_1)}{SSD(f, g_2)} $$
What does this equal if $g_1$ and $g_2$ are equally far from $f$?
How does it behave as $g_2$ gets increasingly far from $f$ compared to $g_1$?
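Here's a quick sketch of computing the ratio distance for each feature in F1 against F2, reusing the distances matrix from above; np.argsort gives the closest and second-closest columns per row:

# sketch: ratio distance per image-1 feature, from the precomputed distances matrix
order = np.argsort(distances, axis=1)    # column indices sorted by distance, per row
closest = order[:, 0]                    # nearest feature in image 2
second = order[:, 1]                     # second-nearest feature in image 2
rows = np.arange(distances.shape[0])
ratio = distances[rows, closest] / distances[rows, second]
print(closest + 1)   # 1-indexed nearest matches
print(ratio)         # small ratio = distinctive match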
HW Problem 3¶
Again, you can do this by hand or with code:
F1 = np.array([
[0, 1, 4, 3],
[1, 0, 4, 1]], dtype=np.float32)
F2 = np.array([
[2, 5, 1],
[1, 5, 2]], dtype=np.float32)
For each feature in F2, give the index of the closest feature match in F1 and the ratio distance between each feature and its closest match.
np.argmin(distances, axis=0)+1
array([4, 3, 1])
[[ 4. 41. 2.]
[ 2. 41. 4.]
[13. 2. 13.]
[ 1. 20. 5.]]
We have (almost all) the pieces¶
of an end-to-end image stitching pipeline. Two missing bits:
How do we model the motion between two images?
How do we get the images into a common coordinate system and blend them?
Bit 1:¶
For some simple cases (specifically: long focal length cameras), a very good approximation of the motion from one image in a panorama sequence to another is a simple translation. That is, the images can be (pretty much) aligned by simply offsetting all the pixels by some amount in $x$ and $y$.
Brainstorm: Given a list of feature correspondences, how could we estimate a single translation model that fits them as well as possible?
Sensible-seeming approach: average the displacements! If $(\mathbf{f}_i, \mathbf{g}_i)$ are corresponding feature pairs in image 1 and 2, then
$$\mathbf{t} = \frac{1}{n} \sum_{i=1}^n (\mathbf{f}_i - \mathbf{g}_i)$$
We'll return to this later and find that this not only seems sensible but is, in fact, a principled approach for a translational motion model.
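A minimal sketch of that averaging, assuming correspondences come as two (n, 2) arrays of matched point coordinates (the conventions in geometry.estimate_translation may differ):

# sketch: estimate a translation by averaging displacements between matched points
# pts1[i] (image 1) corresponds to pts2[i] (image 2); both are (n, 2) arrays
def estimate_translation_sketch(pts1, pts2):
    return np.mean(pts1 - pts2, axis=0)    # t such that pts2 + t ~= pts1

pts1 = np.array([[10.0, 5.0], [22.0, 14.0]])   # made-up corresponding points
pts2 = np.array([[ 4.0, 2.0], [17.0, 10.0]])
print(estimate_translation_sketch(pts1, pts2))  # -> [5.5 3.5]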
Bit 2:¶
Build an affine transformation matrix and warp image 2 into an image that's large enough to fit its extent.
Homework Problems 4-5¶
Give a 3x3 affine transformation matrix that can be used to warp image 2 into image 1's coordinates.
If image 1's origin is at its top left and $t_x$ and $t_y$ are both positive, what's the size of the destination image that can contain the combined image?
Can we code this up in the time remaining?¶
(I have no idea whether this is doable, or if it will work well)
TODO:
- find Harris keypoints (use features.harris_corners then features.get_harris_points)
- extract descriptors (use features.extract_MOPS)
- implement features.compute_distances (use features.ssd and for loops, or use scipy.spatial.distance.cdist)
- implement features.get_matches to find closest matches and threshold by match score
  - For SSD, np.argmin should do
  - For ratio distance, np.argsort is probably where it's at
- implement geometry.estimate_translation to average the differences between correspondences
- build an affine transformation matrix that applies that translation
- warp image 2 into a new image that's large enough to fit both (use geometry.warp)
- add (average? blend somehow?) image 1 into the warped image 2

If you get this working (i.e., running - it may not find a good model!), then try making it multi-scale:
- compute a Gaussian pyramid (implement filtering.gaussian_pyramid; see the sketch after this list)
- adjust the above feature detection, extraction, and transformation estimation steps to account for multiple scales
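We have to write filtering.gaussian_pyramid ourselves, but skimage's built-in pyramid is a handy sanity check; a rough usage sketch (the stand-in image and layer count are arbitrary):

# reference point: skimage's built-in Gaussian pyramid
# (our filtering.gaussian_pyramid should produce comparable levels)
demo = np.random.rand(256, 256).astype(np.float32)   # stand-in image
levels = list(skim.transform.pyramid_gaussian(demo, max_layer=3, downscale=2))
print([lvl.shape for lvl in levels])   # (256, 256), (128, 128), (64, 64), (32, 32)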
# here are two images where a translational alignment should work
y1 = imageio.imread("../data/yos1.jpg").astype(np.float32) / 255
y1 = skim.color.rgb2gray(y1)
y2 = imageio.imread("../data/yos2.jpg").astype(np.float32) / 255
y2 = skim.color.rgb2gray(y2)
util.imshow_gray(y1)
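Once we have a translation estimate, the warp-and-blend step might look roughly like this; I'm using skimage's warp here instead of geometry.warp, the (tx, ty) values are made up, and the blend is as naive as it gets:

# sketch: warp y2 into y1's coordinates given a translation estimate (tx, ty)
# tx, ty are placeholders -- in the real pipeline they come from the matches
tx, ty = 30.0, 12.0
tform = skim.transform.AffineTransform(translation=(tx, ty))  # image-2 coords -> image-1 coords

# canvas big enough to hold both images (assumes same-sized inputs and tx, ty >= 0)
out_shape = (int(np.ceil(y1.shape[0] + ty)), int(np.ceil(y1.shape[1] + tx)))

# skimage's warp takes the inverse map (output coords -> input coords)
warped2 = skim.transform.warp(y2, tform.inverse, output_shape=out_shape)

# crude blend: paste image 1 onto the canvas and average where both have content
# (treating 0 as "empty" is a hack, but fine for a first pass)
canvas = np.zeros(out_shape, dtype=np.float64)
canvas[:y1.shape[0], :y1.shape[1]] = y1
overlap = (canvas > 0) & (warped2 > 0)
combined = canvas + warped2
combined[overlap] /= 2
util.imshow_gray(combined)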