Lecture 7¶
Announcements¶
- Last two faculty candidate talks! Brad McCoy (Computational Topology)
- Thursday 2/1 4pm CF 105 Research Talk: An Invitation to Computational Topology
- Friday 2/2 4pm CF 316 Teaching Demo: Dynamic Programming and Edit Distance
- Week 3 Survey - themes
- Notebook presentation is mostly working for folks
- TR class schedule is neutral to good
- One comment I think is relevant: there are no quizzes or anything; how do you know how you're doing?
- Reminder: if you submit homework problems late or resubmit, you need to email me if you want me to look at it; please include a changelog (or separate out the problems you want regraded)
Goals¶
- Know the how and why of the MOPS feature descriptor
- Know how and why to match features using:
- The SSD metric
- The ratio test
- Be able to implement a barebones translational image alignment pipeline.
# boilerplate setup
%load_ext autoreload
%autoreload 2
%matplotlib inline
import os
import sys
src_path = os.path.abspath("../src")
if src_path not in sys.path:
    sys.path.insert(0, src_path)
# Library imports
import numpy as np
import imageio.v3 as imageio
import matplotlib.pyplot as plt
import skimage as skim
import cv2
# codebase imports
import util
import filtering
import features
import geometry
Plan¶
- Finish up MOPS description
- Feature matching: SSD, ratio distance
- Implement a barebones translational image alignment pipeline
Panorama Stitching Overview¶
- Detect features
- Describe features
- Match features
- Estimate motion model
- Warp image(s) into common coordinate system and blend
Finish up MOPS implementation¶
- 576: fill in matrices and do intensity standardization
- 476: do intensity standardization
Feature Matching¶
So you have a pile of feature descriptors across 2 images - how do we compare them?
Simplest metric choice: SSD = sum of squared differences (we used this in the Harris patch error metric)
$$ SSD(f, g) = \sum_{i=1}^d (f_i - g_i)^2$$
# implement features.ssd
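Here's a minimal sketch of what features.ssd might look like, assuming it takes two 1D descriptor vectors (the actual signature and conventions in our codebase may differ):

# sketch of features.ssd, assuming two 1D descriptor vectors as input
def ssd(f, g):
    """Sum of squared differences between two feature descriptors."""
    diff = np.asarray(f, dtype=np.float32) - np.asarray(g, dtype=np.float32)
    return np.sum(diff ** 2)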
Okay, we can compare 2 features; how do we find matches?
Simplest answer: brute force!
Homework Problems 1-2¶
F1 = np.array([
[0, 1, 4, 3],
[1, 0, 4, 1]], dtype=np.float32)
F2 = np.array([
[2, 5, 1],
[1, 5, 2]], dtype=np.float32)
You can now do these with code or by hand:
Create a table with 4 rows and 3 columns in which the $(i,j)$th cell contains the SSD distance between feature $i$ in F1 and feature $j$ in F2.
For each feature in F1, give the index of the closest feature match in F2 using the SSD metric.
Note: The homework problem asks for 1-indexed indices, so don't forget to add 1 if you're coding this.
# pairwise SSD distances between every feature in F1 and every feature in F2
d, n1 = F1.shape
n2 = F2.shape[1]
distances = np.zeros((n1, n2))
for i in range(n1):
    for j in range(n2):
        distances[i, j] = features.ssd(F1[:, i], F2[:, j])
print(distances)
# closest F2 feature for each F1 feature (1-indexed, as the homework asks)
print(np.argmin(distances, axis=1) + 1)
[[ 4. 41.  2.]
 [ 2. 41.  4.]
 [13.  2. 13.]
 [ 1. 20.  5.]]
[3 1 2 1]
Optimized brute force: scipy.spatial.distance.cdist vectorizes the brute force for you; we're using the sqeuclidean distance.
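For example, with our d-by-n feature matrices (cdist wants one feature per row, so transpose first):

from scipy.spatial.distance import cdist

# vectorized pairwise SSD; cdist expects one feature per row, so transpose the d x n matrices
distances_vec = cdist(F1.T, F2.T, metric="sqeuclidean")
print(distances_vec)  # same values as the loop-based table above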
You can get fancy by using spatial data structures like kd-trees, etc
- This is basically fast nearest neighbor search; so hot right now
- Fast approximate nearest neighbor search is also a big thing, but may not be great for us (you'll see why soon)
Algorithm to get a list of feature correspondences:¶
- Algorithm:
- foreach feature in 1
- Take the closest feature in 2
Problem: the closest thing may not be very close
Solution: threshold!
- Algorithm, take 2:
- foreach feature in 1
- Take the closest feature in 2
- add them to a list of correspondences if their distance is less than threshold
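Here's a rough sketch of that thresholded version, assuming we already have a distances matrix like the one computed above (the real features.get_matches may be organized differently, and the threshold value here is arbitrary):

# sketch: nearest-neighbor matching with a distance threshold
# distances[i, j] = SSD between feature i in image 1 and feature j in image 2
def match_with_threshold(distances, threshold):
    matches = []
    for i in range(distances.shape[0]):
        j = np.argmin(distances[i])           # closest feature in image 2
        if distances[i, j] < threshold:       # keep only sufficiently close matches
            matches.append((i, j, distances[i, j]))
    return matches

print(match_with_threshold(distances, threshold=5.0))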
The Fence Strikes Back¶
Insight: good matches are ones where the closest thing isn't just barely the closest thing.
Idea: look at the second closest match. Specifically, if $g_1$ and $g_2$ are the closest and second closest matches in image 2 to $f$ in image 1, then
$$ d_{\mathrm{ratio}} = \frac{SSD(f, g_1)}{SSD(f, g_2)} $$
What does this equal if $g_1$ and $g_2$ are equally far from $f$?
How does it behave as $g_2$ gets increasingly far from $f$ compared to $g_1$?
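Here's a quick sketch of computing the ratio distance for each feature in F1 against F2, reusing the distances matrix from above; np.argsort gives the closest and second-closest columns per row:

# sketch: ratio distance per image-1 feature, from the precomputed distances matrix
order = np.argsort(distances, axis=1)    # column indices sorted by distance, per row
closest = order[:, 0]                    # nearest feature in image 2
second = order[:, 1]                     # second-nearest feature in image 2
rows = np.arange(distances.shape[0])
ratio = distances[rows, closest] / distances[rows, second]
print(closest + 1)   # 1-indexed nearest matches
print(ratio)         # small ratio = distinctive match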
HW Problem 3¶
Again, you can do this by hand or with code:
F1 = np.array([
[0, 1, 4, 3],
[1, 0, 4, 1]], dtype=np.float32)
F2 = np.array([
[2, 5, 1],
[1, 5, 2]], dtype=np.float32)
For each feature in F2, give the index of the closest feature match in F1 and the ratio distance between each feature and its closest match.
np.argmin(distances, axis=0)+1
array([4, 3, 1])
[[ 4. 41. 2.]
[ 2. 41. 4.]
[13. 2. 13.]
[ 1. 20. 5.]]
We have (almost all) the pieces¶
of an end-to-end image stitching pipeline. Two missing bits:
How do we model the motion between two images?
How do we get the images into a common coordinate system and blend them?
Bit 1:¶
For some simple cases (specifically: long focal length cameras), a very good approximation of the motion from one image in a panorama sequence to another is a simple translation. That is, the images can be (pretty much) aligned by simply offsetting all the pixels by some amount in $x$ and $y$.
Brainstorm: Given a list of feature correspondences, how could we estimate a single translation model that fits them as well as possible?
Sensible-seeming approach: average the displacements! If $(\mathbf{f}_i, \mathbf{g}_i)$ are corresponding feature pairs in image 1 and 2, then
$$\mathbf{t} = \frac{1}{n} \sum_{i=1}^n (\mathbf{f}_i - \mathbf{g}_i)$$
We'll return to this later and find that this not only seems sensible but is, in fact, a principled approach for a translational motion model.
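A minimal sketch of that averaging, assuming correspondences come as two (n, 2) arrays of matched point coordinates (the conventions in geometry.estimate_translation may differ):

# sketch: estimate a translation by averaging displacements between matched points
# pts1[i] (image 1) corresponds to pts2[i] (image 2); both are (n, 2) arrays
def estimate_translation_sketch(pts1, pts2):
    return np.mean(pts1 - pts2, axis=0)    # t such that pts2 + t ~= pts1

pts1 = np.array([[10.0, 5.0], [22.0, 14.0]])   # made-up corresponding points
pts2 = np.array([[ 4.0, 2.0], [17.0, 10.0]])
print(estimate_translation_sketch(pts1, pts2))  # -> [5.5 3.5]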
Bit 2:¶
Build an affine transformation matrix and warp image 2 into an image that's large enough to fit its extent.
Homework Problems 4-5¶
Give a 3x3 affine transformation matrix that can be used to warp image 2 into image 1's coordinates.
If image 1's origin is at its top left and $t_x$ and $t_y$ are both positive, what's the size of the destination image that can contain the combined image?
Can we code this up in the time remaining?¶
(I have no idea whether this is doable, or if it will work well)
TODO:
- find Harris keypoints (use features.harris_corners then features.get_harris_points)
- extract descriptors (use features.extract_MOPS)
- implement features.compute_distances (use features.ssd and for loops, or use scipy.spatial.distance.cdist)
- implement features.get_matches to find closest matches and threshold by match score
  - For SSD, np.argmin should do
  - For ratio distance, np.argsort is probably where it's at
- implement geometry.estimate_translation to average the differences between correspondences
- build an affine transformation matrix that applies that translation
- warp image 2 into a new image that's large enough to fit both (use geometry.warp)
- add (average? blend somehow?) image 1 into the warped image 2

If you get this working (i.e., running - it may not find a good model!), then try making it multi-scale:
- compute a Gaussian pyramid (implement filtering.gaussian_pyramid; see the sketch after this list)
- adjust the above feature detection, extraction, and transformation estimation steps to account for multiple scales
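We have to write filtering.gaussian_pyramid ourselves, but skimage's built-in pyramid is a handy sanity check; a rough usage sketch (the stand-in image and layer count are arbitrary):

# reference point: skimage's built-in Gaussian pyramid
# (our filtering.gaussian_pyramid should produce comparable levels)
demo = np.random.rand(256, 256).astype(np.float32)   # stand-in image
levels = list(skim.transform.pyramid_gaussian(demo, max_layer=3, downscale=2))
print([lvl.shape for lvl in levels])   # (256, 256), (128, 128), (64, 64), (32, 32)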
# here are two images where a translational alignment should work
y1 = imageio.imread("../data/yos1.jpg").astype(np.float32) / 255
y1 = skim.color.rgb2gray(y1)
y2 = imageio.imread("../data/yos2.jpg").astype(np.float32) / 255
y2 = skim.color.rgb2gray(y2)
util.imshow_gray(y1)
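Once we have a translation estimate, the warp-and-blend step might look roughly like this; I'm using skimage's warp here instead of geometry.warp, the (tx, ty) values are made up, and the blend is as naive as it gets:

# sketch: warp y2 into y1's coordinates given a translation estimate (tx, ty)
# tx, ty are placeholders -- in the real pipeline they come from the matches
tx, ty = 30.0, 12.0
tform = skim.transform.AffineTransform(translation=(tx, ty))  # image-2 coords -> image-1 coords

# canvas big enough to hold both images (assumes same-sized inputs and tx, ty >= 0)
out_shape = (int(np.ceil(y1.shape[0] + ty)), int(np.ceil(y1.shape[1] + tx)))

# skimage's warp takes the inverse map (output coords -> input coords)
warped2 = skim.transform.warp(y2, tform.inverse, output_shape=out_shape)

# crude blend: paste image 1 onto the canvas and average where both have content
# (treating 0 as "empty" is a hack, but fine for a first pass)
canvas = np.zeros(out_shape, dtype=np.float64)
canvas[:y1.shape[0], :y1.shape[1]] = y1
overlap = (canvas > 0) & (warped2 > 0)
combined = canvas + warped2
combined[overlap] /= 2
util.imshow_gray(combined)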