Lecture 6¶
Announcements¶
- Faculty candidate talks coming up!
  - Alli Nilles
    - Tuesday 1/30, 4pm, CF 316, Teaching Demo: Introduction to the Branch-and-Bound Approach
  - Brad McCoy (Computational Topology)
    - Thursday 2/1, 4pm, CF 105, Research Talk: An Invitation to Computational Topology
    - Friday 2/2, 4pm, CF 316, Teaching Demo: Dynamic Programming and Edit Distance
Goals¶
- Analyze the invariances provided (and not provided) by the Harris corner detector
- Understand the mathematical framework for (linear) geometric transformations on images (image warping)
- Know what is possible with 2D linear transformations (scale, shear, rotation)
- Understand the motivation and math behind homogeneous coordinates
- Know what is possible with 2D affine transformations (all of the above, plus translation)
- Understand the concept of invariance as it pertains to feature detectors and feature descriptors
- Know the how and why of the MOPS feature descriptor
Throwback Tuesday: Notes on the Laplacian pyramid down/up weirdness:¶
On the way down, we do:
$$L_i = G_i - blur(G_i)$$ $$G_{i+1} = sub(blur(G_i))$$
On the way back up, we do: $$G_i = up(G_{i+1}) + L_i$$
We'd like for these to be inverses; in other words, if we rearrange the "up" to solve for $L_i$, they should be equal:
$$ \textrm{(downward)\hspace{2em} } G_i - up(G_{i+1}) \stackrel{?}{=} G_i - blur(G_i) \textrm{\hspace{2em}(upward)} $$ Cancel some terms: $$ \textrm{(downward)\hspace{2em} } up(G_{i+1}) \stackrel{?}{=} blur(G_i) \textrm{\hspace{2em}(upward)} $$
But if we plug in the downward definition of $G_{i+1}$ on the left, we get: $$ \textrm{(downward)\hspace{2em} } up(sub(blur(G_i))) \stackrel{?}{=} blur(G_i) \textrm{\hspace{2em}(upward)} $$
This would be true if $up$ and $sub$ were perfect inverses on a blurred image, but they're not.
We can fix this by changing the downward-direction rule; instead of $$L_i = G_i - blur(G_i)$$ we will replace $blur(G_i)$ with the upsampled version of $G_{i+1}$ that we'll actually have on the way back up: $$L_i = G_i - up(G_{i+1})$$ $$L_i = G_i - up(sub(blur(G_i)))$$
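Here's a minimal sketch of the fixed pyramid in code, assuming even image dimensions at every level; the blur/sub/up helpers below are skimage-based stand-ins, not our actual filtering code.
import numpy as np
import skimage as skim

# Stand-in helpers (illustrative only)
def blur(img):
    return skim.filters.gaussian(img, sigma=1.0)

def sub(img):
    return img[::2, ::2]  # keep every other row and column

def up(img):
    return skim.transform.rescale(img, 2, order=1)  # bilinear upsampling

def build_laplacian(G, levels):
    Ls = []
    for _ in range(levels):
        G_next = sub(blur(G))
        Ls.append(G - up(G_next))  # L_i = G_i - up(G_{i+1}): the fixed rule
        G = G_next
    return Ls, G  # per-level detail images plus the final coarse image

def reconstruct(Ls, G):
    for L in reversed(Ls):
        G = up(G) + L  # exactly inverts the downward rule
    return G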
# boilerplate setup
%load_ext autoreload
%autoreload 2
%matplotlib inline
import os
import sys
src_path = os.path.abspath("../src")
if src_path not in sys.path:
sys.path.insert(0, src_path)
# Library imports
import numpy as np
import imageio.v3 as imageio
import matplotlib.pyplot as plt
import skimage as skim
import cv2
# codebase imports
import util
import filtering
import features
import geometry
Plan¶
- Harris Invariances
- 2D linear tx
- Break
- Affine tx
- MOPS descriptor
Good Features:¶
What makes good features?
- Uniqueness: features shouldn't match if they're from different points in the scene.
- Invariance: features should match if they do come from the same point in the scene.
With Harris corners, we achieved local uniqueness. For now, we're going to punt on global uniqueness until later.
Invariance depends on both the feature points selected and on how you describe them. Let's look at the detection side first.
What do we want to be invariant to? Various things that might change from one image to another.
Invariance: Easy Mode¶
plt.figure(figsize=(10, 5))
plt.imshow(imageio.imread("../data/L06/trevi_easy.jpg"))
Invariance: Hard Mode¶
plt.figure(figsize=(10, 5))
plt.imshow(imageio.imread("../data/L06/trevi_hard.jpg"))
Invariance: Mars Mode¶
plt.figure(figsize=(10, 5))
plt.imshow(imageio.imread("../data/L06/mars2.jpg"))
Things we might want to be invariant to (an aspirational list):
- Geometric transformations
- Translation
- Scale
- Rotation
- Viewpoint change / geometry change
- Photometric transformations
- Color/intensity shift/scale
- Contrast adjustment
Others that aren't clearly one of the above: motion blur, noise, occlusion, ...
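For reference, here's the core of the Harris corner computation from before, wrapped in a function here for readability (convolve, sobel_x/sobel_y, separable_filter, and gauss1d5 are our filtering helpers). Which of the changes above affect its output?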
def harris_score(img):
    # Image gradients
    dx = convolve(img, sobel_x)
    dy = convolve(img, sobel_y)
    # Entries of the second-moment matrix, smoothed over a Gaussian window
    A = separable_filter(dx * dx, gauss1d5)
    B = separable_filter(dx * dy, gauss1d5)
    C = separable_filter(dy * dy, gauss1d5)
    # Eigenvalues of [[A, B], [B, C]] are m +/- sqrt(m^2 - p),
    # where m is half the trace and p is the determinant
    det = A*C - B*B
    tr = A + C
    m = tr / 2
    p = det
    sqrtm2mp = np.sqrt(m**2 - p)
    eig1 = m - sqrtm2mp
    eig2 = m + sqrtm2mp
    # Corner strength is the smaller eigenvalue
    return np.minimum(eig1, eig2)
Homework Problem 1¶
Comment on whether (or the extent to which) the Harris corner detector is robust to each of the following transformations. In other words, which of these will not affect which points will be found as corners by the Harris detector? If the detector is almost but not quite completely robust to a given change, comment on this. Assume edge effects and intensity clipping are not an issue.
- Intensity shift: $I(x, y)' = I(x, y) + 20$
- Intensity scale: $I(x, y)' = 1.2 I(x, y)$
- Scaling: $I(x, y)' = I(0.5x, 0.5y)$
- Translation $I(x, y)' = I(x - 10, y)$
- Rotation
A Mathematical Digression: Geometric Transformations¶
Notice that I ran out of math in the last subproblem above - how do I write this down? We've reached the point where it becomes nicer to talk about geometric transformations using the language of linear algebra.
Whiteboard linear algebra review:
- A 2x2 matrix is a mapping from points to points
- You can interpret it two ways:
  - Where does it send a point?
  - What coordinate system does it map points from? This is the change-of-basis view.
- Transformations compose (they are linear!): if $M = AB$, then applying $M$ to $x$ as $Mx$ does $B$ first, then $A$!
- Sometimes this matters, since matrix multiplication doesn't commute; see the quick check below.
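A quick numpy check of that ordering claim (a throwaway example, not from the codebase):
theta = np.radians(30)
R = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])  # rotation (y-down convention)
S = np.array([[2.0, 0.0],
              [0.0, 1.0]])  # stretch x by 2

x = np.array([1.0, 0.0])
print((R @ S) @ x)  # M = RS: S is applied first, then R
print((S @ R) @ x)  # a different point: order matters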
import geometry
ferris = imageio.imread("../data/L06/tx.png").astype(np.float32) / 255
ferris = skim.transform.rescale(ferris[:,:,:3], (0.5, 0.5, 1))
H, W = ferris.shape[:2]
util.imshow_truesize(ferris)
def warp_ferris(tx, **kwargs):
fw = geometry.warp(ferris, tx, **kwargs)
util.imshow_truesize(fw)
# plt.gca().set_axis_off()
M = np.array([
[1.0, 0.0],
[0.0, 1.0]
], dtype=np.float32)
warp_ferris(M, dsize=(W, H))
Homework Problems 2-5¶
Important note: the coordinate system conventions here differ from the homework! Here, (0, 0) is at the top left and y points down. In the HW visualizations, the origin is at the bottom left and y points up.
Scale uniformly by a factor of 1.3:
M = np.array([
[1.3, 0.0],
[0.0, 1.3]
], dtype=np.float32)
warp_ferris(M, dsize=(int(W*1.3), int(H*1.3)), bottom_left_origin=True)
Scale by a factor of 2 in only the $x$ direction:
M = np.array([
[2.0, 0.0],
[0.0, 1.0]
], dtype=np.float32)
warp_ferris(M, dsize=(W*2, H))
Skew or shear the image so the top row of pixels is shifted over by 1/4 of the image's height.
M = np.array([
[1.0, 0.25],
[0.0, 1.0]
], dtype=np.float32)
warp_ferris(M, dsize=(int(W*1.25), H))
Rotate the image counter-clockwise by 30 degrees:
- Rotation matrices: to rotate counter-clockwise by $\theta$, the matrix is: $$ \begin{bmatrix} \cos \theta & \sin \theta \\ -\sin \theta & \cos \theta \\ \end{bmatrix} $$ (This is for our $y$-down image coordinates; in the usual $y$-up convention, the signs on the $\sin \theta$ entries are flipped.)
rad30 = np.radians(30)
M = np.array([
[np.cos(rad30), np.sin(rad30)],
[-np.sin(rad30), np.cos(rad30)]
], dtype=np.float32)
warp_ferris(M, dsize=(int(W), int(H*1.5)))
Translate the image up and right by (say) 40 pixels:
Scott is the worst. This one's impossible!
Proof: for any matrix $M$, $M\mathbf{0} = \mathbf{0}$, so no 2x2 matrix can move the origin anywhere else.
Hack: add a column! This is homogeneous coordinates, and it's the same hack as the bias trick from ML.
Whiteboard: Affine transformations
Just like 2x2 linear transformations can be seen as a change of basis, affine can be seen as a change of frame, where the origin may also move. This will come back to haunt us later!
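A minimal sketch of the added-column trick in numpy (the numbers are just for illustration): append a 1 to each point so a 3x3 matrix can carry the translation in its extra column.
# Translate right 40 and up 40 (y points down here, so "up" is -40)
T = np.array([[1.0, 0.0,  40.0],
              [0.0, 1.0, -40.0],
              [0.0, 0.0,   1.0]], dtype=np.float32)

origin = np.array([0.0, 0.0, 1.0])  # (0, 0) in homogeneous coordinates
print(T @ origin)  # -> [ 40. -40.   1.]: the origin moves!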
Back to the Features!¶
For our feature descriptors, we agreed that a patch of pixels is more unique than a single pixel value. Let's start there, and add invariances to the following:
- Rotation
- (some) intensity shifts and scales ($I' = aI + b$, sort of)
- Scale
The MOPS Feature descriptor¶
Multi-scale oriented patches
- Rotation: rotate the patch so its dominant gradient direction is at angle 0 (horizontal, presumably)
- Intensity: subtract the mean and divide by the standard deviation
- Scale: run it on a Gaussian pyramid
But how do we even extract a patch in the first place?
(whiteboard: patch extraction transformation)
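Before we look at the real extraction code, here's a rough sketch of just the rotation and intensity steps on an already-extracted square patch (the function and its angle argument are illustrative, not our features code; the scale step lives in the Gaussian pyramid and isn't shown):
def normalize_patch(patch, angle_deg):
    # Rotation invariance: rotate so the dominant gradient lies at angle 0
    rotated = skim.transform.rotate(patch, -angle_deg, mode="reflect")
    # Intensity invariance: I' = a*I + b (a > 0) yields the same result,
    # since subtracting the mean absorbs b and dividing by std absorbs a
    return (rotated - rotated.mean()) / (rotated.std() + 1e-8)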
import features
h = imageio.imread("../data/harris_crop.jpg")
h = skim.color.rgb2gray(h.astype(np.float32) / 255)
corner_mask = features.harris_corners(h, 0.1)
plt.imshow(corner_mask)
features.get_harris_points(corner_mask).T
array([[ 1, 1], [ 1, 106], [ 18, 106], [ 19, 7], [ 21, 2], [ 29, 22], [ 38, 106], [ 39, 10], [ 43, 1], [ 47, 105], [ 50, 7], [ 52, 2], [ 59, 32], [ 59, 84], [ 59, 106], [ 67, 104], [ 68, 86], [ 70, 29], [ 73, 46], [ 73, 72], [ 74, 40], [ 74, 56], [ 74, 61], [ 74, 78], [ 79, 40], [ 79, 56], [ 79, 61], [ 79, 73], [ 79, 78], [113, 1], [113, 31], [113, 37], [113, 43], [113, 59], [113, 75], [113, 81], [113, 106]])
util.imshow_gray(h)
y, x = [59, 32]
util.imshow_gray(features.extract_MOPS(h, (y, x)))
plt.colorbar()