Lecture 2

Announcements

Goals

Lesson Plan

Syllabus questions

Revisiting image formation: image noise

Noise as motivation for neighborhood operations

Live code: mean filter

Edge handling choices:

Write down the discrete math definition; aside: write down the continuous definition

Live code: generalize to cross-correlation with filter weights

Mess with the filter weights:

Exercise: compute a small cross-correlation result

Break

More filter weights:

Convolution vs cross-correlation

Properties - shift invariance; linearity; commutativity; associativity

Filter composition using associativity

Tricks using these properties:

Image Formation Revisited

How do we record light intensity inside a camera?

Let's make beans a little noisy.
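A minimal sketch of what "a little noisy" could mean in code. It assumes additive zero-mean Gaussian noise, which is only one simple noise model. The function name `add_gaussian_noise`, the `sigma` value, and the synthetic ramp image standing in for a real photo are illustrative choices, not part of the lecture code.

```python
import numpy as np

def add_gaussian_noise(img, sigma, seed=0):
    """Return img plus zero-mean Gaussian noise with standard deviation sigma.

    Assumes img is a float grayscale image with values in [0, 1].
    """
    rng = np.random.default_rng(seed)
    noisy = img + rng.normal(0.0, sigma, size=img.shape)
    # Keep the result in the valid intensity range
    return np.clip(noisy, 0.0, 1.0)

# Synthetic stand-in for a real photo: a horizontal gray ramp
clean = np.tile(np.linspace(0.2, 0.8, 64), (64, 1))
noisy = add_gaussian_noise(clean, sigma=0.05)
```

Each pixel is corrupted independently here, which is what makes averaging over a neighborhood a plausible way to recover the underlying image.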

Brainstorm

If each pixel measurement is corrupted, how can we improve our guess at what the "ideal" image should have been?

Idea: mean filter

Live-pseudocode:

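import numpy as np

def mean_filter(img, filter_size):
    """ Apply a square spatial mean filter with side length filter_size
    to a grayscale img. The output is the same size as img; pixels outside
    the image are treated as zero (one of several edge-handling choices).
    Preconditions:
      - img is a grayscale (2d) float image
      - filter_size is odd """
    H, W = img.shape

    out = np.zeros_like(img)

    hw = filter_size // 2
    # Zero-pad so the window never indexes outside the array
    padded = np.pad(img, hw, mode="constant")
    for i in range(H):
        for j in range(W):
            total = 0.0
            for ioff in range(-hw, hw+1):
                for joff in range(-hw, hw+1):
                    # (i, j) in img is (i + hw, j + hw) in padded
                    total += padded[i + hw + ioff, j + hw + joff]

            out[i,j] = total / filter_size**2

    return out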

Design Decisions - edge handling

(whiteboard)

Edge handling:

Output size:

For anything but valid, how do you handle when the filter hangs over the void?
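One way to see the options is to pad a tiny image in a few different ways. This sketch uses `np.pad`; the mode names are NumPy's, and the mapping to terms like "zero padding" and "repeat padding" is an assumption about how those terms are used in this course.

```python
import numpy as np

img = np.array([[1., 2., 3.],
                [4., 5., 6.],
                [7., 8., 9.]])

hw = 1  # half-width of a 3x3 filter: how far it can hang over the edge

# Zero padding: everything outside the image is 0
zero_padded = np.pad(img, hw, mode="constant", constant_values=0.0)
# Repeat padding: extend the nearest edge pixel outward
repeat_padded = np.pad(img, hw, mode="edge")
# Mirror padding: reflect the image about its border pixels
mirror_padded = np.pad(img, hw, mode="reflect")
# Wrap padding: treat the image as if it tiles the plane
wrap_padded = np.pad(img, hw, mode="wrap")
```

Filtering the padded array and keeping only the positions where the filter fits entirely inside it gives an output the same size as the original image.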

Is a mean filter the best we can do?

Another example

See the lattice-like artifacts? Ick. Why is this happening?

Alternatives?

TODO: implement filtering.filter

How would we calculate a Gaussian kernel ourselves?

$$G(x, y) = \frac{1}{2 \pi \sigma^2} e^{-\frac{x^2 + y^2}{2 \sigma^2}}$$

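where $(x, y)$ is the offset from the center of the kernel and $\sigma$ is the standard deviation, which controls how wide the blur is.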

Practicalities when calculating this in real life:
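A minimal sketch of one way to build the kernel from the formula above. Two practical choices are baked in as assumptions, not as the lecture's list: the kernel is truncated to a finite window (about 3 standard deviations on each side by default), and the weights are renormalized to sum to 1 so the filter does not brighten or darken the image. The function name `gaussian_kernel` and its `size` parameter are hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma, size=None):
    """Return a size x size Gaussian kernel with (0, 0) at the center.

    Assumes size is odd. If size is None, cover roughly +/- 3 sigma.
    """
    if size is None:
        size = 2 * int(np.ceil(3 * sigma)) + 1
    hw = size // 2
    coords = np.arange(-hw, hw + 1)
    x, y = np.meshgrid(coords, coords)
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    # Truncation loses a little mass in the tails, so renormalize
    return g / g.sum()

kernel = gaussian_kernel(1.0)  # 7x7 kernel for sigma = 1
```

Because of the final normalization, the constant factor $\frac{1}{2 \pi \sigma^2}$ has no effect on the result; it is kept only to mirror the formula.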

HW Problems 1 - 4

(1) $f \otimes w$ indicates the cross-correlation of image $f$ with filter $w$. Compute the following cross-correlation using same output size and zero padding. $$ \begin{bmatrix} 0 & 1 & 0\\ 0 & 1 & 0\\ 0 & 1 & 0 \end{bmatrix} \otimes \begin{bmatrix} 1 & 2 & 1\\ 2 & 4 & 2\\ 1 & 2 & 1 \end{bmatrix} $$

(2) Perform the same cross-correlation as above, but use repeat padding.

(3) Perform the same cross-correlation as above, but use valid output size.

(4) Describe in words the result of applying the following filter using cross-correlation. If you aren't sure, try applying it to the image above to gain intuition.

$$ \begin{bmatrix} 0 & 0 & 0\\ 0 & 0 & 1\\ 0 & 0 & 0 \end{bmatrix} $$

What else could we do with this?

Let's mess with filter weights to do weird stuff.

Break

Math definitions

Discrete cross-correlation:

$$ (f \otimes g)(x, y) = \sum_{j=-\ell}^\ell \sum_{k=-\ell}^\ell f(x+j, y+k) \, g(j, k) $$

Note that $g$ is defined with (0, 0) at the center and has side length $2\ell + 1$.
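The definition translates almost directly into code. This is a minimal sketch, assuming same output size and zero padding (one possible edge-handling choice) and a square, odd-sized filter. The function name `cross_correlate` is illustrative.

```python
import numpy as np

def cross_correlate(f, g):
    """Cross-correlate grayscale image f with a square, odd-sized filter g.

    Same output size, zero padding. g[ell, ell] plays the role of g(0, 0).
    """
    ell = g.shape[0] // 2
    H, W = f.shape
    padded = np.pad(f, ell, mode="constant")
    out = np.zeros((H, W), dtype=float)
    for x in range(H):
        for y in range(W):
            # window[j + ell, k + ell] is f(x + j, y + k), zero outside f
            window = padded[x:x + 2 * ell + 1, y:y + 2 * ell + 1]
            out[x, y] = np.sum(window * g)
    return out

# A filter with a single 1 at the center leaves the image unchanged
f = np.arange(12.0).reshape(3, 4)
identity = np.zeros((3, 3))
identity[1, 1] = 1.0
unchanged = cross_correlate(f, identity)
```

The two inner sums of the definition are hidden inside `np.sum(window * g)`; the mean filter is the special case where every weight is $1 / (2\ell + 1)^2$.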

Turns out there's a continuous version of this too! Sums become integrals:

$$ (f \otimes g)(x, y) = \int_{-\infty}^\infty \int_{-\infty}^\infty f(x+j, y+k) \, g(j, k) \, dj \, dk $$

Why $\infty$? Assume zero outside the boundaries.

Properties of Cross-Correlation

Convolution vs cross-correlation

Properties

Aside: The filter above (with just a 1 in the middle) is called the identity filter.

Convolution

A small modification to cross-correlation yields Convolution:

Cross-correlation ($\otimes$):

$$ (f \otimes g)(x, y) = \sum_{j=-\ell}^\ell \sum_{k=-\ell}^\ell f(x+j, y+k) \, g(j, k) $$

Convolution ($*$):

$$ (f * g)(x, y) = \sum_{j=-\ell}^\ell \sum_{k=-\ell}^\ell f(x-j, y-k) \, g(j, k) $$

This effectively flips the filter horizontally and vertically before applying it, and gains us commutativity.
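The flip can be checked numerically with an impulse image (a single 1 on a black background). This is a sketch under the same assumptions as before: same output size, zero padding, square odd-sized filter. The helper names are illustrative.

```python
import numpy as np

def cross_correlate(f, g):
    """Cross-correlation with same output size and zero padding."""
    ell = g.shape[0] // 2
    padded = np.pad(f, ell, mode="constant")
    out = np.zeros(f.shape, dtype=float)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            window = padded[x:x + 2 * ell + 1, y:y + 2 * ell + 1]
            out[x, y] = np.sum(window * g)
    return out

def convolve(f, g):
    """Convolution: cross-correlation with the filter flipped on both axes."""
    return cross_correlate(f, g[::-1, ::-1])

impulse = np.zeros((5, 5))
impulse[2, 2] = 1.0
g = np.arange(1.0, 10.0).reshape(3, 3)  # deliberately asymmetric weights

conv_result = convolve(impulse, g)          # stamps a copy of g
corr_result = cross_correlate(impulse, g)   # stamps a flipped copy of g
```

For a symmetric filter such as a box or Gaussian, the flip changes nothing, so the two operations give identical results.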

Okay, what can we do with these properties?

We can blur, but can we sharpen?

Key question: what did blurring remove?

$I - blur(I)$

What if we add back what's lost?

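(whiteboard - equations here for posterity)
\begin{align*}
I' &= I + (I - (I * G))\\
&= (I + I) - (I * G)\\
&= (I * D) - (I * G)\\
&= I * (D - G)
\end{align*}

Here $G$ is the blur filter, and $D$ is the filter with a single 2 at its center (twice the identity filter), so that $I * D = I + I$.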
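The derivation above says that "add back what blurring removed" collapses into a single filter. A minimal numerical check, assuming a 3x3 box blur for $G$ and reading $D$ as a filter with a 2 at the center and zeros elsewhere (both are illustrative choices). All filters here are symmetric, so cross-correlation and convolution agree.

```python
import numpy as np

def filter_same(f, g):
    """Cross-correlation, same output size, zero padding.
    Equals convolution when g is symmetric under a 180-degree flip."""
    ell = g.shape[0] // 2
    padded = np.pad(f, ell, mode="constant")
    out = np.zeros(f.shape, dtype=float)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            window = padded[x:x + 2 * ell + 1, y:y + 2 * ell + 1]
            out[x, y] = np.sum(window * g)
    return out

rng = np.random.default_rng(0)
img = rng.random((8, 8))          # stand-in for the image I

G = np.full((3, 3), 1.0 / 9.0)    # box blur (assumed choice of blur filter)
D = np.zeros((3, 3))
D[1, 1] = 2.0                     # twice the identity filter

two_step = img + (img - filter_same(img, G))   # I + (I - I * G)
one_filter = filter_same(img, D - G)           # I * (D - G)
```

The two results agree up to floating-point rounding, which is linearity at work: filtering with a difference of filters equals the difference of the filtered images.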

Visual intuition:

Efficiency trick: separable filters
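A sketch of the idea: when a 2D filter is the outer product of a column and a row, filtering with the row and then the column gives the same result as filtering once with the 2D filter. The 1x5 weights below are an arbitrary example, and `filter_same` (same output size, zero padding, odd filter dimensions) is an illustrative helper.

```python
import numpy as np

def filter_same(f, g):
    """Cross-correlation with a possibly non-square filter g.
    Same output size, zero padding; both dimensions of g must be odd."""
    gh, gw = g.shape
    ph, pw = gh // 2, gw // 2
    padded = np.pad(f, ((ph, ph), (pw, pw)), mode="constant")
    out = np.zeros(f.shape, dtype=float)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            out[x, y] = np.sum(padded[x:x + gh, y:y + gw] * g)
    return out

rng = np.random.default_rng(1)
F = rng.random((10, 12))

row = np.array([[1., 4., 6., 4., 1.]]) / 16.0   # 1x5 filter
col = row.T                                      # 5x1 filter
full_2d = col @ row                              # 5x5 outer product

one_pass = filter_same(F, full_2d)
two_pass = filter_same(filter_same(F, row), col)
```

Both versions produce the same image; what differs is how many multiplications each one needs per output pixel.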

Homework Problems 9-10:

  1. Compute a 3x3 filter by convolving the following $1 \times 3$ filter with its transpose using full output size and zero padding: $$ \begin{bmatrix} 1 & 2 & 1\\ \end{bmatrix} $$

  2. Suppose you have an image $F$ and you want to apply a $3 \times 3$ filter $H$ like the one above that can be written as $H = G * G^T$, where $G$ is $1 \times 3$. Which of the following will be more efficient?

    • $F * (G * G^T)$
    • $(F * G) * G^T$

What can't we do with convolution?

Homework Problem 11:

(11) For each of the following, decide whether it's possible to design a convolution filter that performs the given operation.