Get a feel for some of the image processing operations that can and can't be accomplished using linear filtering.
Know the properties of the convolution and cross-correlation, and some of their implications:
Syllabus questions
Revisiting image formation: image noise
Noise as motivation for neighborhood operations
Live code: mean filter
Edge handling choices:
Write down the discrete math definition; aside: write down the continuous definition
Live code: generalize to cross-correlation with filter weights
Mess with the filter weights:
Exercise: compute a small cross-correlation result
Break
More filter weights:
Convolution vs cross-correlation
Properties - shift invariance; linearity; commutativity; associativity
Filter composition using associativity
Tricks using these properties:
# boilerplate setup
%load_ext autoreload
%autoreload 2
import os
import sys
src_path = os.path.abspath("../src")
if (src_path not in sys.path):
sys.path.insert(0, src_path)
# Library imports
import numpy as np
import imageio.v3 as imageio
import matplotlib.pyplot as plt
import skimage as skim
import cv2
# codebase imports
import util
import filtering
Let's make beans a little noisy.
# don't worry about the details here, we're just adding noise to beans
beans = imageio.imread("../data/beans.jpg")
bg = skim.color.rgb2gray(beans)
bg = skim.transform.rescale(bg, 0.25, anti_aliasing=True)
bn = bg + np.random.randn(*bg.shape) * 0.05
plt.imshow(bn, cmap="gray")
<matplotlib.image.AxesImage at 0x7f76d89d7090>
If each pixel measurement is corrupted, how can we improve our guess at what the "ideal" image should have been?
Idea: mean filter
Live-pseudocode:
def mean_filter(img, filter_size):
""" Apply a square spatial mean filter with side length filter_size
to a grayscale img. Preconditions:
- img is a grayscale (2d) float image
- filter_size is odd """
H, W = img.shape
out = np.zeros_like(img)
hw = filter_size // 2
for i in range(H):
for j in range(W):
total = 0.0
for ioff in range(-hw, hw+1):
for joff in range(-hw, hw+1):
total += img[i + ioff, j + joff]
out[i,j] = total / filter_size**2
# Look at mean_filter
plt.imshow(filtering.mean_filter(bn, 5), cmap="gray")
<matplotlib.image.AxesImage at 0x7f76f183c750>
(whiteboard)
Edge handling:
Output size:
For anything but valid, how do you handle when the filter hangs over the void?
Another example
sticks = skim.color.rgb2gray(imageio.imread("sticks.png"))
plt.imshow(sticks, cmap="gray")
<matplotlib.image.AxesImage at 0x7f76d8aae910>
plt.imshow(filtering.mean_filter(sticks, 9), cmap="gray")
<matplotlib.image.AxesImage at 0x7f76d8a0e910>
See the lattice-like artifacts? Ick. Why is this happening?
Alternatives?
Here's one:
plt.imshow(cv2.GaussianBlur(sticks, ksize=(9,9), sigmaX=2), cmap="gray")
<matplotlib.image.AxesImage at 0x7f76d2f1e910>
TODO: implement filtering.filter
# test out the case of the mean filter we've been using
f = np.ones((9,9)) / (9*9)
plt.imshow(filtering.filter(sticks, f), cmap="gray")
<matplotlib.image.AxesImage at 0x7f76d80d75d0>
# don't look behind the curtain!
g = np.zeros((9, 9))
g[4,4] = 1
g = cv2.GaussianBlur(g, ksize=(9, 9), sigmaX=2)
g /= g.sum()
plt.imshow(filtering.filter(sticks, g), cmap="gray")
<matplotlib.image.AxesImage at 0x7f76d8134750>
plt.imshow(f, cmap="gray")
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0x7f76d29d1c90>
plt.imshow(g, cmap="gray")
<matplotlib.image.AxesImage at 0x7f76d2820890>
where:
Practicalities when calculating this in real life:
(1) $f \otimes w$ indicates the cross-correlation of image $f$ with filter $w$. Compute the following cross-correlation using same output size and zero padding. $$ \begin{bmatrix} 0 & 1 & 0\\ 0 & 1 & 0\\ 0 & 1 & 0 \end{bmatrix} \otimes \begin{bmatrix} 1 & 2 & 1\\ 2 & 4 & 2\\ 1 & 2 & 1 \end{bmatrix} $$
(2) Perform the same convolution as above, but use repeat padding.
(3) Perform the same convolution as above, but use valid output size.
(4) Describe in words the result of applying the following filter using cross-correlation. If you aren't sure, try applying it to the image above to gain intuition.
$$ \begin{bmatrix} 0 & 0 & 0\\ 0 & 0 & 1\\ 0 & 0 & 0 \end{bmatrix} $$Let's mess with filter weights to do weird stuff.
h = np.array([
[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 0, 1, 0, 0],
[0, 0, 0, 1, 0],
[0, 0, 0, 0, 1]],dtype=np.float64) / 5
plt.imshow(np.hstack([sticks, filtering.filter(sticks, h)]), cmap="gray")
<matplotlib.image.AxesImage at 0x7f76d235f910>
h = np.array([
[0, 0, 0, 0, 1],
[0, 0, 0, 1, 0],
[0, 0, 0, 0, 0],
[0, -1, 0, 0, 0],
[-1, 0, 0, 0, 0]], dtype=np.float64)
plt.imshow(filtering.filter(bg, h), cmap="gray")
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0x7f76d25f9c90>
Discrete cross-correlation:
$$ (f \otimes g)(x, y) = \sum_{j=-\ell}^\ell \sum_{k=-\ell}^\ell f(x+j, y+k) * g(j, k) $$Note that $g$ is defined with (0, 0) at the center.
Turns out there's a continuous version of this too! Sums become integrals:
$$ (f \otimes g)(x, y) = \int_{j=-\infty}^\infty \int_{k=-\infty}^\infty f(x+j, y+k) * g(j, k) $$Why $\infty$? Assume zero outside the boundaries.
Convolution vs cross-correlation
Properties
Aside: The filter above (with just a 1 in the middle is called the identity filter.
A small modification to cross-correlation yields Convolution:
Cross-correlation ($\otimes$): $$ (f \otimes g)(x, y) = \sum_{j=-\ell}^\ell \sum_{k=-\ell}^\ell f(x+j, y+k) * g(j, k) $$ Convolution ($*$): $$ (f * g)(x, y) = \sum_{j=-\ell}^\ell \sum_{k=-\ell}^\ell f(x-j, y-k) * g(j, k) $$
This effectively flips the filter horizontally and vertically before applying it, and gains us commutativity.
orig = bg
blurred = filtering.filter(bg, g)
lost = orig - blurred
plt.imshow(orig, cmap="gray")
<matplotlib.image.AxesImage at 0x7f76d23d7dd0>
plt.imshow(blurred, cmap="gray")
<matplotlib.image.AxesImage at 0x7f76d23982d0>
plt.imshow(lost, cmap="gray")
<matplotlib.image.AxesImage at 0x7f76d2260890>
What if we add back what's lost?
plt.imshow(np.hstack([orig, orig + lost]), cmap="gray")
<matplotlib.image.AxesImage at 0x7f76d22ea910>
(whiteboard - equations here for posterity) \begin{align*} I' &= I + (I - (I * G))\\ &= (I + I) - (I * G))\\ &= (I * D) - (I * G)\\ &= I * (D - G)\\ \end{align*}
Visual intuition:
d = np.zeros_like(g)
d[4, 4] = 2
sharp = d - g
sharpened = filtering.filter(bg, sharp)
plt.imshow(np.hstack([orig, sharpened]), cmap="gray")
<matplotlib.image.AxesImage at 0x7f76d22d35d0>
Homework Problems 9-10:
Compute a 3x3 filter by convolving the following $1 \times 3$ filter with its transpose using full output size and zero padding: $$ \begin{bmatrix} 1 & 2 & 1\\ \end{bmatrix} $$
Suppose you have an image $F$ and you want to apply a $3 \times 3$ filter $H$ like the one above that can be written as $H = G * G^T$, where $G$ is $1 \times 3$. Which of the following will be more efficient?
Homework Problem 11:
(11) For each of the following, decide whether it's possible to design a convolution filter that performs the given operation.
Threshold: the output pixel is