Feedback for P01 and P02 came back yesterday. I added a few notes on logistics to the syllabus, just so they're written down somewhere:
Project 1 is out today. We'll talk about it a bit in class on Thursday.
Notebooks: L??_476.ipynb and L??_576.ipynb
# boilerplate setup
%load_ext autoreload
%autoreload 2
%matplotlib inline
import os
import sys
src_path = os.path.abspath("../src")
if src_path not in sys.path:
    sys.path.insert(0, src_path)
# Library imports
import numpy as np
import imageio.v3 as imageio
import matplotlib.pyplot as plt
import skimage as skim
import cv2
# codebase imports
import util
import filtering
Images are functions; differentiation is an operator. What does that mean?
Since images are functions of two variables, a single-valued derivative must be a partial derivative: $\frac{\partial f}{\partial x}$ or $\frac{\partial f}{\partial y}$.
We have discrete (i.e., sampled) images, so we need to approximate this with finite differences. Let's design a convolution kernel that accomplishes this.
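Before building the kernel, it's worth seeing the finite-difference idea numerically. A minimal 1D check (plain numpy, not the course code): sample a function on a unit-spaced grid and compare the central difference against the true derivative.

```python
import numpy as np

# Sample f(x) = x^2 on a unit-spaced grid; the true derivative is 2x.
x = np.arange(6, dtype=np.float64)
f = x ** 2

# Central difference: (f[i+1] - f[i-1]) / 2, defined at interior samples.
central = (f[2:] - f[:-2]) / 2

print(central)         # → [2. 4. 6. 8.]
print(2 * x[1:-1])     # → [2. 4. 6. 8.]
```

The central difference is exact here because its leading error term involves the third derivative, which vanishes for a quadratic.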
Whiteboard: calculus reminder
Consider the following two candidate horizontal derivative filters.
$$ \begin{bmatrix} 1 & -1 & 0 \end{bmatrix} $$
$$ \begin{bmatrix} 1 & 0 & -1 \end{bmatrix} $$

bikes = imageio.imread("../data/bikesgray.jpg").astype(np.float32) / 255.0
bikes = skim.transform.rescale(bikes, 0.5, anti_aliasing=True)
bikes = bikes + np.random.randn(*bikes.shape) * 0.05
util.imshow_gray(bikes)
dx = np.array([-1, 0, 1]).reshape((1, 3)) / 2
dy = dx.T
Look at filtering.py: I updated the filter function to handle non-square kernels, and added convolve, which just flips the kernel then runs filter.
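The flip-then-filter relationship is easy to sanity-check. This is a sketch, not the course implementation: scipy's ndimage.correlate stands in for filtering.filter (both are plain cross-correlation).

```python
import numpy as np
from scipy import ndimage

def convolve(img, kernel):
    # Convolution = flip the kernel in both axes, then cross-correlate.
    # (ndimage.correlate plays the role of filtering.filter here.)
    return ndimage.correlate(img, kernel[::-1, ::-1], mode="constant")

rng = np.random.default_rng(0)
img = rng.random((5, 7))
k = np.array([[1.0, 0.0, -1.0]])  # a non-square (1x3) kernel

# Agrees with scipy's own convolution.
print(np.allclose(convolve(img, k), ndimage.convolve(img, k, mode="constant")))  # → True
```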
bx = filtering.filter(bikes, dx)
by = filtering.filter(bikes, dy)
util.imshow_gray(np.hstack([bx, by]))
util.imshow_gray(np.hstack([bx[30:80, :50], by[30:80, :50]]))
Let's look at the intensities along a single scanline. This one is a vertical scanline that crosses the brick pattern.
plt.plot(bikes[:,100])
plt.plot(by[:,100])
This motivates an idea: blur the noise so that the real edges stick out!
Why use 2 filters when you could use just 1?
Compute the following convolution using zero padding and describe the effect of the resulting kernel in words. $$ \begin{bmatrix} 1 & 2 & 1\\ 2 & 4 & 2\\ 1 & 2 & 1 \end{bmatrix} * \begin{bmatrix} 0 & 0 & 0\\ 1 & 0 & -1\\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} \ & \ & \ \\ \ & \ & \ \\ \hspace{1em} & \hspace{1em} & \hspace{1em} \end{bmatrix} $$
Let's check our work:
blur = np.array([
[1, 2, 1],
[2, 4, 2],
[1, 2, 1]], dtype=np.float32)
dx = np.array([
[0, 0, 0],
[1, 0, -1],
[0, 0, 0]], dtype=np.float32)
# check our answer
xsobel = filtering.convolve(blur, dx)
ysobel = xsobel.T
xsobel
array([[ 2.,  0., -2.],
       [ 4.,  0., -4.],
       [ 2.,  0., -2.]], dtype=float32)
For whatever reason*, this is more often written scaled down by 1/2:
*Probable reason: if you do repeat padding, this is what you get!
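That footnote can be checked directly. In scipy (used here as a stand-in for the course's filtering module), mode="nearest" replicates edge samples, i.e. repeat padding, and convolving the blur kernel with dx under it lands exactly on the half-scale Sobel:

```python
import numpy as np
from scipy import ndimage

blur = np.array([[1, 2, 1],
                 [2, 4, 2],
                 [1, 2, 1]], dtype=np.float32)
dx = np.array([[0, 0,  0],
               [1, 0, -1],
               [0, 0,  0]], dtype=np.float32)

# mode="nearest" replicates the edge values (repeat padding).
print(ndimage.convolve(blur, dx, mode="nearest"))
# → [[ 1.  0. -1.]
#    [ 2.  0. -2.]
#    [ 1.  0. -1.]]
```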
xsobel /= 2
ysobel = xsobel.T
xsobel, ysobel
(array([[ 1.,  0., -1.],
        [ 2.,  0., -2.],
        [ 1.,  0., -1.]], dtype=float32),
 array([[ 1.,  2.,  1.],
        [ 0.,  0.,  0.],
        [-1., -2., -1.]], dtype=float32))
bx = filtering.convolve(bikes, xsobel)
by = filtering.convolve(bikes, ysobel)
util.imshow_gray(np.hstack([bx, by]))
plt.plot(by[:,100])
Direction-independent edge detector? First pass: gradient magnitude.
$$ \nabla f = \begin{bmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{bmatrix} $$

Edge strength: gradient magnitude $\|\nabla f\|$
plt.imshow(np.sqrt(bx ** 2 + by**2), cmap="gray")
This is useful enough that I wrote filtering.grad_mag to make it simple to do.
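For reference, here is a minimal re-implementation of what grad_mag presumably does. This is a hypothetical sketch using scipy; the course version lives in filtering.py and uses its own convolve.

```python
import numpy as np
from scipy import ndimage

XSOBEL = np.array([[1, 0, -1],
                   [2, 0, -2],
                   [1, 0, -1]], dtype=np.float32)

def grad_mag(img):
    # Horizontal and vertical Sobel responses, combined into a single
    # direction-independent edge strength per pixel.
    gx = ndimage.convolve(img, XSOBEL, mode="nearest")
    gy = ndimage.convolve(img, XSOBEL.T, mode="nearest")
    return np.hypot(gx, gy)  # elementwise sqrt(gx**2 + gy**2)

# A vertical step edge: strong response at the step, zero in flat regions.
step = np.zeros((5, 8), dtype=np.float32)
step[:, 4:] = 1.0
m = grad_mag(step)
```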
util.imshow_gray(filtering.grad_mag(bikes))
Classical fancier method: Canny Edge Detector
Whiteboard - intuition (only) for the Fourier decomposition
A very imprecise way of talking about what spatial frequencies are present in an image.
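One concrete (1D) version of that intuition, sketched with numpy's FFT: build a scanline out of two known sinusoids and ask which frequency bins hold the energy.

```python
import numpy as np

n = 256
x = np.arange(n)
# A "scanline" containing exactly two spatial frequencies:
# 3 cycles across the row (slow sweep) and 40 cycles (fine ripple).
scanline = np.sin(2 * np.pi * 3 * x / n) + 0.5 * np.sin(2 * np.pi * 40 * x / n)

# Fourier magnitudes: the energy concentrates in bins 3 and 40.
mag = np.abs(np.fft.rfft(scanline))
top2 = sorted(np.argsort(mag)[-2:].tolist())
print(top2)  # → [3, 40]
```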
beans = imageio.imread("../data/beans.jpg").astype(np.float32) / 255.0
bg = skim.color.rgb2gray(beans) # grayscale beans
plt.imshow(beans)
plt.imshow(beans[400:500, 100:200])
plt.imshow(beans[20:120, 500:600])
plt.imshow(beans[420:480, 550:600])
Low-pass: allows low frequencies to pass through unaffected, i.e., attenuates high frequencies.
High-pass: allows high frequencies to pass through unaffected, i.e., attenuates low frequencies.
Question that didn't make it onto the homework: in what sense is Sobel not truly a high-pass filter?
(3) Using the language of "low-" and "high-frequency" image content, explain why sharpening is not the inverse of blurring, and what it accomplishes instead.
(4) Consider the original image of beans on the left, and the processed version on the right. Describe what has changed in terms of frequency content.
(5) What's the maximum frequency (expressed in full periods per pixel) representable in a 1D image (i.e., a row of pixels)? What does such an image look like?
(6) What's the minimum frequency representable in a 1D image? What does such an image look like?
My image is too big to fit on my screen. For example, suppose beans is 600x600, but I want to display the image in 300x300 pixels. What should I do?
bg.shape # beans grayscale
(600, 600)
util.imshow_truesize(bg)
util.imshow_truesize(bg[::2,::2])
bricks = imageio.imread("../data/bricks.jpg").astype(np.float32) / 255.0
plt.imshow(bricks[::4,::4,:])
checker = np.zeros((1, 17))
checker[:, ::2] = 1.0
util.imshow_gray(checker)
plt.imshow(checker[:, 1::2], vmin=0, vmax=1, cmap="gray") # force color scale to [0,1] range
If you walked far away from the above image until you couldn't distinguish individual pixels, what would it look like?
Whiteboard: downsampling frequencyometer
# todo: implement filtering.down_2x
util.imshow_truesize(bg[::4,::4])
# todo: implement down_2x
util.imshow_truesize(filtering.down_4x(bg))
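A hypothetical sketch of what down_2x and down_4x could look like: low-pass filter first so frequencies above the new Nyquist limit don't alias, then subsample. The sigma here is a rough choice, not necessarily the course's.

```python
import numpy as np
from scipy import ndimage

def down_2x(img, sigma=1.0):
    # Blur so the coarser grid doesn't alias, then keep every
    # other sample in each dimension.
    return ndimage.gaussian_filter(img, sigma=sigma)[::2, ::2]

def down_4x(img):
    # Two rounds of 2x downsampling.
    return down_2x(down_2x(img))
```

With these definitions, down_4x would take a 600x600 image like bg to 150x150.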
My image is too small for my screen. For example, suppose beans is 150x150, but I want to display the image in 600x600 pixels. What should I do?
beans150 = filtering.down_4x(bg)
util.imshow_truesize(beans150)
See naive version preimplemented in filtering.up_2x
util.imshow_truesize(filtering.up_2x(beans150, interp="none"))
Whiteboard: Filtering view of upsampling
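One way the filtering view could be implemented, as a sketch assuming the course's up_2x signature: interp="none" returns the raw zero-stuffed grid, and "linear" convolves it with a tent reconstruction filter.

```python
import numpy as np
from scipy import ndimage

def up_2x(img, interp="linear"):
    # Zero-stuff: copy the samples onto the even positions of a 2x grid.
    up = np.zeros((2 * img.shape[0], 2 * img.shape[1]))
    up[::2, ::2] = img
    if interp == "none":
        return up
    # Linear reconstruction: a 2D tent filter. Its weights sum to 4,
    # compensating for the 3/4 of the grid that was filled with zeros.
    tent = np.outer([0.5, 1.0, 0.5], [0.5, 1.0, 0.5])
    return ndimage.convolve(up, tent, mode="constant")
```

The tent kernel leaves the original samples untouched and linearly interpolates between them; a "gaussian" variant would instead blur the zero-stuffed grid with a Gaussian of matching gain.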
# todo: implement reconstruction filtering version in up_2x
# - Gaussian reconstruction filter
# - Linear reconstruction filter
util.imshow_truesize(filtering.up_4x(beans150, interp="gaussian"))
plt.colorbar()