Feedback for P01 and P02 came back yesterday. I added a few notes on logistics to the syllabus, just so they're written down somewhere:
Project 1 is out today. We'll talk about it a bit in class on Thursday.
L??_476.ipynb
and L??_576.ipynb
# boilerplate setup
%load_ext autoreload
%autoreload 2
%matplotlib inline
import os
import sys
src_path = os.path.abspath("../src")
if src_path not in sys.path:
    sys.path.insert(0, src_path)
# Library imports
import numpy as np
import imageio.v3 as imageio
import matplotlib.pyplot as plt
import skimage as skim
import cv2
# codebase imports
import util
import filtering
Images are functions; differentiation is an operator. What does that mean?

Since images are functions of two variables, a single-valued derivative has to be a partial derivative:

$$ \frac{\partial f}{\partial x}(x, y) = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h} $$

We have discrete (i.e., sampled) images, so we need to approximate this with finite differences. Let's design a convolution kernel that accomplishes this.
Whiteboard: calculus reminder
Consider the following two candidate horizontal derivative filters.
$$ \begin{bmatrix} 1 & -1 & 0 \end{bmatrix} $$
$$ \begin{bmatrix} 1 & 0 & -1 \end{bmatrix} $$
bikes = imageio.imread("../data/bikesgray.jpg").astype(np.float32) / 255.0
bikes = skim.transform.rescale(bikes, 0.5, anti_aliasing=True)
bikes = bikes + np.random.randn(*bikes.shape) * 0.05
util.imshow_gray(bikes)
dx = np.array([-1, 0, 1]).reshape((1, 3)) / 2
dy = dx.T
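A quick sanity check in plain NumPy (no course code assumed): applied to a linear ramp, the central-difference estimate should recover the slope everywhere in the valid interior.

```python
import numpy as np

# ramp image f(x, y) = 3x: constant horizontal slope of 3
ramp = 3.0 * np.tile(np.arange(10.0), (5, 1))

# central difference (f(x+1) - f(x-1)) / 2, valid region only
dx_est = (ramp[:, 2:] - ramp[:, :-2]) / 2
# every estimate equals the true slope, 3
```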
Look at filtering.py: the filter function has been generalized to handle non-square kernels, and there is a new convolve function that just flips the kernel then runs filter.
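A minimal sketch of that relationship (filter2d below is a hypothetical stand-in for filtering.filter; here it does zero-padded cross-correlation):

```python
import numpy as np

def filter2d(img, kernel):
    # zero-padded cross-correlation (stand-in for filtering.filter)
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def convolve(img, kernel):
    # convolution = cross-correlation with the kernel flipped in both axes
    return filter2d(img, kernel[::-1, ::-1])
```

Convolving an impulse with a kernel reproduces the kernel; cross-correlating reproduces it flipped, which is why the distinction matters for asymmetric kernels like derivative filters.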
bx = filtering.filter(bikes, dx)
by = filtering.filter(bikes, dy)
util.imshow_gray(np.hstack([bx, by]))
util.imshow_gray(np.hstack([bx[30:80, :50], by[30:80, :50]]))
Let's look at the intensities along a single scanline. This one is a vertical scanline that crosses the brick pattern.
plt.plot(bikes[:,100])
plt.plot(by[:,100])
This motivates an idea: blur the noise so that the real edges stick out!
Why use 2 filters when you could use just 1?
Compute the following convolution, which results in a new filter kernel, and describe the effect of this new kernel in words. $$ \begin{bmatrix} 1 & 2 & 1\\ 2 & 4 & 2\\ 1 & 2 & 1 \end{bmatrix} * \begin{bmatrix} 0 & 0 & 0\\ 1 & 0 & -1\\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} \ & \ & \ \\ \ & \ & \ \\ \hspace{1em} & \hspace{1em} & \hspace{1em} \end{bmatrix} $$
Let's check our work:
blur = np.array([
[1, 2, 1],
[2, 4, 2],
[1, 2, 1]], dtype=np.float32)
dx = np.array([
[0, 0, 0],
[1, 0, -1],
[0, 0, 0]], dtype=np.float32)
# check our answer
xsobel = filtering.convolve(blur, dx)
ysobel = xsobel.T
xsobel
array([[ 2.,  0., -2.],
       [ 4.,  0., -4.],
       [ 2.,  0., -2.]], dtype=float32)
For whatever reason, this is more often written scaled down by 1/2:
xsobel /= 2
ysobel = xsobel.T
xsobel, ysobel
(array([[ 1.,  0., -1.],
        [ 2.,  0., -2.],
        [ 1.,  0., -1.]], dtype=float32),
 array([[ 1.,  2.,  1.],
        [ 0.,  0.,  0.],
        [-1., -2., -1.]], dtype=float32))
bx = filtering.convolve(bikes, xsobel)
by = filtering.convolve(bikes, ysobel)
util.imshow_gray(np.hstack([bx, by]))
plt.plot(by[:,100])
Direction-independent edge detector? First pass: gradient magnitude

$$ \nabla f = \begin{bmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{bmatrix} $$

Edge strength: gradient magnitude $\|\nabla f\|$
plt.imshow(np.sqrt(bx ** 2 + by**2), cmap="gray")
This is useful enough that I wrote filtering.grad_mag to make it simple to do.
util.imshow_gray(filtering.grad_mag(bikes))
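For reference, a plausible sketch of what grad_mag computes, using scipy's built-in Sobel filters instead of the course code (the details of filtering.grad_mag are assumed here):

```python
import numpy as np
from scipy import ndimage

def grad_mag(img):
    # Sobel derivative estimates in x and y, then the
    # per-pixel Euclidean norm of the gradient vector
    gx = ndimage.sobel(img, axis=1)
    gy = ndimage.sobel(img, axis=0)
    return np.sqrt(gx**2 + gy**2)
```

On a vertical step edge, the response peaks at the edge and vanishes in the flat regions.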
Classical fancier method: Canny Edge Detector
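We won't implement it here, but skimage ships an implementation we can poke at (the sigma value below is purely illustrative):

```python
import numpy as np
from skimage import feature

# synthetic test image: bright square on a dark background
img = np.zeros((64, 64))
img[16:48, 16:48] = 1.0

edges = feature.canny(img, sigma=2.0)  # boolean edge map
```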
Whiteboard - intuition (only) for the Fourier decomposition
A very imprecise way of talking about what spatial frequencies are present in an image.
beans = imageio.imread("../data/beans.jpg").astype(np.float32) / 255.0
bg = skim.color.rgb2gray(beans) # grayscale beans
plt.imshow(beans)
plt.imshow(beans[400:500, 100:200])
plt.imshow(beans[20:120, 500:600])
plt.imshow(beans[420:480, 550:600])
Low-pass: allows low frequencies to pass through unaffected, i.e., attenuates high frequencies.
High-pass: allows high frequencies to pass through unaffected, i.e., attenuates low frequencies.
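A 1D illustration in plain NumPy: a box blur is a crude low-pass filter, and subtracting its output from the original signal leaves (roughly) the high-frequency content.

```python
import numpy as np

n = 64
low = np.linspace(0.0, 1.0, n)         # slow ramp: low frequency
high = 0.2 * (-1.0) ** np.arange(n)    # fast alternation: highest frequency
signal = low + high

# 5-tap box blur = crude low-pass
kernel = np.ones(5) / 5
lowpass = np.convolve(np.pad(signal, 2, mode="edge"), kernel, mode="valid")

# what the low-pass removed is the high-frequency content
highpass = signal - lowpass
```

The low-pass output tracks the ramp; the residual tracks the alternation.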
Question that didn't make it onto the homework: in what sense is Sobel not truly a high-pass filter?
(3) Using the language of "low-" and "high-frequency" image content, explain why sharpening is not the inverse of blurring, and what it accomplishes instead.
(4) Consider the original image of beans on the left, and the processed version on the right. Describe what has changed in terms of frequency content.
(5) What's the maximum frequency (expressed in full periods per pixel) representable in a 1D image (i.e., a row of pixels)? What does such an image look like?
(6) What's the minimum frequency representable in a 1D image? What does such an image look like?
My image is too big to fit on my screen. For example, suppose beans is 600x600, but I want to display the image in 300x300 pixels. What should I do?
bg.shape # beans grayscale
(600, 600)
util.imshow_truesize(bg)
util.imshow_truesize(bg[::2,::2])
bricks = imageio.imread("../data/bricks.jpg").astype(np.float32) / 255.0
plt.imshow(bricks[::4,::4,:])
checker = np.zeros((1, 16))
checker[:, ::2] = 1.0
util.imshow_gray(checker)
plt.imshow(checker[:, 1::2], vmin=0, vmax=1, cmap="gray") # force color scale to [0,1] range
If you walked far away from the above image until you couldn't distinguish individual pixels, what would it look like?
Whiteboard: downsampling frequencyometer
# naive downsampling: keep every 4th sample, no prefiltering
util.imshow_truesize(bg[::4,::4])
# todo: implement filtering.down_2x and down_4x
util.imshow_truesize(filtering.down_4x(bg))
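One plausible shape for down_2x (a sketch under assumptions, not necessarily what filtering.down_2x ends up being): Gaussian-blur to suppress frequencies above the new Nyquist limit, then drop every other sample; applying it twice gives a 4x reduction.

```python
import numpy as np
from scipy import ndimage

def down_2x(img):
    # prefilter so frequencies above the new Nyquist limit don't alias,
    # then keep every other row and column
    blurred = ndimage.gaussian_filter(img, sigma=1.0)
    return blurred[::2, ::2]
```

On an alternating checkerboard, naive subsampling aliases badly (it can return all-white), while the prefiltered version lands near the true average gray.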
My image is too small for my screen. For example, suppose beans is 150x150, but I want to display the image in 600x600 pixels. What should I do?
beans150 = filtering.down_4x(bg)
util.imshow_truesize(beans150)
See the naive (nearest-neighbor) version preimplemented in filtering.up_2x.
util.imshow_truesize(filtering.up_4x(beans150, interp="nn"))
Whiteboard: Filtering view of upsampling
# todo: implement reconstruction filtering version in up_2x
# - Gaussian reconstruction filter
# - Linear reconstruction filter
util.imshow_truesize(filtering.up_4x(beans150, interp="linear"))
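A sketch of the reconstruction-filtering view for the linear case (again a guess at up_2x's shape, not the course implementation): insert zeros between samples, then filter with a "tent" kernel whose taps sum to 2 per axis, which reproduces linear interpolation in the interior.

```python
import numpy as np
from scipy import ndimage

def up_2x_linear(img):
    h, w = img.shape
    up = np.zeros((2 * h, 2 * w))
    up[::2, ::2] = img                 # insert zeros between samples
    tent = np.array([0.5, 1.0, 0.5])   # linear (tent) reconstruction filter
    # separable filtering; zero padding leaves a half-weight artifact
    # only on the last row/column
    up = ndimage.convolve1d(up, tent, axis=0, mode="constant")
    up = ndimage.convolve1d(up, tent, axis=1, mode="constant")
    return up
```

Each original sample survives unchanged (the tent's center tap is 1), and each inserted zero becomes the average of its two neighbors, which is exactly linear interpolation.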