Goal
Detect step edges in a greyscale image by finding pixels that are local maxima of the smoothed gradient magnitude along the gradient direction, then linking them into connected contours via hysteresis thresholding. Input: a scalar greyscale image and a smoothing scale . Output: a thin binary edge map in which each marked pixel is one pixel wide and carries an edge-strength value equal to at that location. The filter shape is derived as the variational optimum of three criteria — detection signal-to-noise ratio (), localisation precision (), and single-response spacing () — for a step-edge profile under additive white Gaussian noise; the solution is unique up to spatial scaling.
Algorithm
Let denote the scalar greyscale image. Let denote the 1D Gaussian with scale . Let denote the impulse response of a candidate linear detection filter of finite support . Let denote the mean-squared noise amplitude per unit length. Let denote the step-edge signal model, where is the unit step (Eq. 16).
Signal-to-noise ratio of the filter response at the step-edge centre (Eq. 3):
Reciprocal of the root-mean-squared distance between the detected and true edge position (Eq. 9):
Mean distance between adjacent noise response maxima, constrained to prevent multiple detections per edge (Eq. 13):
where is a fixed fraction of the operator support and the expected number of noise maxima per support width is (Eq. 14). The parameter is chosen so that the probability of a multiple response equals the probability of a false detection (Eq. 40).
The composite criterion is the product (Eq. 10), optimised subject to the constraint Eq. 13. The scale-invariance result (Eq. 21) states that for a spatially scaled filter :
so the product is invariant under spatial scaling. The design problem therefore has a single solution shape.
Among linear filters satisfying Eq. 13, the variational optimum for a step-edge signal is unique up to spatial scaling. The constrained optimum achieves (filter 6, Fig. 4). The first derivative of a Gaussian
is the efficient approximation. Its performance terms are (Eq. 43):
These yield the composite criterion (Eq. 44) and multiple-response value (Eq. 45) — approximately 20% below the optimal and 10% below the constrained-optimal .
Procedure
- Smooth. Compute via two separable 1D Gaussian convolutions — one along rows, one along columns.
- Gradient. Compute and via finite differences. Magnitude: . Unit normal: (Eq. 46).
- Non-maximum suppression (NMS). Retain pixel only if is greater than or equal to the gradient magnitude at both sub-pixel neighbours along , i.e. at (Eq. 47). Suppressed pixels are set to zero. This condition is equivalent to a zero-crossing of the directional second derivative .
- Hysteresis double-threshold. Pixels surviving NMS with are accepted as edge seeds. A BFS flood-fill (8-connectivity) propagates each seed through neighbouring NMS-survivors with . Pixels in not reachable from any seed are discarded. The recommended ratio is (§VI).
flowchart LR
I["I(x,y)"] --> S["Gaussian<br/>smooth (σ)"]
S --> G["gradient<br/>|∇S|, n̂"]
G --> N["NMS along n̂"]
N --> H["hysteresis<br/>T_l ≤ … ≤ T_h"]
H --> E["edge map"]
Multi-scale and directional extensions
The multi-scale feature-synthesis scheme (§VIII) runs the detector at multiple values of and suppresses large- responses already predicted by smaller- responses; most edge information is recovered at the smallest scale. The directional operators (§IX) replace the isotropic Gaussian with 6 elongated masks oriented at 30° intervals, using sample spacing and 5 samples per mask; worst-case angle error is 15° and the minimum response fall-off is approximately 85% of the on-axis maximum.
Implementation
The four-stage pipeline in Rust:
fn canny(img: &[f32], w: usize, h: usize, sigma: f32, t_low: f32, t_high: f32) -> Vec<bool> {
// Stage 1+2: smooth, then gradient magnitude + quantised direction (helpers hidden).
let s = gaussian_blur(img, w, h, sigma); // two 1-D passes, O(N * sigma)
let (gx, gy) = gradients(&s, w, h); // central differences
let mag: Vec<f32> = gx.iter().zip(&gy).map(|(a, b)| a.hypot(*b)).collect();
let bin: Vec<u8> = gx.iter().zip(&gy).map(|(a, b)| {
let d = b.atan2(*a).to_degrees().rem_euclid(180.0);
if d < 22.5 || d >= 157.5 { 0 } else if d < 67.5 { 1 }
else if d < 112.5 { 2 } else { 3 }
}).collect();
// Stage 3: non-maximum suppression along the quantised gradient direction.
let mut nms = vec![0f32; w * h];
for y in 1..h - 1 { for x in 1..w - 1 {
let i = y * w + x;
let (p, q) = match bin[i] {
0 => (mag[i + 1], mag[i - 1]), // 0°
1 => (mag[i + 1 - w], mag[i - 1 + w]), // 45°
2 => (mag[i - w], mag[i + w]), // 90°
_ => (mag[i - 1 - w], mag[i + 1 + w]), // 135°
};
if mag[i] >= p && mag[i] >= q { nms[i] = mag[i]; }
}}
// Stage 4: hysteresis double-threshold via BFS from above-T_h seeds.
let mut edge = vec![false; w * h];
let mut queue = std::collections::VecDeque::new();
for i in 0..w * h { if nms[i] >= t_high { edge[i] = true; queue.push_back(i); } }
while let Some(i) = queue.pop_front() {
let (x, y) = ((i % w) as i32, (i / w) as i32);
for dy in -1i32..=1 { for dx in -1i32..=1 {
if (dx | dy) == 0 { continue; }
let (nx, ny) = (x + dx, y + dy);
if nx < 0 || ny < 0 || nx >= w as i32 || ny >= h as i32 { continue; }
let j = ny as usize * w + nx as usize;
if !edge[j] && nms[j] >= t_low { edge[j] = true; queue.push_back(j); }
}}
}
edge
}
The same procedure in Python using NumPy for the NMS stage:
import numpy as np
from collections import deque
def canny(img, sigma, t_low, t_high):
s = gaussian_blur(img, sigma)
gx, gy = np.gradient(s, axis=1), np.gradient(s, axis=0)
mag = np.hypot(gx, gy)
angle = np.degrees(np.arctan2(gy, gx)) % 180
nms = np.zeros_like(mag)
bins = [
(((angle < 22.5) | (angle >= 157.5)), (1, 0)), # 0°
(((angle >= 22.5) & (angle < 67.5)), (1, 1)), # 45°
(((angle >= 67.5) & (angle < 112.5)), (0, 1)), # 90°
(((angle >=112.5) & (angle < 157.5)), (1, -1)), # 135°
]
for mask, (dx, dy) in bins:
p = np.roll(mag, (-dy, -dx), axis=(0, 1))
q = np.roll(mag, ( dy, dx), axis=(0, 1))
keep = mask & (mag >= p) & (mag >= q)
nms[keep] = mag[keep]
edge = nms >= t_high
q = deque(zip(*np.where(edge)))
H, W = nms.shape
while q:
y, x = q.popleft()
for ny, nx in [(y+dy, x+dx) for dy in (-1,0,1) for dx in (-1,0,1) if dy or dx]:
if 0 <= ny < H and 0 <= nx < W and not edge[ny, nx] and nms[ny, nx] >= t_low:
edge[ny, nx] = True; q.append((ny, nx))
return edge
Remarks
- The three criteria (Eq. 3), (Eq. 9), and (Eq. 13) define a unique linear operator shape for the step-edge model. The first derivative of a Gaussian is the standard approximation; it achieves and — approximately 20% below the optimal — at a large computational saving from Gaussian separability.
- Detection and localisation trade off via (Eq. 21): doubling doubles and halves . The product is scale-invariant; no choice of improves both simultaneously.
- The locally-straight-edge precondition fails at corners and junctions. NMS along the gradient direction smears the response across several pixels at junctions; the exact location is not recovered.
- Single-threshold marking causes streaking: a contour fluctuating near threshold is broken into segments. Hysteresis with (§VI) eliminates most false breaks; a contour must drop below , not merely below , to be interrupted.
- The separable 2D Gaussian gives smoothing cost for an -pixel image, where is the kernel radius in pixels (§VII).
- A single scale either misses fine detail (large ) or generates excessive false responses in textured regions (small ). The multi-scale feature-synthesis scheme (§VIII) is noted in the paper's conclusions as an open problem; most edge information is recovered at the smallest channel.
References
- J. Canny. A Computational Approach to Edge Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(6)–698, 1986. PDF