Canny Edge Detector | VitaVision
Back to atlas

Canny Edge Detector

7 min readIntermediateView in graph
Based on
A Computational Approach to Edge Detection
Canny · IEEE Transactions on Pattern Analysis and Machine Intelligence 1986
DOI ↗

Goal

Detect step edges in a greyscale image I:ΩRI: \Omega \to \mathbb{R} by finding pixels that are local maxima of the smoothed gradient magnitude (GσI)|\nabla(G_\sigma * I)| along the gradient direction, then linking them into connected contours via hysteresis thresholding. Input: a scalar greyscale image II and a smoothing scale σ\sigma. Output: a thin binary edge map in which each marked pixel is one pixel wide and carries an edge-strength value equal to (GσI)|\nabla(G_\sigma * I)| at that location. The filter shape is derived as the variational optimum of three criteria — detection signal-to-noise ratio (Σ\Sigma), localisation precision (Λ\Lambda), and single-response spacing (xmaxx_\text{max}) — for a step-edge profile under additive white Gaussian noise; the solution is unique up to spatial scaling.

Algorithm

Let I(x,y)I(x,y) denote the scalar greyscale image. Let Gσ(x)=exp(x2/(2σ2))G_\sigma(x) = \exp(-x^2/(2\sigma^2)) denote the 1D Gaussian with scale σ\sigma. Let f(x)f(x) denote the impulse response of a candidate linear detection filter of finite support [W,+W][-W, +W]. Let n02n_0^2 denote the mean-squared noise amplitude per unit length. Let G(x)=Au1(x)G(x) = A\,u_{-1}(x) denote the step-edge signal model, where u1u_{-1} is the unit step (Eq. 16).

Definition
Detection criterion (\Sigma)

Signal-to-noise ratio of the filter response at the step-edge centre (Eq. 3):

Σ(f)=W+WG(x)f(x)dxn0(W+Wf2(x)dx)1/2.\Sigma(f) = \frac{\displaystyle\left|\int_{-W}^{+W} G(-x)\,f(x)\,dx\right|}{n_0\,\displaystyle\left(\int_{-W}^{+W} f^2(x)\,dx\right)^{1/2}}.
Definition
Localisation criterion (\Lambda)

Reciprocal of the root-mean-squared distance between the detected and true edge position (Eq. 9):

Λ(f)=W+WG(x)f(x)dxn0(W+Wf2(x)dx)1/2.\Lambda(f) = \frac{\displaystyle\left|\int_{-W}^{+W} G'(-x)\,f'(x)\,dx\right|}{n_0\,\displaystyle\left(\int_{-W}^{+W} f'^2(x)\,dx\right)^{1/2}}.
Definition
Single-response constraint (x_\text{max})

Mean distance between adjacent noise response maxima, constrained to prevent multiple detections per edge (Eq. 13):

xmax(f)=2xzc(f)=kW,x_\text{max}(f) = 2\,x_\text{zc}(f) = kW,

where kk is a fixed fraction of the operator support and the expected number of noise maxima per support width is Nn=2W/xmax=2/kN_n = 2W/x_\text{max} = 2/k (Eq. 14). The parameter kk is chosen so that the probability of a multiple response pmp_m equals the probability of a false detection pfp_f (Eq. 40).

The composite criterion is the product ΣΛ\Sigma \cdot \Lambda (Eq. 10), optimised subject to the constraint Eq. 13. The scale-invariance result (Eq. 21) states that for a spatially scaled filter fw(x)=f(x/w)f_w(x) = f(x/w):

Σ(fw)=wΣ(f),Λ(fw)=1wΛ(f),\Sigma(f_w) = w\,\Sigma(f), \qquad \Lambda(f_w) = \tfrac{1}{w}\,\Lambda(f),

so the product ΣΛ\Sigma \cdot \Lambda is invariant under spatial scaling. The design problem therefore has a single solution shape.

Among linear filters satisfying Eq. 13, the variational optimum for a step-edge signal is unique up to spatial scaling. The constrained optimum achieves r0.576r \leq 0.576 (filter 6, Fig. 4). The first derivative of a Gaussian

G(x)=xσ2exp ⁣(x22σ2)(Eq. 42)G'(x) = -\frac{x}{\sigma^2}\,\exp\!\left(-\frac{x^2}{2\sigma^2}\right) \quad \text{(Eq. 42)}

is the efficient approximation. Its performance terms are (Eq. 43):

0fdx=1,+f2dx=12σ,+f2dx=14σ3,+f2dx=18σ5.\int_0^\infty f\,dx = 1, \quad \int_{-\infty}^{+\infty} f^2\,dx = \frac{1}{\sqrt{2}\,\sigma}, \quad \int_{-\infty}^{+\infty} f'^2\,dx = \frac{1}{4\sigma^3}, \quad \int_{-\infty}^{+\infty} f''^2\,dx = \frac{1}{8\sigma^5}.

These yield the composite criterion ΣΛ=0.92/(3σ)\Sigma\Lambda = 0.92/(3\sigma) (Eq. 44) and multiple-response value r0.51r \approx 0.51 (Eq. 45) — approximately 20% below the optimal ΣΛ\Sigma\Lambda and 10% below the constrained-optimal r0.576r \approx 0.576.

Procedure

Algorithm
Canny edge detection
Input: Greyscale image II on domain Ω\Omega; smoothing scale σ\sigma; low threshold TlT_l; high threshold ThT_h.
Output: Thin binary edge map; each marked pixel is a local gradient-magnitude maximum along the gradient direction and is connected to at least one above-ThT_h seed.
  1. Smooth. Compute S=GσIS = G_\sigma * I via two separable 1D Gaussian convolutions — one along rows, one along columns.
  2. Gradient. Compute Sx=S/xS_x = \partial S / \partial x and Sy=S/yS_y = \partial S / \partial y via finite differences. Magnitude: S=Sx2+Sy2|\nabla S| = \sqrt{S_x^2 + S_y^2}. Unit normal: n^=(Sx,Sy)/S\hat{n} = (S_x, S_y)/|\nabla S| (Eq. 46).
  3. Non-maximum suppression (NMS). Retain pixel (x,y)(x,y) only if S(x,y)|\nabla S|(x,y) is greater than or equal to the gradient magnitude at both sub-pixel neighbours along n^\hat{n}, i.e. at (x,y)±n^(x,y) \pm \hat{n} (Eq. 47). Suppressed pixels are set to zero. This condition is equivalent to a zero-crossing of the directional second derivative 2S/n^2\partial^2 S / \partial \hat{n}^2.
  4. Hysteresis double-threshold. Pixels surviving NMS with STh|\nabla S| \geq T_h are accepted as edge seeds. A BFS flood-fill (8-connectivity) propagates each seed through neighbouring NMS-survivors with STl|\nabla S| \geq T_l. Pixels in [Tl,Th)[T_l, T_h) not reachable from any seed are discarded. The recommended ratio is Th/Tl[2,3]T_h/T_l \in [2, 3] (§VI).
flowchart LR
    I["I(x,y)"] --> S["Gaussian<br/>smooth (σ)"]
    S --> G["gradient<br/>|∇S|, n̂"]
    G --> N["NMS along n̂"]
    N --> H["hysteresis<br/>T_l ≤ … ≤ T_h"]
    H --> E["edge map"]

Multi-scale and directional extensions

The multi-scale feature-synthesis scheme (§VIII) runs the detector at multiple values of σ\sigma and suppresses large-σ\sigma responses already predicted by smaller-σ\sigma responses; most edge information is recovered at the smallest scale. The directional operators (§IX) replace the isotropic Gaussian with 6 elongated masks oriented at 30° intervals, using d/σ1.4d/\sigma \approx 1.4 sample spacing and 5 samples per mask; worst-case angle error is 15° and the minimum response fall-off is approximately 85% of the on-axis maximum.

Implementation

The four-stage pipeline in Rust:

fn canny(img: &[f32], w: usize, h: usize, sigma: f32, t_low: f32, t_high: f32) -> Vec<bool> {
    // Stage 1+2: smooth, then gradient magnitude + quantised direction (helpers hidden).
    let s = gaussian_blur(img, w, h, sigma);                  // two 1-D passes, O(N * sigma)
    let (gx, gy) = gradients(&s, w, h);                       // central differences
    let mag: Vec<f32> = gx.iter().zip(&gy).map(|(a, b)| a.hypot(*b)).collect();
    let bin: Vec<u8> = gx.iter().zip(&gy).map(|(a, b)| {
        let d = b.atan2(*a).to_degrees().rem_euclid(180.0);
        if d < 22.5 || d >= 157.5 { 0 } else if d < 67.5 { 1 }
        else if d < 112.5 { 2 } else { 3 }
    }).collect();

    // Stage 3: non-maximum suppression along the quantised gradient direction.
    let mut nms = vec![0f32; w * h];
    for y in 1..h - 1 { for x in 1..w - 1 {
        let i = y * w + x;
        let (p, q) = match bin[i] {
            0 => (mag[i + 1],     mag[i - 1]),                //   0°
            1 => (mag[i + 1 - w], mag[i - 1 + w]),            //  45°
            2 => (mag[i - w],     mag[i + w]),                //  90°
            _ => (mag[i - 1 - w], mag[i + 1 + w]),            // 135°
        };
        if mag[i] >= p && mag[i] >= q { nms[i] = mag[i]; }
    }}

    // Stage 4: hysteresis double-threshold via BFS from above-T_h seeds.
    let mut edge = vec![false; w * h];
    let mut queue = std::collections::VecDeque::new();
    for i in 0..w * h { if nms[i] >= t_high { edge[i] = true; queue.push_back(i); } }
    while let Some(i) = queue.pop_front() {
        let (x, y) = ((i % w) as i32, (i / w) as i32);
        for dy in -1i32..=1 { for dx in -1i32..=1 {
            if (dx | dy) == 0 { continue; }
            let (nx, ny) = (x + dx, y + dy);
            if nx < 0 || ny < 0 || nx >= w as i32 || ny >= h as i32 { continue; }
            let j = ny as usize * w + nx as usize;
            if !edge[j] && nms[j] >= t_low { edge[j] = true; queue.push_back(j); }
        }}
    }
    edge
}

The same procedure in Python using NumPy for the NMS stage:

import numpy as np
from collections import deque

def canny(img, sigma, t_low, t_high):
    s = gaussian_blur(img, sigma)
    gx, gy = np.gradient(s, axis=1), np.gradient(s, axis=0)
    mag = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180
    nms = np.zeros_like(mag)
    bins = [
        (((angle <  22.5) | (angle >= 157.5)), (1,  0)),  #   0°
        (((angle >= 22.5) & (angle <  67.5)),  (1,  1)),  #  45°
        (((angle >= 67.5) & (angle < 112.5)),  (0,  1)),  #  90°
        (((angle >=112.5) & (angle < 157.5)),  (1, -1)),  # 135°
    ]
    for mask, (dx, dy) in bins:
        p = np.roll(mag, (-dy, -dx), axis=(0, 1))
        q = np.roll(mag, ( dy,  dx), axis=(0, 1))
        keep = mask & (mag >= p) & (mag >= q)
        nms[keep] = mag[keep]
    edge = nms >= t_high
    q = deque(zip(*np.where(edge)))
    H, W = nms.shape
    while q:
        y, x = q.popleft()
        for ny, nx in [(y+dy, x+dx) for dy in (-1,0,1) for dx in (-1,0,1) if dy or dx]:
            if 0 <= ny < H and 0 <= nx < W and not edge[ny, nx] and nms[ny, nx] >= t_low:
                edge[ny, nx] = True; q.append((ny, nx))
    return edge

Remarks

  • The three criteria Σ\Sigma (Eq. 3), Λ\Lambda (Eq. 9), and xmaxx_\text{max} (Eq. 13) define a unique linear operator shape for the step-edge model. The first derivative of a Gaussian is the standard approximation; it achieves ΣΛ=0.92/(3σ)\Sigma\Lambda = 0.92/(3\sigma) and r0.51r \approx 0.51 — approximately 20% below the optimal ΣΛ\Sigma\Lambda — at a large computational saving from Gaussian separability.
  • Detection and localisation trade off via σ\sigma (Eq. 21): doubling σ\sigma doubles Σ\Sigma and halves Λ\Lambda. The product ΣΛ\Sigma\Lambda is scale-invariant; no choice of σ\sigma improves both simultaneously.
  • The locally-straight-edge precondition fails at corners and junctions. NMS along the gradient direction n^\hat{n} smears the response across several pixels at junctions; the exact location is not recovered.
  • Single-threshold marking causes streaking: a contour fluctuating near threshold is broken into segments. Hysteresis with Th/Tl[2,3]T_h/T_l \in [2, 3] (§VI) eliminates most false breaks; a contour must drop below TlT_l, not merely below ThT_h, to be interrupted.
  • The separable 2D Gaussian gives O(Nσ)O(N\sigma) smoothing cost for an NN-pixel image, where σ\sigma is the kernel radius in pixels (§VII).
  • A single scale σ\sigma either misses fine detail (large σ\sigma) or generates excessive false responses in textured regions (small σ\sigma). The multi-scale feature-synthesis scheme (§VIII) is noted in the paper's conclusions as an open problem; most edge information is recovered at the smallest channel.

References

  1. J. Canny. A Computational Approach to Edge Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(6)
    –698, 1986. PDF