Harris Corner Detector | VitaVision

Harris Corner Detector

Vitaly Vorobyev · 4 min read · intermediate
Tags: computer-vision, feature-detection, corner

Goal

Detect corners in a grayscale image $I: \Omega \to \mathbb{R}$ on pixel domain $\Omega \subset \mathbb{Z}^2$. Output: a set of integer pixel locations $\{(x_i, y_i)\} \subset \Omega$ at which intensity varies strongly in two independent directions. The detector responds to local image structure where both principal curvatures of the local autocorrelation function are large; flat regions and edges are suppressed.

Algorithm

Let $I_x, I_y: \Omega \to \mathbb{R}$ denote the image derivatives in the horizontal and vertical directions, approximated by central differences or Sobel operators. Let $w(u, v)$ denote a Gaussian window with standard deviation $\sigma$:

$$w(u, v) = \exp\!\left(-\frac{u^2 + v^2}{2\sigma^2}\right).$$
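For concreteness, the window can be materialised as a normalised discrete kernel. A minimal sketch, where the helper name and the truncation radius $r = \lceil 3\sigma \rceil$ are illustrative assumptions rather than choices made by the article:

```rust
// Sketch: discrete Gaussian window, truncated at radius r = ceil(3*sigma)
// (an assumed convention) and normalized so the weights sum to 1.
fn gaussian_window(sigma: f32) -> Vec<Vec<f32>> {
    let r = (3.0 * sigma).ceil() as i32;
    let n = (2 * r + 1) as usize;
    let mut w = vec![vec![0.0f32; n]; n];
    let mut sum = 0.0f32;
    for u in -r..=r {
        for v in -r..=r {
            // w(u, v) = exp(-(u^2 + v^2) / (2 sigma^2))
            let g = (-((u * u + v * v) as f32) / (2.0 * sigma * sigma)).exp();
            w[(u + r) as usize][(v + r) as usize] = g;
            sum += g;
        }
    }
    for row in w.iter_mut() {
        for x in row.iter_mut() {
            *x /= sum; // normalize to unit total weight
        }
    }
    w
}

fn main() {
    let w = gaussian_window(1.0);
    println!("{} x {} window, center weight {}", w.len(), w.len(), w[3][3]);
}
```

Normalising the weights keeps the response magnitude independent of $\sigma$, which matters when comparing $R$ against a fixed threshold $\tau$.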

Let $\alpha, \beta$ denote the eigenvalues of the structure tensor $M$ (defined below) at a given pixel.

Definition
Structure tensor (M)

The $2 \times 2$ symmetric matrix summarising gradient covariance in a local neighbourhood.

$$M = \sum_{(u,v)} w(u,v) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix},$$

with entries $A = I_x^2 \otimes w$, $B = I_y^2 \otimes w$, $C = (I_x I_y) \otimes w$, so that $\operatorname{tr}(M) = A + B = \alpha + \beta$ and $\det(M) = AB - C^2 = \alpha\beta$.

Definition
Harris response (R)

A rotation-invariant corner score expressed in terms of $\operatorname{tr}(M)$ and $\det(M)$ to avoid explicit eigendecomposition.

$$R = \det(M) - k \cdot \operatorname{tr}(M)^2,$$

where $k$ is an empirical sensitivity constant, typically in the range $0.04$–$0.06$. $R > 0$ in corner regions, $R < 0$ on edges, and $|R|$ is small in flat regions.
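The trace/determinant formulation can be checked against explicit eigenvalues of a symmetric $2 \times 2$ matrix. A small sketch, with illustrative values and helper names:

```rust
// Eigenvalues of a symmetric 2x2 matrix [[a, c], [c, b]] in closed form.
fn eigenvalues_sym2(a: f32, b: f32, c: f32) -> (f32, f32) {
    let tr = a + b;
    let d = ((a - b) * (a - b) + 4.0 * c * c).sqrt();
    ((tr + d) / 2.0, (tr - d) / 2.0)
}

// R computed from det and trace, with no eigendecomposition.
fn harris_r(a: f32, b: f32, c: f32, k: f32) -> f32 {
    (a * b - c * c) - k * (a + b) * (a + b)
}

fn main() {
    let k = 0.05;
    // Corner-like tensor: both eigenvalues large, so R > 0, and the
    // det/trace form agrees with alpha*beta - k*(alpha + beta)^2.
    let (alpha, beta) = eigenvalues_sym2(5.0, 3.0, 1.0);
    let r = harris_r(5.0, 3.0, 1.0, k);
    assert!((r - (alpha * beta - k * (alpha + beta).powi(2))).abs() < 1e-4);
    // Edge-like tensor: one dominant eigenvalue, so R < 0.
    assert!(harris_r(10.0, 0.0, 0.0, k) < 0.0);
    println!("corner R = {r}");
}
```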

Procedure

Algorithm
Harris corner detection
Input: Grayscale image $I$ on domain $\Omega$; Gaussian window parameter $\sigma$; sensitivity constant $k$; threshold $\tau$.
Output: Set of integer pixel locations $\{(x_i, y_i)\}$ marking detected corners.
  1. Compute $I_x$ and $I_y$ across the image via central differences or Sobel operators.
  2. Form the per-pixel products $I_x^2$, $I_y^2$, and $I_x I_y$.
  3. Convolve each product with the Gaussian window $w$ to obtain the entries $A$, $B$, $C$ of $M$ at every pixel.
  4. Compute $R(x, y) = (A B - C^2) - k (A + B)^2$ at every pixel.
  5. Discard pixels with $R(x, y) \leq \tau$.
  6. Apply non-maximum suppression: retain only pixels whose $R$ value is an 8-way local maximum.
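Steps 5 and 6 can be sketched as follows; the function name and the choice to skip the one-pixel border are assumptions, not details fixed by the article:

```rust
// Sketch of steps 5-6: threshold the response map, then keep only
// 8-way local maxima. `resp` is a row-major response map such as the
// one produced by the response computation below.
fn detect_corners(resp: &[f32], w: usize, h: usize, tau: f32) -> Vec<(usize, usize)> {
    let mut out = Vec::new();
    for y in 1..h - 1 {
        for x in 1..w - 1 {
            let r = resp[y * w + x];
            if r <= tau {
                continue; // step 5: discard weak responses
            }
            // step 6: 8-way non-maximum suppression; ties with a
            // neighbour suppress the pixel (plateaus yield no corner)
            let mut is_max = true;
            for dy in -1i32..=1 {
                for dx in -1i32..=1 {
                    if dx == 0 && dy == 0 {
                        continue;
                    }
                    let ny = (y as i32 + dy) as usize;
                    let nx = (x as i32 + dx) as usize;
                    if resp[ny * w + nx] >= r {
                        is_max = false;
                    }
                }
            }
            if is_max {
                out.push((x, y));
            }
        }
    }
    out
}

fn main() {
    let mut resp = vec![0.0f32; 25];
    resp[2 * 5 + 2] = 10.0; // single strong peak at (2, 2)
    println!("{:?}", detect_corners(&resp, 5, 5, 1.0)); // prints "[(2, 2)]"
}
```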

Implementation

The per-pixel response computation in Rust:

fn harris_response(img: &[f32], w: usize, h: usize, sigma: f32, k: f32) -> Vec<f32> {
    // Step 1: gradients via central differences
    let (ix, iy) = gradients(img, w, h);
    // Step 2: per-pixel products
    let ixx: Vec<f32> = ix.iter().zip(&ix).map(|(a, b)| a * b).collect();
    let iyy: Vec<f32> = iy.iter().zip(&iy).map(|(a, b)| a * b).collect();
    let ixy: Vec<f32> = ix.iter().zip(&iy).map(|(a, b)| a * b).collect();
    // Step 3: Gaussian-blur each component to obtain M's entries A, B, C
    let a = gaussian_blur(&ixx, w, h, sigma);
    let b = gaussian_blur(&iyy, w, h, sigma);
    let c = gaussian_blur(&ixy, w, h, sigma);
    // Step 4: R = det(M) - k * tr(M)^2 per pixel
    a.iter().zip(&b).zip(&c)
        .map(|((&aa, &bb), &cc)| (aa * bb - cc * cc) - k * (aa + bb).powi(2))
        .collect()
}

gradients and gaussian_blur are standard helpers; gradients returns central-difference derivatives and gaussian_blur applies a separable Gaussian kernel. Each commented step in harris_response corresponds directly to a step in the Procedure above.
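The article leaves gradients unspecified; a minimal central-difference version matching the signature used above, with the one-pixel border left at zero (both choices are assumptions), might look like this:

```rust
// Sketch of the `gradients` helper: central differences in x and y over
// a row-major image, with border pixels left at zero.
fn gradients(img: &[f32], w: usize, h: usize) -> (Vec<f32>, Vec<f32>) {
    let mut ix = vec![0.0f32; w * h];
    let mut iy = vec![0.0f32; w * h];
    for y in 1..h - 1 {
        for x in 1..w - 1 {
            let i = y * w + x;
            ix[i] = (img[i + 1] - img[i - 1]) * 0.5; // d/dx
            iy[i] = (img[i + w] - img[i - w]) * 0.5; // d/dy
        }
    }
    (ix, iy)
}

fn main() {
    // Horizontal ramp I(x, y) = x, so Ix = 1 and Iy = 0 in the interior.
    let img: Vec<f32> = (0..25).map(|i| (i % 5) as f32).collect();
    let (ix, iy) = gradients(&img, 5, 5);
    assert!((ix[2 * 5 + 2] - 1.0).abs() < 1e-6);
    assert!(iy[2 * 5 + 2].abs() < 1e-6);
    println!("interior Ix = {}, Iy = {}", ix[12], iy[12]);
}
```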

Remarks

  • Complexity: $O(|\Omega|)$ per scale: gradient computation, three convolutions (one per product $I_x^2$, $I_y^2$, $I_x I_y$), and a fixed scalar arithmetic pass per pixel. Separable Gaussian convolutions vectorize over rows.
  • The sensitivity constant $k$ is typically in 0.04–0.06. The response is a continuous function of $k$; smaller values suppress edge responses less aggressively.
  • The response is rotation-invariant by construction: $\operatorname{tr}(M)$ and $\det(M)$ are both rotation-invariant quantities.
  • The detector is not scale-invariant. The response peaks at the scale of the integration window $\sigma$; multi-scale detection requires running the detector on a Gaussian image pyramid.
  • The detector responds to general 2D intensity variation; it does not encode X-junction geometry. For chessboard calibration targets, a domain-specific detector yields higher selectivity.

References

  1. C. Harris, M. J. Stephens. A Combined Corner and Edge Detector. Alvey Vision Conference, 1988. DOI: 10.5244/c.2.23
  2. J. Shi, C. Tomasi. Good Features to Track. IEEE CVPR, 1994. DOI: 10.1109/CVPR.1994.323794
