Definition
A homography is an invertible map between two projective planes . In the context of image formation, a homography maps every point in one image of a planar scene to the corresponding point in another image of the same scene, under the projective camera model. It is the unique map that is consistent with perspective projection from two camera centers onto a common world plane.
A homography is represented by a non-singular matrix acting on homogeneous coordinates: if is a point in image 1, its image in image 2 is
where denotes equality up to a non-zero scalar (projective equivalence). Because and represent the same map for any , a homography has degrees of freedom.
Mathematical Description
Projective geometry of the homography
A homography maps:
- Points (projective equivalence).
- Lines (covariant under the dual transformation).
- Conics .
Homographies preserve collinearity (points on a line map to points on a line) and the cross-ratio of four collinear points. They do not in general preserve angles, distances, or parallelism.
Special cases
Last row is ; the map is an affinity. Has 6 degrees of freedom.
Affinity with for rotation and scale . Has 4 degrees of freedom: , , , .
; the map is a rigid transform (rotation + translation). Has 3 degrees of freedom.
The hierarchy is: projective affinity similarity Euclidean.
Physical correspondences
A homography arises in four distinct physical situations:
- Planar scene: two cameras view the same planar surface; the projection onto each image is a homography of the plane.
- Pure rotation: a camera rotates about its optical center with no translation; all depth levels map consistently, giving a homography between the two image planes.
- Projective texture-mapping: the mapping from a reference texture atlas to a rendered image is a homography of the texture plane.
- Image rectification: the warp that remaps two perspective images into a common rectified plane is a homography applied to each image.
Estimation: DLT and normalized DLT
Given point correspondences , each correspondence yields two linear constraints on the -vector :
Stacking correspondences gives the design matrix ; the solution is the right singular vector of corresponding to the smallest singular value (Direct Linear Transform, DLT). The Hartley normalization conditions the design matrix before DLT: translate each point set to zero mean and scale isotropically so the mean distance from the origin is , reducing the condition number of by orders of magnitude.
Decomposition for calibration
When the camera intrinsic matrix is known, a homography between a world plane and the image can be decomposed into extrinsic parameters. Write where , are the first two columns of the rotation matrix and is the translation. From the columns and :
with . This decomposition is the core of Zhang's planar calibration method: observing the same planar target from views yields homographies, and the constraints and give two linear equations per homography in the entries of .
Numerical Concerns
Data normalization (Hartley normalization) is mandatory. Without normalizing the coordinate systems of both point sets before DLT, the condition number of is proportional to the squared image coordinate range (typically ), leading to large numerical errors in the recovered . The normalized DLT conditions to a condition number near 1 by applying the similarity transforms and and recovering .
Minimal configuration and degeneracy. Four point correspondences are the minimum for DLT; fewer than four yield an underdetermined system. Degenerate configurations exist even with four or more points: if three or more points are collinear in either image, or if all points lie on a conic, the homography is not uniquely determined. RANSAC sampling should avoid or detect collinear triplets.
Gauge fixing for nonlinear refinement. is defined only up to scale; fixing or removes the scale ambiguity. The first normalization is better behaved for iterative nonlinear least squares because it avoids the singularity when (which occurs for nearly-affine homographies). LM refinement should parameterize as a unit-norm 9-vector with the scale fixed at initialization.
Transfer error metric. The algebraic error is minimized by DLT but does not correspond to a geometric distance. The symmetric transfer error in pixel units is the geometrically meaningful cost for nonlinear refinement. The reprojection error using the full camera model is preferred when is known.
Near-affine homographies. When the scene plane is far from the camera and the depth variation is small, is nearly affine: and . DLT is stable in this regime, but the normalized DLT can over-condition by removing perspective structure that is genuinely present at the few-pixel level. Using the Sampson error instead of the full reprojection error is more numerically stable near the affine regime.
RANSAC sample size. A minimal sample of 4 correspondences determines uniquely (generically). RANSAC draws 4-point minimal samples and fits DLT to each; the inlier threshold is typically 1–3 pixels in transfer error. The number of RANSAC iterations required for 99% success probability with 50% inliers is approximately , giving .
Where it appears
The homography is the geometric primitive underlying planar calibration, marker board detection, and image stitching. Every time a calibration target is a flat object, the mapping from its surface to the image is a homography.
- zhang-planar-calibration — estimates one homography per view of a planar calibration target; stacks the two linear constraints per view to solve for the intrinsic matrix ; decomposes each homography into the extrinsic rotation and translation for that view.
- apap-image-stitching — the global DLT homography between two images is the baseline warp; APAP replaces it with a spatially-varying field of per-pixel homographies estimated by Moving DLT, each weighted by proximity to matched feature points.
References
- R. Hartley, A. Zisserman. Multiple View Geometry in Computer Vision. 2nd ed. Cambridge University Press, 2004. §§4–5 cover projective transformations; §13 covers homography estimation and its role in calibration.
- R. Hartley. "In Defense of the Eight-Point Algorithm." IEEE TPAMI 19(6), 1997. Establishes the Hartley normalization procedure as essential for numerically stable DLT.
- Z. Zhang. "A Flexible New Technique for Camera Calibration." IEEE TPAMI 22(11), 2000. §§2–3 derive the two linear constraints per homography on the image of the absolute conic.
- O. Faugeras. Three-Dimensional Computer Vision. MIT Press, 1993. Chapter 3 covers the algebraic and geometric properties of the homography group.