Epipolar Geometry

Definition

Epipolar geometry describes the geometric relationship between two images of the same scene taken from different camera centers $C$ and $C'$. For any point $P$ in the scene, its projections $\mathbf{x}$ and $\mathbf{x}'$ into the two images, together with the two camera centers, are coplanar. This plane is called the epipolar plane of $P$. The epipolar plane intersects image 1 along the epipolar line $\mathbf{l}$ corresponding to $\mathbf{x}'$, and image 2 along the epipolar line $\mathbf{l}'$ corresponding to $\mathbf{x}$.

The epipolar constraint states: given a point $\mathbf{x}$ in image 1, its corresponding point $\mathbf{x}'$ in image 2 must lie on the epipolar line $\mathbf{l}'$ in image 2. This reduces stereo correspondence search from 2-D to 1-D.

The epipoles $\mathbf{e}$ and $\mathbf{e}'$ are the projections of each camera center into the opposite image: $\mathbf{e}$ is the projection of $C'$ into image 1; $\mathbf{e}'$ is the projection of $C$ into image 2. All epipolar lines in image 1 pass through $\mathbf{e}$; all epipolar lines in image 2 pass through $\mathbf{e}'$.

Mathematical Description

Fundamental matrix

Definition
Fundamental matrix

The $3 \times 3$ rank-2 matrix $F$, unique up to scale, encoding the epipolar constraint between two uncalibrated cameras.

For any pair of corresponding points $\mathbf{x} \leftrightarrow \mathbf{x}'$ (in pixel coordinates), $F$ satisfies:

$$\mathbf{x}'^T F\,\mathbf{x} = 0.$$

The epipolar line in image 2 corresponding to $\mathbf{x}$ is $\mathbf{l}' = F\,\mathbf{x}$; the epipolar line in image 1 corresponding to $\mathbf{x}'$ is $\mathbf{l} = F^T\mathbf{x}'$.
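As a minimal sketch of how the epipolar lines are used, the snippet below synthesizes a fundamental matrix from a made-up calibrated setup (the intrinsics `K`, rotation `R`, translation `t`, and the helper names are illustrative assumptions, not part of the source) and checks that each projection lies on the epipolar line induced by the other.

```python
import numpy as np

def skew(v):
    """3x3 skew-symmetric matrix [v]_x, so that skew(v) @ a equals np.cross(v, a)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def point_line_distance(x, l):
    """Perpendicular distance (in pixels) from homogeneous point x to homogeneous line l."""
    return abs(l @ x) / (abs(x[2]) * np.hypot(l[0], l[1]))

# Hypothetical setup: both cameras share K; camera 2 maps X to R @ X + t.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
angle = np.deg2rad(5.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.1, 0.2])
K_inv = np.linalg.inv(K)
F = K_inv.T @ skew(t) @ R @ K_inv   # F = K^-T [t]_x R K^-1

X = np.array([0.3, -0.2, 5.0])      # scene point in camera-1 coordinates
x1 = K @ X                          # homogeneous pixel in image 1
x2 = K @ (R @ X + t)                # homogeneous pixel in image 2

l2 = F @ x1                         # epipolar line in image 2 for x1
l1 = F.T @ x2                       # epipolar line in image 1 for x2
dist2 = point_line_distance(x2, l2)
dist1 = point_line_distance(x1, l1)
```

With noisy matches, the same distance can serve as a residual: a candidate match in image 2 only needs to be searched for along `l2`, which is the 1-D search mentioned above.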

Properties of $F$:

  • Rank 2 (not full rank; $\det F = 0$).
  • 7 degrees of freedom (9 entries, $-1$ for scale, $-1$ for the rank constraint).
  • The epipoles are the null vectors of $F$: $\mathbf{e}$ is the right null vector and $\mathbf{e}'$ the left null vector, so $F\,\mathbf{e} = \mathbf{0}$ and $F^T\mathbf{e}' = \mathbf{0}$.
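The relations $F\,\mathbf{e} = \mathbf{0}$ and $F^T\mathbf{e}' = \mathbf{0}$ suggest reading the epipoles off the SVD of $F$. The sketch below does this for a synthetic $F$; the camera parameters and the function name `epipoles` are assumptions made for the example, and the code assumes both epipoles are finite.

```python
import numpy as np

def skew(v):
    """3x3 skew-symmetric matrix [v]_x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def epipoles(F):
    """Return (e, e_prime, singular_values); epipoles are scaled to last coordinate 1."""
    U, S, Vt = np.linalg.svd(F)
    e = Vt[-1]            # right null vector: F @ e = 0
    e_prime = U[:, -1]    # left null vector: F.T @ e_prime = 0
    return e / e[2], e_prime / e_prime[2], S

# Hypothetical two-camera setup, used only to build a consistent F.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
angle = np.deg2rad(5.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.1, 0.2])
K_inv = np.linalg.inv(K)
F = K_inv.T @ skew(t) @ R @ K_inv

e, e_prime, S = epipoles(F)

# Geometric definition: e is the image of C' = -R^T t in camera 1,
# e_prime is the image of C = 0 in camera 2, i.e. K @ t.
e_expected = K @ (-R.T @ t)
e_expected = e_expected / e_expected[2]
e_prime_expected = K @ t
e_prime_expected = e_prime_expected / e_prime_expected[2]
```

The smallest singular value in `S` is zero up to rounding, which is the rank-2 property in numerical form.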

Essential matrix

When the camera intrinsic matrices $K$ and $K'$ are known, the epipolar constraint can be expressed in calibrated (normalized) coordinates $\hat{\mathbf{x}} = K^{-1}\mathbf{x}$ and $\hat{\mathbf{x}}' = K'^{-1}\mathbf{x}'$:

$$\hat{\mathbf{x}}'^T E\,\hat{\mathbf{x}} = 0,$$

where $E$ is the essential matrix.

Definition
Essential matrix

The fundamental matrix in calibrated coordinates; encodes only the relative rotation $R$ and translation $t$ between the two cameras.

$$E = [t]_\times R,$$

where $[t]_\times$ is the skew-symmetric matrix of $t$.
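For reference, the skew-symmetric matrix can be written out explicitly. The component names $t_1, t_2, t_3$ are notation introduced here for illustration:

$$[t]_\times = \begin{bmatrix} 0 & -t_3 & t_2 \\ t_3 & 0 & -t_1 \\ -t_2 & t_1 & 0 \end{bmatrix}, \qquad [t]_\times\,\mathbf{a} = t \times \mathbf{a} \ \text{ for any } \mathbf{a} \in \mathbb{R}^3.$$

Since $t \times t = \mathbf{0}$, it follows that $E^T t = R^T [t]_\times^T\, t = -R^T (t \times t) = \mathbf{0}$, so $t$ spans the left null space of $E$.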

Relationship between $E$ and $F$:

$$E = K'^T F K.$$

Properties of $E$:

  • Rank 2, with two equal non-zero singular values (this is a necessary and sufficient condition for a $3 \times 3$ matrix to be an essential matrix).
  • 5 degrees of freedom: 3 for rotation $R \in SO(3)$, 2 for the direction of $t$ (translation is determined only up to scale from image correspondences alone).
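The singular-value property and the relation between $E$ and $F$ can be checked numerically. The following is a small sketch under assumed values: the intrinsics `K1` and `K2`, the pose, and the helper `is_essential` are all hypothetical.

```python
import numpy as np

def skew(v):
    """3x3 skew-symmetric matrix [v]_x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def is_essential(M, tol=1e-9):
    """True if M has singular values (s, s, 0) up to a relative tolerance."""
    s = np.linalg.svd(M, compute_uv=False)
    return bool(abs(s[0] - s[1]) <= tol * s[0] and s[2] <= tol * s[0])

# Hypothetical relative pose and intrinsics.
angle = np.deg2rad(5.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.1, 0.2])
K1 = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K2 = np.array([[750.0, 0.0, 310.0], [0.0, 760.0, 250.0], [0.0, 0.0, 1.0]])

E = skew(t) @ R                                   # E = [t]_x R
singular_values = np.linalg.svd(E, compute_uv=False)

F = np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)   # F = K'^-T E K^-1
E_back = K2.T @ F @ K1                            # E = K'^T F K
```

In this construction the two non-zero singular values of `E` both equal the norm of `t`, while `F` in pixel coordinates generally does not satisfy the equal-singular-value condition.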

Decomposing $E$ into $R$ and $t$

Given $E$, compute its SVD: $E = U\,\mathrm{diag}(\sigma, \sigma, 0)\,V^T$. The four candidate decompositions are:

$$(R, t) \in \{(UWV^T,\, u_3),\,(UWV^T,\,-u_3),\,(UW^TV^T,\,u_3),\,(UW^TV^T,\,-u_3)\},$$

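where $W = \begin{bmatrix}0&-1&0\\1&0&0\\0&0&1\end{bmatrix}$ and $u_3$ is the third column of $U$. The signs of $U$ and $V$ are chosen so that $\det U = \det V = +1$, which guarantees that each candidate $R$ is a proper rotation. A single triangulated 3-D point with positive depth in both cameras disambiguates the four solutions.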
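The decomposition and the positive-depth test can be sketched as follows. The function names, the synthetic pose, and the ray-intersection depth solve are assumptions chosen for brevity; a production pipeline would typically triangulate several points rather than one.

```python
import numpy as np

def skew(v):
    """3x3 skew-symmetric matrix [v]_x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def decompose_essential(E):
    """Return the four (R, t) candidates; t is a unit vector."""
    U, _, Vt = np.linalg.svd(E)
    # E is defined only up to sign, so U and V can be flipped to proper rotations.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    R_a, R_b = U @ W @ Vt, U @ W.T @ Vt
    u3 = U[:, 2]
    return [(R_a, u3), (R_a, -u3), (R_b, u3), (R_b, -u3)]

def ray_depths(R, t, x1n, x2n):
    """Solve d1 * (R @ x1n) + t = d2 * x2n for the depths (d1, d2) along both rays."""
    A = np.column_stack((R @ x1n, -x2n))
    d, *_ = np.linalg.lstsq(A, -t, rcond=None)
    return d

# Hypothetical ground-truth pose and one scene point in front of both cameras.
angle = np.deg2rad(5.0)
R_true = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(angle), 0.0, np.cos(angle)]])
t_true = np.array([1.0, 0.1, 0.2])
E = skew(t_true) @ R_true

X = np.array([0.3, -0.2, 5.0])
x1n = X / X[2]                       # calibrated coordinates in image 1
X2 = R_true @ X + t_true
x2n = X2 / X2[2]                     # calibrated coordinates in image 2

# Keep only candidates that place the point in front of both cameras.
valid = [(R, t) for R, t in decompose_essential(E)
         if np.all(ray_depths(R, t, x1n, x2n) > 0)]
R_est, t_est = valid[0]
```

The recovered translation is a unit vector, reflecting that the metric scale of $t$ is not observable from correspondences alone.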

Estimation

8-point algorithm. Each correspondence $\mathbf{x}_i \leftrightarrow \mathbf{x}'_i$ yields one linear equation in the 9 entries of $F$ (or $E$). Stacking $n \geq 8$ equations gives the $n \times 9$ design matrix $A$; the solution is the right singular vector of $A$ associated with its smallest singular value. The rank-2 constraint is then enforced by zeroing the smallest singular value of the initial $3 \times 3$ estimate.
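A minimal sketch of the linear step follows. It assumes noise-free synthetic correspondences in calibrated coordinates (so the result is an essential matrix) and omits normalization; the function name `eight_point` and all test data are illustrative.

```python
import numpy as np

def skew(v):
    """3x3 skew-symmetric matrix [v]_x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def eight_point(x1, x2):
    """Linear estimate from (n, 2) point arrays, n >= 8; returns a rank-2 matrix of unit Frobenius norm."""
    u, v = x1[:, 0], x1[:, 1]
    up, vp = x2[:, 0], x2[:, 1]
    # One row per correspondence: x2^T F x1 = 0 with F flattened row by row.
    A = np.column_stack((up * u, up * v, up, vp * u, vp * v, vp,
                         u, v, np.ones(len(u))))
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)          # singular vector of the smallest singular value
    U, S, Vt2 = np.linalg.svd(F)
    S[2] = 0.0                        # enforce rank 2
    F = U @ np.diag(S) @ Vt2
    return F / np.linalg.norm(F)

# Hypothetical pose and 12 random scene points in front of both cameras.
angle = np.deg2rad(5.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.1, 0.2])
rng = np.random.default_rng(0)
X = np.column_stack((rng.uniform(-2, 2, 12), rng.uniform(-2, 2, 12),
                     rng.uniform(3, 8, 12)))
x1 = X[:, :2] / X[:, 2:]              # calibrated image-1 coordinates
X2 = X @ R.T + t
x2 = X2[:, :2] / X2[:, 2:]            # calibrated image-2 coordinates

E_est = eight_point(x1, x2)
x1h = np.column_stack((x1, np.ones(len(x1))))
x2h = np.column_stack((x2, np.ones(len(x2))))
residuals = np.abs(np.sum(x2h * (x1h @ E_est.T), axis=1))   # |x2^T E x1| per match
E_true = skew(t) @ R
E_true = E_true / np.linalg.norm(E_true)
```

The estimate agrees with the true matrix only up to sign, since the singular vector is itself defined up to sign.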

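Normalized 8-point algorithm. Hartley normalization (translate each point set to zero mean, then scale it so that the RMS distance from the origin is $\sqrt{2}$) is applied to both point sets before forming $A$. With normalizing transforms $T$ and $T'$, the estimate $\hat{F}$ computed from the normalized points is mapped back to pixel coordinates as $F = T'^T \hat{F}\, T$. This conditions the design matrix and produces numerically accurate results even from hundreds of noisy correspondences.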
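The normalization and the mapping back to pixel coordinates can be sketched as below. The function names and the random test points are assumptions; `F_hat` here is an arbitrary stand-in matrix, used only to show that the denormalized matrix gives the same algebraic residuals on the original pixels.

```python
import numpy as np

def hartley_transform(pts):
    """3x3 similarity T: moves the centroid of pts (n, 2) to the origin and
    scales so that the RMS distance from the origin is sqrt(2)."""
    centroid = pts.mean(axis=0)
    rms = np.sqrt(np.mean(np.sum((pts - centroid) ** 2, axis=1)))
    s = np.sqrt(2.0) / rms
    return np.array([[s, 0.0, -s * centroid[0]],
                     [0.0, s, -s * centroid[1]],
                     [0.0, 0.0, 1.0]])

def to_homogeneous(pts):
    """Append a column of ones to (n, 2) points."""
    return np.column_stack((pts, np.ones(len(pts))))

def denormalize(F_hat, T1, T2):
    """Map an estimate from normalized coordinates back to pixels: F = T2^T F_hat T1."""
    return T2.T @ F_hat @ T1

rng = np.random.default_rng(0)
pts1 = rng.uniform(0, 640, (20, 2))       # hypothetical pixel coordinates, image 1
pts2 = rng.uniform(0, 640, (20, 2))       # hypothetical pixel coordinates, image 2
T1, T2 = hartley_transform(pts1), hartley_transform(pts2)

n1 = to_homogeneous(pts1) @ T1.T          # normalized homogeneous points, image 1
n2 = to_homogeneous(pts2) @ T2.T          # normalized homogeneous points, image 2
rms1 = np.sqrt(np.mean(np.sum(n1[:, :2] ** 2, axis=1)))

F_hat = rng.normal(size=(3, 3))           # stand-in for an estimate in normalized coordinates
F = denormalize(F_hat, T1, T2)
res_normalized = np.sum(n2 * (n1 @ F_hat.T), axis=1)
res_pixels = np.sum(to_homogeneous(pts2) * (to_homogeneous(pts1) @ F.T), axis=1)
```

In a full estimator, the rank-2 enforcement would be applied to the normalized estimate before calling `denormalize`.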

5-point algorithm. For the essential matrix specifically, 5 correspondences suffice (since $E$ has 5 degrees of freedom). The 5-point solver of Nistér (2004) finds all $E$ consistent with 5 correspondences by solving a polynomial system with up to 10 real solutions. The candidates are scored against additional correspondences, and a depth-positivity test selects the physically valid pose. It is the minimal solver used inside RANSAC for calibrated stereo and structure-from-motion.

Numerical Concerns

Rank-2 enforcement. DLT minimizes an algebraic error and does not constrain $F$ to rank 2. The rank-2 projection is applied after DLT by setting the smallest singular value of the $3 \times 3$ DLT solution to zero. This projection is optimal in the Frobenius norm but not in the geometric (Sampson) error; iterative refinement of the rank-constrained $F$ using the geometric error is preferred for accuracy.
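For concreteness, the Sampson error mentioned above is a first-order approximation of the squared geometric error. A possible vectorized form is sketched here; the function names and the synthetic data are assumptions made for the example.

```python
import numpy as np

def skew(v):
    """3x3 skew-symmetric matrix [v]_x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def sampson_error(F, x1, x2):
    """Sampson error per correspondence; x1, x2 are (n, 3) homogeneous points with last coordinate 1."""
    Fx1 = x1 @ F.T                     # row i is F @ x1[i]
    Ftx2 = x2 @ F                      # row i is F.T @ x2[i]
    num = np.sum(x2 * Fx1, axis=1) ** 2
    den = Fx1[:, 0] ** 2 + Fx1[:, 1] ** 2 + Ftx2[:, 0] ** 2 + Ftx2[:, 1] ** 2
    return num / den

def project(K, X):
    """Project (n, 3) camera-frame points to homogeneous pixels with last coordinate 1."""
    x = X @ K.T
    return x / x[:, 2:]

# Hypothetical cameras and scene points.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
angle = np.deg2rad(5.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.1, 0.2])
K_inv = np.linalg.inv(K)
F = K_inv.T @ skew(t) @ R @ K_inv

rng = np.random.default_rng(0)
X = np.column_stack((rng.uniform(-2, 2, 6), rng.uniform(-2, 2, 6),
                     rng.uniform(3, 8, 6)))
x1h = project(K, X)
x2h = project(K, X @ R.T + t)

err_exact = sampson_error(F, x1h, x2h)                # exact matches
x2_noisy = x2h + np.array([0.0, 1.0, 0.0])            # shift image-2 points by one pixel in v
err_noisy = sampson_error(F, x1h, x2_noisy)
```

Because the denominator includes gradient terms from both images, the Sampson error never exceeds the squared distance from the point to its epipolar line in image 2.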

Hartley normalization. Without normalization the design matrix $A$ for the 8-point algorithm has a large condition number (proportional to the image coordinate range squared), producing solutions that are sensitive to noise. Normalization is not optional for the fundamental matrix; it is required for the algorithm to work reliably.
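One way to see the conditioning effect is to compare the spread of singular values of $A$ before and after normalization. The sketch below does so on synthetic noise-free pixel data; the camera values and helper names are assumptions, and the measure used (largest over 8th singular value) is one reasonable choice rather than a standard definition.

```python
import numpy as np

def design_matrix(x1, x2):
    """n x 9 design matrix of the 8-point algorithm from (n, 2) point arrays."""
    u, v = x1[:, 0], x1[:, 1]
    up, vp = x2[:, 0], x2[:, 1]
    return np.column_stack((up * u, up * v, up, vp * u, vp * v, vp,
                            u, v, np.ones(len(u))))

def hartley_normalize(pts):
    """Center (n, 2) points and scale them to an RMS distance of sqrt(2) from the origin."""
    centered = pts - pts.mean(axis=0)
    rms = np.sqrt(np.mean(np.sum(centered ** 2, axis=1)))
    return centered * (np.sqrt(2.0) / rms)

def condition_number(A):
    """Largest over 8th singular value; the 9th corresponds to the solution itself."""
    s = np.linalg.svd(A, compute_uv=False)
    return s[0] / s[7]

def project(K, X):
    """Project (n, 3) camera-frame points to (n, 2) pixel coordinates."""
    x = X @ K.T
    return x[:, :2] / x[:, 2:]

# Hypothetical cameras and 50 random scene points.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
angle = np.deg2rad(5.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.1, 0.2])
rng = np.random.default_rng(0)
X = np.column_stack((rng.uniform(-2, 2, 50), rng.uniform(-2, 2, 50),
                     rng.uniform(3, 10, 50)))
x1 = project(K, X)
x2 = project(K, X @ R.T + t)

cond_raw = condition_number(design_matrix(x1, x2))
cond_norm = condition_number(design_matrix(hartley_normalize(x1),
                                           hartley_normalize(x2)))
```

Printing `cond_raw` and `cond_norm` side by side gives a quick diagnostic of how much the normalization helps on a given data set.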

Gauge ambiguity. The fundamental matrix is defined only up to scale. Setting $\|F\|_F = 1$ is standard. For the essential matrix, the two equal singular values are conventionally set to 1 after enforcing the rank constraint; this fixes the scale of $E$ relative to the image coordinate system but not the metric scale of $t$.
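Both conventions are short to express in code. The helper names and the perturbed input below are hypothetical and serve only to illustrate the two normalizations.

```python
import numpy as np

def skew(v):
    """3x3 skew-symmetric matrix [v]_x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def unit_frobenius(F):
    """Rescale F so that its Frobenius norm is 1."""
    return F / np.linalg.norm(F)

def project_to_essential(M):
    """Replace the singular values of M by (1, 1, 0), keeping its singular vectors."""
    U, _, Vt = np.linalg.svd(M)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt

# Hypothetical noisy estimate: an exact essential matrix plus a small perturbation.
angle = np.deg2rad(5.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.1, 0.2])
rng = np.random.default_rng(0)
E_noisy = skew(t) @ R + 1e-3 * rng.normal(size=(3, 3))

F_unit = unit_frobenius(E_noisy)
E_proj = project_to_essential(E_noisy)
s_proj = np.linalg.svd(E_proj, compute_uv=False)
```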

Degenerate configurations. The fundamental matrix cannot be determined uniquely from correspondences when all scene points are coplanar: the correspondences are then fully explained by a planar homography, and a multi-parameter family of matrices satisfies the epipolar constraint. This case shows up as a rank-deficient design matrix $A$, with more than one near-zero singular value. Structure-from-motion pipelines use a homography-vs-fundamental-matrix test (comparing inlier counts under both models) to detect planar scenes and select the appropriate initialization.
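The rank deficiency can be observed directly on synthetic data. In the sketch below (all names and values are assumptions, and the data are noise-free calibrated coordinates), the null space of $A$ is counted for a general scene and for a planar one.

```python
import numpy as np

def design_matrix(x1, x2):
    """n x 9 design matrix of the 8-point algorithm from (n, 2) point arrays."""
    u, v = x1[:, 0], x1[:, 1]
    up, vp = x2[:, 0], x2[:, 1]
    return np.column_stack((up * u, up * v, up, vp * u, vp * v, vp,
                            u, v, np.ones(len(u))))

def null_space_dim(A, rel_tol=1e-10):
    """Number of singular values of A (n >= 9 rows) below rel_tol times the largest."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s < rel_tol * s[0]))

def two_views(X, R, t):
    """Calibrated (n, 2) projections of camera-1 points X into both views."""
    X2 = X @ R.T + t
    return X[:, :2] / X[:, 2:], X2[:, :2] / X2[:, 2:]

# Hypothetical relative pose.
angle = np.deg2rad(5.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.1, 0.2])

rng = np.random.default_rng(0)
xy = rng.uniform(-2, 2, (20, 2))
X_general = np.column_stack((xy, rng.uniform(3, 8, 20)))            # points at varied depths
X_planar = np.column_stack((xy, 5.0 + 0.3 * xy[:, 0] - 0.2 * xy[:, 1]))  # points on one plane

dim_general = null_space_dim(design_matrix(*two_views(X_general, R, t)))
dim_planar = null_space_dim(design_matrix(*two_views(X_planar, R, t)))
```

With real, noisy data the extra singular values are small rather than exactly zero, which is why inlier-count comparisons between the two models are used in practice.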

Epipole near or in the image. When the second camera sees the first camera center (as in forward motion), $\mathbf{e}'$ lies inside or near the image. Algorithms that parameterize $F$ via the epipole can become poorly conditioned. The Sampson correction to the algebraic error is well-behaved even in this case.

Translation-only degeneracy. Pure translation with no rotation gives $E = [t]_\times$, which is skew-symmetric and has $t$ as its null vector (unique up to scale). The 5-point solver does not degenerate in this case, but the 8-point solver can yield inaccurate rotation estimates because the off-diagonal terms are dominated by the skew-symmetric part.

Where it appears

Epipolar geometry is the foundational constraint for any algorithm that uses two or more camera views. It reduces stereo matching from a 2-D search to a 1-D search along epipolar lines, and it is the core relation recovered in the first step of structure-from-motion.

No algorithm pages on this site currently cover stereo reconstruction, triangulation, or multi-view structure-from-motion — the domains where epipolar geometry is used directly. The concept is documented here because it is a prerequisite for those topics and because the homography (documented separately) is the degenerate case of epipolar geometry for planar scenes.

References

  1. R. Hartley, A. Zisserman. Multiple View Geometry in Computer Vision. 2nd ed. Cambridge University Press, 2004. Chapters 9–11 give the definitive treatment of the fundamental matrix, essential matrix, and estimation algorithms.
  2. H. C. Longuet-Higgins. "A Computer Algorithm for Reconstructing a Scene from Two Projections." Nature 293, 1981. Introduces the essential matrix and the 8-point algorithm for calibrated cameras.
  3. R. Hartley. "In Defense of the Eight-Point Algorithm." IEEE TPAMI 19(6), 1997. Establishes data normalization as the key fix that makes the 8-point algorithm reliable.
  4. D. Nistér. "An Efficient Solution to the Five-Point Relative Pose Problem." IEEE TPAMI 26(6), 2004. The 5-point minimal solver; standard in RANSAC-based pose estimation.
  5. R. Szeliski. Computer Vision: Algorithms and Applications. 2nd ed. Springer, 2022. §11.1–11.3 cover stereo and two-view geometry with implementation notes.