Goal
Given correspondences between known 3D world points and observed image points in pixels, recover the intrinsic and extrinsic camera parameters of a perspective camera in which the image sensor is not perpendicular to the optic axis. The parameter set comprises the pixel coordinates of the centre of radial distortion (CoD), the pixel sizes, the Euler angles specifying the lens–sensor tilt rotation, the Euler angles specifying the world-to-lens rotation, the translation, and the camera constant. Tsai's radial alignment constraint (RAC) assumes zero tilt, that is, a sensor perpendicular to the optic axis; the generalized form (gRAC) relaxes that assumption and solves for the tilt as part of the calibration.
Algorithm
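Take a world point and write its representation in the lens coordinate system as a rigid transform: the world-to-lens rotation $S$, parametrised by Euler angles, followed by the translation $(t_x, t_y, t_z)$. Its observed image coordinates, in pixels, are converted to sensor-frame metric coordinates by subtracting the centre of radial distortion and scaling by the pixel sizes, so $(x_{nf}, y_{nf})$ is the non-frontal sensor-frame point.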
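As a minimal sketch of the pixel-to-metric conversion, assuming the usual form in which the centre of radial distortion is subtracted and the difference is scaled by the pixel size (the function and parameter names are illustrative, not taken from the source):

```rust
/// Convert a pixel observation (i, j) to sensor-frame metric coordinates.
/// Assumption: (i0, j0) is the centre of radial distortion in pixels and
/// (sx, sy) are the pixel sizes in metric units per pixel.
pub fn pixel_to_sensor(i: f64, j: f64, i0: f64, j0: f64, sx: f64, sy: f64) -> (f64, f64) {
    ((i - i0) * sx, (j - j0) * sy)
}
```

The returned pair plays the role of the `x_nf`, `y_nf` arguments used in the Implementation section.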
The tilt rotation takes the lens coordinate frame to the sensor coordinate frame. The lens is rotationally symmetric about its optic axis, so a rotation about that axis is redundant; only two Euler angles are identifiable.
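The frontal projection takes the non-frontal sensor point through the lens centre onto a hypothesized sensor plane normal to the optic axis, at a fixed distance from the lens centre. It is written in terms of the entries of the tilt rotation, indexed by row and column.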
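Consider two reference points on the optic axis: the centre of the frontal sensor, and the foot of the perpendicular dropped from the lens-frame world point onto the axis. The vector from the sensor centre to the frontal image point and the vector from that foot to the world point are both perpendicular to the same line (the optic axis) and lie in a common plane. Radial alignment requires them to be parallel.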
The gRAC is the cross-product form of this parallelism condition, expressed in lens-frame coordinates.
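The parallelism test can be sketched as a scalar residual. Because both vectors lie in a plane perpendicular to the optic axis, only the axial component of their cross product can be non-zero. The function name and argument layout below are assumptions made for illustration:

```rust
/// Axial component of the cross product of two vectors that both lie in a
/// plane perpendicular to the optic axis; zero means they are parallel.
/// Assumption: (xf, yf) is the frontal-sensor point relative to the sensor
/// centre, and (xl, yl) are the first two lens-frame coordinates of the
/// world point.
pub fn rac_residual(xf: f64, yf: f64, xl: f64, yl: f64) -> f64 {
    xf * yl - yf * xl
}
```

A residual of zero does not distinguish parallel from anti-parallel vectors, which is one reason sign checks are needed later in the procedure.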
Substituting the frontal projection (above) and the world-to-lens transform into the gRAC and eliminating the common scale reduces the per-correspondence equation to a linear form in seven unknown combinations of the calibration parameters.
Each world–image correspondence contributes one row to the linear system. The unknowns absorb the extrinsic rotation, the tilt, and the first two translation components $t_x$ and $t_y$.
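Read off the `grac_row` routine in the Implementation section, one row of the system has the shape below. The symbols $q_0, \dots, q_6$ (zero-indexed to match the code) and $(X_w, Y_w, Z_w)$ for the world point are notation assumed here, not taken from the paper:

```latex
x_{nf}\,\bigl(q_0 X_w + q_1 Y_w + q_2 Z_w + q_3\bigr)
  + y_{nf}\,\bigl(q_4 X_w + q_5 Y_w + q_6 Z_w\bigr) = y_{nf}
```

The left-hand side is linear in the seven unknowns, so stacking rows gives an ordinary linear least-squares problem.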
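The system has rank seven and each correspondence contributes a single row, so seven or more correspondences determine the unknown vector by linear least squares. Recovery of the rotation, translation components, and tilt from that vector proceeds algebraically, but with a four-way sign ambiguity: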
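$$|t_x| = \frac{1}{\sqrt{q_4^2 + q_5^2 + q_6^2}},$$
with the sign of $t_x$ fixed later (here $q$ is indexed from zero, following `decompose_q` in the Implementation section). Setting $m = q_0 q_4 + q_1 q_5 + q_2 q_6$ and $n = q_0^2 + q_1^2 + q_2^2$, the two tilt ratios are
$$l = t_x^2\, m, \qquad p = \sqrt{n\, t_x^2 - m^2\, t_x^4}.$$
The first row of the world-to-lens rotation $S$ is $\mathbf{s}_1 = -t_x\,(q_4, q_5, q_6)$. The second row and $t_y$ follow from
$$\mathbf{s}_2 = \frac{t_x}{p}\,\bigl[(q_0, q_1, q_2) - t_x^2\, m\,(q_4, q_5, q_6)\bigr], \qquad t_y = \frac{t_x}{p}\,\bigl(q_3 + t_x^2\, m\bigr).$$
The third row of $S$ is $\mathbf{s}_3 = \mathbf{s}_1 \times \mathbf{s}_2$. The tilt Euler angles $(\rho, \sigma)$ follow from $(l, p)$ via
$$\cos\rho = \sqrt{\frac{k - \sqrt{k^2 - 4p^2}}{2p^2}}, \qquad \sin\sigma = \sqrt{1 - p^2 \cos^2\rho}, \qquad k = l^2 + p^2 + 1.$$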
The individual signs of the two tilt angles remain free; only their relative sign is determined. Combined with the undetermined sign of $t_x$, this leaves four candidate parameter sets.
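The least-squares solve for the seven unknowns can be sketched without external crates as follows. This is an illustrative solver, not the authors' implementation: it forms the normal equations and eliminates with partial pivoting, and the singularity threshold is an arbitrary assumption.

```rust
/// Solve the stacked system A q = b in the least-squares sense via the
/// normal equations (A^T A) q = A^T b, using Gauss-Jordan elimination with
/// partial pivoting. Each input item is (row of A, entry of b).
/// Returns None if the normal matrix is numerically singular.
pub fn solve_least_squares(rows: &[([f64; 7], f64)]) -> Option<[f64; 7]> {
    // Augmented normal matrix [A^T A | A^T b], 7 x 8.
    let mut m = [[0.0f64; 8]; 7];
    for (a, b) in rows {
        for i in 0..7 {
            for j in 0..7 {
                m[i][j] += a[i] * a[j];
            }
            m[i][7] += a[i] * b;
        }
    }
    for col in 0..7 {
        // Partial pivoting: pick the row with the largest entry in this column.
        let mut piv = col;
        for r in (col + 1)..7 {
            if m[r][col].abs() > m[piv][col].abs() {
                piv = r;
            }
        }
        // Assumed absolute threshold for declaring the system singular.
        if m[piv][col].abs() < 1e-12 {
            return None;
        }
        m.swap(col, piv);
        // Eliminate this column from every other row.
        for r in 0..7 {
            if r != col {
                let f = m[r][col] / m[col][col];
                for c in col..8 {
                    m[r][c] -= f * m[col][c];
                }
            }
        }
    }
    let mut q = [0.0f64; 7];
    for i in 0..7 {
        q[i] = m[i][7] / m[i][i];
    }
    Some(q)
}
```

Normal equations square the condition number of the system, so a QR or SVD based solver is usually the safer choice when the correspondences are poorly distributed.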
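The remaining parameters, the camera constant and the third translation component $t_z$, are recovered under a no-distortion assumption. With the stage-1 quantities held fixed, each correspondence gives a linear equation in these two unknowns (Eq. 35).
Stacking this equation over the correspondences yields a least-squares estimate of both unknowns for each of the four candidates. Two candidates are rejected by a validity test; the remaining pair is disambiguated by which one admits a better fit to a symmetric radial-distortion model, which relates each observed frontal point to the ideal distortion-free projection of the world point onto the frontal sensor.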
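The distortion-based disambiguation can be sketched as below. The source does not give the model's exact form here, so the sketch assumes a one-coefficient polynomial model, observed ≈ (1 + k1·r²)·ideal, with both point sets expressed relative to the frontal sensor centre; `fit_radial_k1` and `pick_candidate` are hypothetical names.

```rust
/// Fit observed ≈ (1 + k1 * r^2) * ideal, with r = |ideal|, by linear least
/// squares in k1. Returns (k1, sum of squared residuals).
/// Assumption: a single-coefficient symmetric radial model.
pub fn fit_radial_k1(ideal: &[(f64, f64)], observed: &[(f64, f64)]) -> (f64, f64) {
    let (mut num, mut den) = (0.0f64, 0.0f64);
    for (&(xu, yu), &(xd, yd)) in ideal.iter().zip(observed.iter()) {
        let r2 = xu * xu + yu * yu;
        // Closed-form minimiser: k1 = sum(r^2 u.(d - u)) / sum(r^6).
        num += r2 * (xu * (xd - xu) + yu * (yd - yu));
        den += r2 * r2 * r2;
    }
    let k1 = if den > 0.0 { num / den } else { 0.0 };
    let mut sse = 0.0f64;
    for (&(xu, yu), &(xd, yd)) in ideal.iter().zip(observed.iter()) {
        let s = 1.0 + k1 * (xu * xu + yu * yu);
        sse += (xd - s * xu).powi(2) + (yd - s * yu).powi(2);
    }
    (k1, sse)
}

/// Index (0 or 1) of the candidate whose fit leaves the smaller residual.
pub fn pick_candidate(sse_a: f64, sse_b: f64) -> usize {
    if sse_a <= sse_b { 0 } else { 1 }
}
```

When the true distortion is close to zero, both candidates fit almost equally well and the residual gap that `pick_candidate` relies on becomes small.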
1. Convert pixel observations to sensor-frame metric points using the current centre of distortion and the pixel sizes.
2. Assemble the linear system, one row per correspondence, and solve for the seven unknowns by linear least squares.
3. From the solution vector compute the stage-1 quantities (as in `decompose_q`) and reconstruct the world-to-lens rotation, the translation components $t_x$ and $t_y$, and the tilt angles, up to the four-way sign ambiguity.
4. For each of the four candidates, stack Eq. 35 over the correspondences and solve for the two remaining parameters by linear least squares. Reject the candidates that fail the validity test.
5. Fit a symmetric radial-distortion model to the frontal-projected points for each remaining candidate and pick the one with the smaller residual.
6. Refine the centre of distortion by iterative image-plane sampling: for each sampled location, re-run steps 1–5 and evaluate the residual RAC error on the resulting frontal coordinates; move the sampling window around the minimiser until convergence.
7. Use the estimate from step 6 to initialise a nonlinear refinement that minimises reprojection error, including radial distortion.
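The centre-of-distortion refinement in the procedure above can be sketched as a grid search with a re-centred window. The closure `cost` is a hypothetical stand-in for "re-run the linear stages at this location and return the residual RAC error"; halving the window on each pass is an assumption, since the source only says the window moves around the minimiser until convergence.

```rust
/// Grid search over candidate centres of distortion.
/// `cost(i0, j0)` is a caller-supplied residual (hypothetical interface).
pub fn refine_cod<F: Fn(f64, f64) -> f64>(
    seed: (f64, f64),
    mut half_width: f64,
    grid: usize,
    iters: usize,
    cost: F,
) -> (f64, f64) {
    let mut best = seed;
    let mut best_cost = cost(seed.0, seed.1);
    for _ in 0..iters {
        let centre = best;
        for a in 0..grid {
            for b in 0..grid {
                // Lattice offsets in [-1, 1] spanning the current window.
                let u = if grid > 1 { a as f64 / (grid - 1) as f64 * 2.0 - 1.0 } else { 0.0 };
                let v = if grid > 1 { b as f64 / (grid - 1) as f64 * 2.0 - 1.0 } else { 0.0 };
                let cand = (centre.0 + u * half_width, centre.1 + v * half_width);
                let c = cost(cand.0, cand.1);
                if c < best_cost {
                    best_cost = c;
                    best = cand;
                }
            }
        }
        // Assumed schedule: re-centre on the minimiser and halve the window.
        half_width *= 0.5;
    }
    best
}
```

A call such as `refine_cod((0.0, 0.0), 8.0, 5, 20, |i, j| residual(i, j))`, with `residual` supplied by the caller, evaluates 25 locations per pass. The result is only as good as the seed: if the true centre lies outside the initial window, the search can settle on a local minimum.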
Implementation
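The code below gives the row assembly for the linear form and the stage-1 decomposition of the solution vector into the rotation, the translation components $t_x$ and $t_y$, and the tilt ratios, in Rust; a short helper converts the tilt ratios to the two tilt angles. The remaining steps (four-way disambiguation, Eq. 35 linear solve, CoD iteration) are small wrappers around these kernels.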
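use nalgebra::{Matrix3, Vector3};

/// One row of the linear system for a single correspondence: the seven
/// coefficients multiplying the unknowns, and the right-hand side.
/// (x_nf, y_nf) is the non-frontal sensor point; w is the world point.
fn grac_row(x_nf: f64, y_nf: f64, w: &Vector3<f64>) -> ([f64; 7], f64) {
    let (xw, yw, zw) = (w.x, w.y, w.z);
    ([x_nf*xw, x_nf*yw, x_nf*zw, x_nf,
      y_nf*xw, y_nf*yw, y_nf*zw], y_nf)
}

/// Stage-1 decomposition of the solution vector q into
/// (rotation S, t_x, t_y, tilt ratio l, tilt ratio p).
/// Signs are not resolved here; t_x is returned as a positive magnitude.
fn decompose_q(q: &[f64; 7]) -> (Matrix3<f64>, f64, f64, f64, f64) {
    // |t_x| from the norm of the last three unknowns.
    let tx = 1.0 / (q[4]*q[4] + q[5]*q[5] + q[6]*q[6]).sqrt();
    let m = q[0]*q[4] + q[1]*q[5] + q[2]*q[6];
    let n = q[0]*q[0] + q[1]*q[1] + q[2]*q[2];
    let tx2 = tx * tx;
    let l = tx2 * m;
    // Clamp at zero so rounding noise cannot produce a NaN.
    let p = (n * tx2 - m * m * tx2 * tx2).max(0.0).sqrt();
    // Rows of S. Note: p == 0 makes the divisions below non-finite.
    let s1 = Vector3::new(-q[4]*tx, -q[5]*tx, -q[6]*tx);
    let s2 = Vector3::new((q[0] - tx2*m*q[4]) * tx / p,
                          (q[1] - tx2*m*q[5]) * tx / p,
                          (q[2] - tx2*m*q[6]) * tx / p);
    // The third row completes the right-handed frame.
    let s3 = s1.cross(&s2);
    let s = Matrix3::from_rows(&[s1.transpose(), s2.transpose(), s3.transpose()]);
    let ty = (q[3] + tx2 * m) * tx / p;
    (s, tx, ty, l, p)
}

/// Tilt Euler angles (rho, sigma) from the tilt ratios (l, p).
/// Returns non-negative principal values; the signs are fixed later.
fn tilt_angles(l: f64, p: f64) -> (f64, f64) {
    let k = l*l + p*p + 1.0;
    let disc = (k*k - 4.0 * p*p).max(0.0).sqrt();
    let cos_rho = ((k - disc) / (2.0 * p*p)).sqrt();
    let rho = cos_rho.acos();
    let sigma = (1.0 - p*p * cos_rho*cos_rho).max(0.0).sqrt().asin();
    (rho, sigma)
}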
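For testing the decomposition, a forward model that builds the seven unknowns from known parameters is handy. The sketch below inverts the relations used in `decompose_q` algebraically; `compose_q` is a hypothetical helper, written with plain arrays so it needs no external crate, and it assumes the two supplied rotation rows are orthonormal.

```rust
/// Forward model: build the seven unknowns (zero-based layout of
/// `decompose_q`) from known parameters.
/// s1, s2: first two rows of the world-to-lens rotation (assumed orthonormal).
/// tx, ty: translation components; l, p: tilt ratios.
pub fn compose_q(s1: [f64; 3], s2: [f64; 3], tx: f64, ty: f64, l: f64, p: f64) -> [f64; 7] {
    let mut q = [0.0f64; 7];
    for i in 0..3 {
        q[i] = (p * s2[i] - l * s1[i]) / tx; // q[0], q[1], q[2]
        q[4 + i] = -s1[i] / tx;              // q[4], q[5], q[6]
    }
    q[3] = (p * ty - l * tx) / tx;
    q
}
```

Feeding the output to `decompose_q` should return the same rows, translation components, and tilt ratios when `tx` is positive, which gives a quick round-trip check before real data is involved.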
Remarks
- The linear system has rank seven and each correspondence contributes one row, so the minimum sample size is seven correspondences; in practice ten to twenty well-distributed points are used to stabilise the four-way sign disambiguation and the subsequent CoD search.
- The 2-DoF tilt parametrisation is identifiable only when at least two correspondences have distinct world depth; coplanar targets leave the camera constant and $t_z$ coupled in Eq. 35 and force the algorithm into multi-plane acquisition (the paper's 2.5D dataset moves the calibration board along its surface normal).
- The four-way ambiguity is inherent to the constraint, not an artifact of the solver: the projection from the non-frontal sensor to the frontal sensor is many-to-one in the tilt angles, and the gRAC fixes only the relative sign of the two tilt angles. Disambiguation via the radial-distortion-model fit degrades when the true distortion is small.
- The CoD search is an outer loop around the linear solver: each candidate CoD re-runs steps 1–5 and evaluates the residual RAC error on the frontal-projected points. Cost per iteration grows with the number of sampled CoD locations times the number of correspondences; convergence depends on the seed accuracy.
- Reduces to Tsai's RAC when the tilt is zero: the tilt-induced terms of the unknown vector collapse, leaving quantities that depend only on the world-to-lens rotation and $(t_x, t_y)$, and Eq. 35 becomes Tsai's linear system for the camera constant and $t_z$.
- Compared with Tsai 1987: see When to choose Tsai over Kumar gRAC on the Tsai page, which hosts the comparison per the older-paper-hosts rule.
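The zero-tilt reduction noted in the remarks can be made concrete using the relations implemented in `decompose_q`. With zero-based $q$ (an assumed notation) and tilt ratios $l = 0$, $p = 1$, which is the case where `tilt_angles` returns zero for both angles, the unknowns become:

```latex
(q_0, q_1, q_2) = \frac{\mathbf{s}_2}{t_x}, \qquad
q_3 = \frac{t_y}{t_x}, \qquad
(q_4, q_5, q_6) = -\frac{\mathbf{s}_1}{t_x}
```

These are rotation rows and one translation component, each divided by the other translation component, which is the same structure as the stage-1 unknowns in Tsai's method.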
References
- A. Kumar and N. Ahuja. Generalized Radial Alignment Constraint for Camera Calibration. ICPR, 2014.
- R. Y. Tsai. A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses. IEEE Journal on Robotics and Automation 3(4):323–344, 1987.
- J. Weng, P. Cohen, and M. Herniou. Camera calibration with distortion models and accuracy evaluation. IEEE Transactions on Pattern Analysis and Machine Intelligence 14(10):965–980, 1992.