Goal
Encode a smoothed image patch around a previously detected keypoint as a -bit binary string by running a fixed table of pairwise pixel-intensity tests, then match between images by Hamming distance via bitwise XOR and popcount. Input: a greyscale image and a list of keypoint locations from any external detector. Output: a binary descriptor per keypoint, with . The algorithm is specific to the use of binary tests on a fixed integer offset table — there is no learned codebook, no orientation assignment, no scale-space construction; description and matching are integer-only after the smoothing stage.
Algorithm
Let denote the greyscale image. Let denote the Gaussian-smoothed patch on the keypoint-centred window of size . Let denote the patch side length in pixels. Let denote the standard deviation of the smoothing Gaussian and the discrete kernel window. Let denote the descriptor bit length. Let denote the -th integer pixel-pair offset within . Let denote a single pixel-pair test and the packed descriptor.
Single-bit comparison between two pixels of the smoothed patch. Equation 1 in the paper.
Packed concatenation of tests on a fixed offset table . Equation 2 in the paper.
Bit-disagreement count between two descriptors, evaluated as a bitwise XOR followed by a population count.
The smoothing parameters and produce stable recognition rates across the tested datasets; values of between 1 and 3 give similar results, while degrades sharply because individual pixel comparisons are noise-sensitive. The patch size pixels is the convention from the paper's experiments. The descriptor lengths are , named BRIEF-16, BRIEF-32, BRIEF-64 by storage in bytes.
Sampling distributions
The offset table is drawn once before any descriptors are computed and reused thereafter. Five distributions are evaluated; G II dominates the others empirically and is the recommended default.
- G I. i.i.d.
- G II. i.i.d., isotropic.
- G III. , (two-stage; second sample concentrated near the first).
- G IV. drawn at random from a coarse polar grid.
- G V. for all ; scans every position of a coarse polar grid of points.
- Smooth with a discrete Gaussian of width and scale (or replace by a box-filter approximation on an integral image).
- For each keypoint , extract the patch centred at from the smoothed image.
- For each , compute as a single integer pixel comparison.
- Pack the bits into 64-bit words to obtain .
- Match across images by computing with bitwise XOR plus a popcount instruction; nearest-neighbour search with a left-right consistency check is recommended for outlier removal.
Implementation
Per-keypoint test-and-pack in Rust on an already-smoothed greyscale image. The offset table is integer and shared across all calls; the descriptor is stored as 64-bit words so that matching reduces to word-wise XOR and count_ones().
const ND: usize = 256; // BRIEF-32: 256 bits = 4 u64 words
const WORDS: usize = ND / 64;
type Descriptor = [u64; WORDS];
// Fixed offset table, generated once by sampling from G II
// (i.i.d. Gaussian centred at the patch origin with sigma^2 = S^2/25).
struct Pair { x: (i32, i32), y: (i32, i32) }
type OffsetTable = [Pair; ND];
fn intensity(img: &[u8], w: usize, h: usize, kx: i32, ky: i32, dx: i32, dy: i32) -> u8 {
let x = (kx + dx).clamp(0, w as i32 - 1) as usize;
let y = (ky + dy).clamp(0, h as i32 - 1) as usize;
img[y * w + x]
}
fn brief(smoothed: &[u8], w: usize, h: usize, kx: i32, ky: i32, table: &OffsetTable) -> Descriptor {
let mut f: Descriptor = [0; WORDS];
for (i, pair) in table.iter().enumerate() {
let pa = intensity(smoothed, w, h, kx, ky, pair.x.0, pair.x.1);
let pb = intensity(smoothed, w, h, kx, ky, pair.y.0, pair.y.1);
if pa < pb {
f[i / 64] |= 1u64 << (i % 64);
}
}
f
}
fn hamming(a: &Descriptor, b: &Descriptor) -> u32 {
a.iter().zip(b.iter()).map(|(x, y)| (x ^ y).count_ones()).sum()
}
Remarks
- Per-keypoint description is integer pixel comparisons; matching is word-level XOR plus popcount. Smoothing dominates the total runtime — replacing the Gaussian with a box-filter approximation on an integral image is the practical speedup path.
- Storage is 16, 32, or 64 bytes per descriptor for , against 256 bytes for a 64-D float SURF descriptor.
- The descriptor is rotation-sensitive: recognition rate is approximately constant up to – in-plane rotation and drops steeply beyond, because no orientation normalisation is applied.
- Distribution G II (i.i.d. Gaussian, ) consistently outperforms the four alternatives across the paper's benchmark; the regular polar design G V is the worst.
- A constant offset table is required for any two descriptors to be comparable — resampling between query and database is a silent match failure.
- ORB extends BRIEF by steering the offset table to the keypoint orientation via a 30-bin angle lookup table ( resolution), then replaces the i.i.d. Gaussian sampling of the offsets with 256 tests selected by greedy uncorrelated-test search over candidate pairs on PASCAL 2006 keypoints. Naive steering alone collapses descriptor variance; the learned offset table is what restores it. ORB's patch and sub-window sizes also differ from BRIEF's patch — the two descriptors are not bit-for-bit comparable.
References
- M. Calonder, V. Lepetit, C. Strecha, P. Fua. BRIEF: Binary Robust Independent Elementary Features. Lecture Notes in Computer Science, 2010. link.springer.com
- E. Rosten, T. Drummond. Machine Learning for High-Speed Corner Detection. European Conference on Computer Vision, 2006.
- H. Bay, T. Tuytelaars, L. Van Gool. SURF: Speeded Up Robust Features. Lecture Notes in Computer Science, 2006.
- D. G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 2004.