Arithmetic / Range Coding

Lesson, slides, and applied problem sets.


Lesson


Why this module exists

Huffman assigns each symbol a whole number of bits. Arithmetic/range coding pushes closer to entropy by encoding the entire message as a single fractional interval, so a symbol's cost is not rounded up to a whole bit. It is the natural endpoint of the theory: probabilities become exact code lengths (in the limit).


1) Interval refinement

Arithmetic coding maintains a range [low, high] inside [0, 1). Each symbol narrows the range in proportion to its probability, and the final interval uniquely identifies the message.

We simulate this with fixed-precision integers:

  • Start with [0, 2^N - 1]
  • Update using cumulative frequencies
  • Emit leading bits as soon as they are determined (renormalization)
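One refinement step can be sketched as follows. This is a minimal illustration, not the assignment's API: the name `update_interval` and the 32-bit width are choices made for this sketch (Python's big integers hide the overflow care a C implementation would need for `span * cum_hi`).

```python
# One interval-refinement step on fixed-precision integers.
# `update_interval`, PRECISION = 32 are illustrative choices.

PRECISION = 32
TOP = (1 << PRECISION) - 1  # the interval starts as [0, 2^N - 1]

def update_interval(low, high, cum_lo, cum_hi, total):
    """Narrow [low, high] to the slice owned by a symbol whose cumulative
    frequency range is [cum_lo, cum_hi) out of `total` counts."""
    span = high - low + 1
    new_high = low + (span * cum_hi) // total - 1
    new_low = low + (span * cum_lo) // total
    return new_low, new_high

# A symbol with probability 1/2 (cumulative range [0, 1) of 2)
# keeps the lower half of the interval:
low, high = update_interval(0, TOP, 0, 1, 2)  # (0, 2**31 - 1)
```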

2) Underflow (the tricky part)

When the interval straddles the midpoint, the next output bit is not yet determined. We count these pending bits and flush them, inverted, as soon as the interval commits to one half.

This is the arithmetic coding equivalent of carry handling in range coders.
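The pending-bit scheme can be sketched as the classic encoder renormalization loop (in the style of the Witten–Neal–Cleary coder; the 32-bit width and the name `renormalize` are choices for this sketch):

```python
# Encoder renormalization with pending-bit tracking (illustrative names).

PRECISION = 32
TOP = (1 << PRECISION) - 1
HALF = 1 << (PRECISION - 1)
QUARTER = 1 << (PRECISION - 2)

def renormalize(low, high, pending, out):
    """Emit every bit that is already decided, then rescale the interval."""
    while True:
        if high < HALF:                          # lower half: next bit is 0
            out.append(0)
            out.extend([1] * pending)            # flush deferred opposite bits
            pending = 0
        elif low >= HALF:                        # upper half: next bit is 1
            out.append(1)
            out.extend([0] * pending)
            pending = 0
            low -= HALF
            high -= HALF
        elif low >= QUARTER and high < 3 * QUARTER:
            pending += 1                         # straddles midpoint: defer
            low -= QUARTER
            high -= QUARTER
        else:
            return low, high, pending
        low <<= 1                                # rescale: double the interval
        high = (high << 1) | 1

# Demo: an interval entirely in the upper half emits a single 1 bit,
# and the interval rescales back to full width [0, TOP].
out = []
low, high, pending = renormalize(HALF, TOP, 0, out)
```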


3) Static model

We will use a static frequency table supplied to both encoder and decoder. This isolates the core arithmetic coding logic without adaptive modeling complexity.
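A static model in this sense can be as small as a shared cumulative-frequency table; `build_model` below is a hypothetical helper, not part of any fixed API:

```python
# A static model as a cumulative-frequency table shared by both sides.
# `build_model` is a hypothetical helper for illustration.
from itertools import accumulate

def build_model(freqs):
    """freqs: dict mapping symbol -> count. Returns (symbols, cum, total),
    where symbols[i] owns the cumulative range [cum[i], cum[i+1])."""
    symbols = sorted(freqs)
    cum = [0] + list(accumulate(freqs[s] for s in symbols))
    return symbols, cum, cum[-1]

symbols, cum, total = build_model({"a": 3, "b": 1})
# "a" owns the range [0, 3) and "b" owns [3, 4), out of total = 4
```

Because the table is deterministic given the frequencies, encoder and decoder stay in lockstep without transmitting any per-symbol state.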


4) Why this matters

  • Expected code length approaches the entropy H to within a fraction of a bit per symbol.
  • Per-symbol cost tracks −log₂ p, instead of rounding up to whole bits as Huffman does.
  • Real compressors use range coding with adaptive models.
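A back-of-the-envelope example of the gap (illustrative numbers, not from the lesson): for a skewed binary source, Huffman must still spend a whole bit per symbol, while arithmetic coding can approach the entropy.

```python
# Huffman vs. entropy for a binary source with p = 0.9 (illustrative).
import math

p = 0.9
H = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))  # ≈ 0.469 bits/symbol
huffman_rate = 1.0  # two symbols: each codeword is at least one whole bit

# Arithmetic coding's rate can approach H, roughly a 53% saving here.
```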

What you will build

  • A static arithmetic coder (bit-precise)
  • Round-trip tests and adversarial cases

Key takeaways

  • Arithmetic coding is model-driven and nearly optimal.
  • Correct handling of renormalization and underflow is essential.
  • Small mistakes produce silent corruption.
