Compression Foundations & Entropy

  • Compression = modeling + coding
  • Entropy is the theoretical floor
  • A prefix-free code with given lengths exists iff Kraft's inequality holds
  • Expected code length connects the entropy bound to real codes

Information content

  • I(x) = -log2 p(x)
  • Rare events carry more bits
  • Certain events (p = 1) carry zero bits (sketch below)
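
A minimal sketch in Python (the function name information_content is my choice, not from the source):

    import math

    def information_content(p: float) -> float:
        """Bits of surprise for an event with probability p, 0 < p <= 1."""
        return -math.log2(p)

    print(information_content(0.5))   # 1.0 bit: a fair coin flip
    print(information_content(1.0))   # 0.0 bits: a certain event
    print(information_content(0.01))  # ~6.64 bits: a rare event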

Entropy

  • H = -sum p_i log2(p_i)
  • The minimum average bits per symbol any lossless code can achieve
  • The bound holds only if the model matches the true source (sketch below)
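
A minimal entropy function under the same conventions (zero-probability terms are skipped, since p log2 p -> 0 as p -> 0):

    import math

    def entropy(probs) -> float:
        """H = -sum p_i log2(p_i), in bits per symbol."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(entropy([0.5, 0.5]))  # 1.0 bit: fair coin
    print(entropy([0.9, 0.1]))  # ~0.469 bits: skewed coin
    print(entropy([1.0]))       # 0.0 bits: certainty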

Expected code length

  • L = sum p_i * l_i
  • Shannon: H <= L < H + 1 (for optimal prefix codes)
  • Redundancy = L - H (worked example below)
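
A worked example, using the dyadic distribution p = (1/2, 1/4, 1/4) with an optimal prefix code of lengths (1, 2, 2):

    import math

    def expected_length(probs, lengths) -> float:
        """L = sum p_i * l_i for codeword lengths l_i."""
        return sum(p * l for p, l in zip(probs, lengths))

    probs, lengths = [0.5, 0.25, 0.25], [1, 2, 2]
    H = -sum(p * math.log2(p) for p in probs)
    L = expected_length(probs, lengths)
    print(H, L, L - H)  # 1.5 1.5 0.0 -- dyadic probabilities give zero redundancy

When every probability is a power of 1/2, codeword lengths can match -log2(p_i) exactly and L sits on the entropy floor.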

Kraft's inequality

  • sum 2^{-l_i} <= 1
  • A prefix code with lengths l_i exists iff the inequality holds
  • Prefix-free implies unique, instantaneous decodability (checker below)
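
A minimal Kraft checker, previewing the deliverable on the last slide (the name kraft_ok is my choice):

    def kraft_ok(lengths, radix: int = 2) -> bool:
        """True iff some prefix-free code with these codeword lengths exists."""
        return sum(radix ** -l for l in lengths) <= 1

    print(kraft_ok([1, 2, 2]))  # True:  1/2 + 1/4 + 1/4 = 1.0 (a complete code)
    print(kraft_ok([1, 1, 2]))  # False: 1/2 + 1/2 + 1/4 > 1, no such prefix code

For long length lists, summing exact fractions (fractions.Fraction) avoids floating-point error near the <= 1 boundary.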

Modeling mismatch

  • Cross-entropy: H(p, q) = -sum p(x) log2 q(x)
  • Extra bits per symbol = H(p, q) - H(p), the KL divergence D(p || q)
  • Model quality dominates compression (demo below)
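
A small demo of the mismatch cost, assuming a true source p and a wrong model q:

    import math

    def cross_entropy(p, q) -> float:
        """H(p, q) = -sum p(x) log2 q(x): avg bits when coding source p with model q."""
        return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

    p = [0.8, 0.2]  # true source
    q = [0.5, 0.5]  # mismatched model
    H = -sum(pi * math.log2(pi) for pi in p if pi > 0)
    print(H, cross_entropy(p, q))  # ~0.722 vs 1.0: ~0.278 extra bits per symbol

The gap, H(p, q) - H(p), is exactly the KL divergence D(p || q).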

Greedy vs global

  • Local greedy choices can be suboptimal
  • Tokenization and coding costs interact
  • Watch for hidden cost tradeoffs (blocking example below)
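
A sketch of one such tradeoff: for a Bernoulli(0.9) source, the best symbol-by-symbol prefix code still spends 1 bit per symbol, but re-tokenizing into pairs beats it. The Huffman routine below is a standard min-heap construction, included only to compute optimal lengths:

    import heapq, itertools

    def huffman_lengths(probs):
        """Codeword lengths of an optimal (Huffman) prefix code."""
        counter = itertools.count(len(probs))  # tiebreaker for equal probabilities
        heap = [(p, i, [i]) for i, p in enumerate(probs)]
        heapq.heapify(heap)
        lengths = [0] * len(probs)
        while len(heap) > 1:
            p1, _, s1 = heapq.heappop(heap)
            p2, _, s2 = heapq.heappop(heap)
            for s in s1 + s2:                  # each merge adds 1 bit to its members
                lengths[s] += 1
            heapq.heappush(heap, (p1 + p2, next(counter), s1 + s2))
        return lengths

    pair_probs = [0.81, 0.09, 0.09, 0.01]      # pairs of Bernoulli(0.9) symbols
    L = sum(p * l for p, l in zip(pair_probs, huffman_lengths(pair_probs)))
    print(L / 2)  # ~0.645 bits/symbol, vs 1.0 when coding one symbol at a time

Neither choice is wrong locally; the win comes from changing the tokenization, which the per-symbol code never sees.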

What you will build

  • Entropy estimator (previewed below)
  • Kraft checker
  • Later: Huffman, LZ, BWT, range coding
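
As a preview of the first deliverable, an order-0 empirical entropy estimator over raw bytes (a minimal sketch; real data is rarely iid, so this is an estimate, not the true source entropy):

    import math
    from collections import Counter

    def empirical_entropy(data: bytes) -> float:
        """Order-0 entropy estimate in bits per byte, from byte frequencies."""
        counts = Counter(data)
        n = len(data)
        return -sum(c / n * math.log2(c / n) for c in counts.values())

    print(empirical_entropy(b"aaaaaaab"))            # ~0.544: heavily skewed
    print(empirical_entropy(bytes(range(256)) * 4))  # 8.0: uniform over all bytes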