Compression Foundations & Entropy
- Compression = modeling + coding
- Entropy is the theoretical floor
- A prefix-free code with given lengths exists iff Kraft's inequality holds
- Expected length connects theory to code
Information content
- I(x) = -log2 p(x)
- Rare events carry more bits
- Certain events carry 0 bits
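A minimal Python sketch of self-information; the function name and the example probabilities are illustrative, not from the slides.

```python
import math

def self_information(p: float) -> float:
    """Self-information in bits: I(x) = -log2 p(x)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return math.log2(1.0 / p)  # equivalent to -log2(p), avoids -0.0 at p = 1

print(self_information(1 / 1024))  # 10.0 bits: a rare event carries many bits
print(self_information(1.0))       # 0.0 bits: a certain event carries none
```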
Entropy
- H = -sum p_i log2(p_i)
- Lower bound on average bits/symbol for any uniquely decodable code
- A valid floor only if the model matches the source
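One way to sketch the entropy estimator promised at the end of the deck, assuming byte-level symbols and an i.i.d. model; the function name is a placeholder.

```python
import math
from collections import Counter

def entropy_bits(data: bytes) -> float:
    """Empirical entropy H = -sum p_i log2 p_i, in bits per symbol."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

print(entropy_bits(b"abcd" * 100))  # 2.0: uniform over 4 symbols needs 2 bits/symbol
```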
Expected code length
- L = sum p_i * l_i
- Shannon: H <= L < H + 1 (for optimal prefix codes)
- Redundancy = L - H
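A short sketch of expected length and redundancy; the probabilities and code lengths below are an invented dyadic example, chosen so the redundancy comes out to zero.

```python
import math

def expected_length(probs, lengths):
    """L = sum p_i * l_i, in bits per symbol."""
    return sum(p * l for p, l in zip(probs, lengths))

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

p = [0.5, 0.25, 0.125, 0.125]  # dyadic source
l = [1, 2, 3, 3]               # lengths of a prefix code such as {0, 10, 110, 111}
print(expected_length(p, l), entropy(p))  # 1.75 1.75 -> redundancy L - H = 0
```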
Kraft's inequality
- sum 2^{-l_i} <= 1
- Determines which length sets are possible
- Prefix-free codes are instantaneously (hence uniquely) decodable
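A sketch of the Kraft checker the deck promises, using exact rational arithmetic to avoid floating-point edge cases; the function names are placeholders.

```python
from fractions import Fraction

def kraft_sum(lengths):
    """Sum of 2^{-l_i} over the proposed codeword lengths, computed exactly."""
    return sum(Fraction(1, 2 ** l) for l in lengths)

def prefix_code_exists(lengths):
    """A binary prefix-free code with these lengths exists iff the Kraft sum <= 1."""
    return kraft_sum(lengths) <= 1

print(prefix_code_exists([1, 2, 3, 3]))  # True  (sum = 1, a complete code)
print(prefix_code_exists([1, 1, 2]))     # False (sum = 5/4 > 1)
```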
Modeling mismatch
- Cross-entropy: H(p, q) = -sum p log2 q
- Wrong model -> extra bits: H(p, q) - H(p) = D_KL(p || q)
- Model quality dominates compression
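A small sketch of the mismatch cost; the true distribution p and the model q below are made up for illustration.

```python
import math

def cross_entropy_bits(p, q):
    """H(p, q) = -sum p(x) log2 q(x): bits paid coding a p-source with a q-model."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.125, 0.125]  # true source
q = [0.25, 0.25, 0.25, 0.25]   # mismatched uniform model
H, Hpq = cross_entropy_bits(p, p), cross_entropy_bits(p, q)
print(H, Hpq, Hpq - H)  # 1.75 2.0 0.25 -- the gap is D_KL(p || q), the extra bits/symbol
```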
Greedy vs global
- Local greedy choices can be suboptimal
- Tokenization and coding costs interact
- Watch for hidden cost tradeoffs
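A toy illustration of the greedy-vs-global point: a hypothetical token dictionary with per-token bit costs, where longest-match greedy segmentation loses to a dynamic-programming optimum. The dictionary, costs, and input string are invented for this example.

```python
COST = {"a": 3, "b": 3, "ab": 2, "aba": 2}  # hypothetical bits per token

def greedy_bits(s):
    """Longest-match greedy segmentation."""
    total, i = 0, 0
    while i < len(s):
        for j in range(len(s), i, -1):  # try the longest token first
            if s[i:j] in COST:
                total += COST[s[i:j]]
                i = j
                break
        else:
            raise ValueError("no token matches at position %d" % i)
    return total

def optimal_bits(s):
    """Globally optimal segmentation via dynamic programming."""
    best = [0] + [float("inf")] * len(s)
    for j in range(1, len(s) + 1):
        for i in range(j):
            if s[i:j] in COST:
                best[j] = min(best[j], best[i] + COST[s[i:j]])
    return best[-1]

print(greedy_bits("abab"), optimal_bits("abab"))  # 5 4: greedy takes "aba"+"b", optimum takes "ab"+"ab"
```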
What you will build
- Entropy estimator
- Kraft checker
- Later: Huffman, LZ, BWT, range coding