Compression Foundations & Entropy

  • Compression = modeling + coding
  • Entropy is the theoretical floor
  • A prefix-free code with given lengths exists iff Kraft's inequality holds
  • Expected code length connects the entropy bound to real codes

Information content

  • I(x) = -log2 p(x)
  • Rare events carry more bits
  • Certain events (p = 1) carry zero bits (sketch below)
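
A minimal sketch in Python (the function name information_content is my choice, not from the source):

    import math

    def information_content(p: float) -> float:
        """Bits of surprise for an event with probability p, 0 < p <= 1."""
        return -math.log2(p)

    print(information_content(0.5))   # 1.0 bit: a fair coin flip
    print(information_content(1.0))   # 0.0 bits: a certain event
    print(information_content(0.01))  # ~6.64 bits: a rare event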

Entropy

  • H = -sum p_i log2(p_i)
  • The minimum average bits per symbol any lossless code can achieve
  • The bound holds only if the model matches the true source (sketch below)
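
A minimal entropy function under the same conventions (zero-probability terms are skipped, since p log2 p -> 0 as p -> 0):

    import math

    def entropy(probs) -> float:
        """H = -sum p_i log2(p_i), in bits per symbol."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(entropy([0.5, 0.5]))  # 1.0 bit: fair coin
    print(entropy([0.9, 0.1]))  # ~0.469 bits: skewed coin
    print(entropy([1.0]))       # 0.0 bits: certainty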

Expected code length

  • L = sum p_i * l_i
  • Shannon: H <= L < H + 1 (for optimal prefix codes)
  • Redundancy = L - H (worked example below)
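
A worked example, using the dyadic distribution p = (1/2, 1/4, 1/4) with an optimal prefix code of lengths (1, 2, 2):

    import math

    def expected_length(probs, lengths) -> float:
        """L = sum p_i * l_i for codeword lengths l_i."""
        return sum(p * l for p, l in zip(probs, lengths))

    probs, lengths = [0.5, 0.25, 0.25], [1, 2, 2]
    H = -sum(p * math.log2(p) for p in probs)
    L = expected_length(probs, lengths)
    print(H, L, L - H)  # 1.5 1.5 0.0 -- dyadic probabilities give zero redundancy

When every probability is a power of 1/2, codeword lengths can match -log2(p_i) exactly and L sits on the entropy floor.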

Kraft's inequality

  • sum 2^{-l_i} <= 1
  • A prefix code with lengths l_i exists iff the inequality holds
  • Prefix-free implies unique, instantaneous decodability (checker below)
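
A minimal Kraft checker, previewing the deliverable on the last slide (the name kraft_ok is my choice):

    def kraft_ok(lengths, radix: int = 2) -> bool:
        """True iff some prefix-free code with these codeword lengths exists."""
        return sum(radix ** -l for l in lengths) <= 1

    print(kraft_ok([1, 2, 2]))  # True:  1/2 + 1/4 + 1/4 = 1.0 (a complete code)
    print(kraft_ok([1, 1, 2]))  # False: 1/2 + 1/2 + 1/4 > 1, no such prefix code

For long length lists, summing exact fractions (fractions.Fraction) avoids floating-point error near the <= 1 boundary.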

Modeling mismatch

  • Cross-entropy: H(p, q) = -sum p(x) log2 q(x)
  • Extra bits per symbol = H(p, q) - H(p), the KL divergence D(p || q)
  • Model quality dominates compression (demo below)
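
A small demo of the mismatch cost, assuming a true source p and a wrong model q:

    import math

    def cross_entropy(p, q) -> float:
        """H(p, q) = -sum p(x) log2 q(x): avg bits when coding source p with model q."""
        return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

    p = [0.8, 0.2]  # true source
    q = [0.5, 0.5]  # mismatched model
    H = -sum(pi * math.log2(pi) for pi in p if pi > 0)
    print(H, cross_entropy(p, q))  # ~0.722 vs 1.0: ~0.278 extra bits per symbol

The gap, H(p, q) - H(p), is exactly the KL divergence D(p || q).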

Greedy vs global

  • Local greedy choices can be suboptimal
  • Tokenization and coding costs interact
  • Watch for hidden cost tradeoffs (blocking example below)
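
A sketch of one such tradeoff: for a Bernoulli(0.9) source, the best symbol-by-symbol prefix code still spends 1 bit per symbol, but re-tokenizing into pairs beats it. The Huffman routine below is a standard min-heap construction, included only to compute optimal lengths:

    import heapq, itertools

    def huffman_lengths(probs):
        """Codeword lengths of an optimal (Huffman) prefix code."""
        counter = itertools.count(len(probs))  # tiebreaker for equal probabilities
        heap = [(p, i, [i]) for i, p in enumerate(probs)]
        heapq.heapify(heap)
        lengths = [0] * len(probs)
        while len(heap) > 1:
            p1, _, s1 = heapq.heappop(heap)
            p2, _, s2 = heapq.heappop(heap)
            for s in s1 + s2:                  # each merge adds 1 bit to its members
                lengths[s] += 1
            heapq.heappush(heap, (p1 + p2, next(counter), s1 + s2))
        return lengths

    pair_probs = [0.81, 0.09, 0.09, 0.01]      # pairs of Bernoulli(0.9) symbols
    L = sum(p * l for p, l in zip(pair_probs, huffman_lengths(pair_probs)))
    print(L / 2)  # ~0.645 bits/symbol, vs 1.0 when coding one symbol at a time

Neither choice is wrong locally; the win comes from changing the tokenization, which the per-symbol code never sees.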

What you will build

  • Entropy estimator (previewed below)
  • Kraft checker
  • Later: Huffman, LZ, BWT, range coding
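
As a preview of the first deliverable, an order-0 empirical entropy estimator over raw bytes (a minimal sketch; real data is rarely iid, so this is an estimate, not the true source entropy):

    import math
    from collections import Counter

    def empirical_entropy(data: bytes) -> float:
        """Order-0 entropy estimate in bits per byte, from byte frequencies."""
        counts = Counter(data)
        n = len(data)
        return -sum(c / n * math.log2(c / n) for c in counts.values())

    print(empirical_entropy(b"aaaaaaab"))            # ~0.544: heavily skewed
    print(empirical_entropy(bytes(range(256)) * 4))  # 8.0: uniform over all bytes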