Huffman Coding

  • Optimal prefix code for known frequencies
  • Greedy tree construction
  • Canonical codes for deterministic decoding
1 / 6

Huffman tree

  • Repeatedly merge two lowest weights
  • Greedy step is optimal
  • Build full binary tree
2 / 6

Tie-breaking

  • Equal weights -> multiple valid trees
  • Must define a deterministic rule
  • Use min-symbol id tie-break
3 / 6

Codes from tree

  • Left edge = 0, right edge = 1
  • Path to leaf is the code
  • Single symbol -> length 1
4 / 6

Canonical Huffman

  • Sort by (length, symbol id)
  • Assign increasing integers
  • Shift when length grows
5 / 6

Why canonical

  • Header is just lengths
  • Deterministic and compact
  • Fast decode tables
6 / 6
Use arrow keys or click edges to navigate. Press H to toggle help, F for fullscreen.