Bit I/O & Universal Codes
Lesson, slides, and applied problem sets.
View SlidesLesson
Bit I/O & Universal Codes
Why this module exists
Most compression algorithms are bit-precise, not byte-precise. You need reliable bit I/O, clear bit ordering, and consistent padding rules, or everything breaks later. Universal codes (Elias, Golomb/Rice) show how to encode integers without a pre-shared model and reveal the real cost of lengths.
1) Bit ordering and alignment
We will use MSB-first bit order:
- The first bit written goes to the highest bit of the first byte.
- WriteBits(value, n) writes the n lowest bits of value, from MSB to LSB.
Padding:
- If the stream ends mid-byte, pad the remaining low bits with zeros.
- Decoders must tolerate trailing zero padding.
This is an engineering contract. Break it and every decoder fails.
2) Why bit I/O is fragile
Bit I/O bugs are subtle:
- Off-by-one on partial bytes
- Reversed bit order
- Accidental sign/shift bugs
- Mismatched padding on decode
The only defense is round-trip tests and carefully specified rules.
3) Universal integer codes
Universal codes encode positive integers without a full probability model.
Elias gamma
- Write
kzeros wherek = floor(log2 n). - Then write the binary representation of
n(lengthk+1).
Example (n=13 = 1101):
zeros: 000
bits: 1101
code: 0001101
Elias delta
- First encode
L = floor(log2 n) + 1using gamma. - Then append the lower
L-1bits ofn.
Delta is more efficient for large numbers.
4) What these codes teach
- Code length grows like
log2 n. - They are prefix-free.
- They are not optimal when you know a distribution.
Universal codes are the bridge between raw integers and structured models.
What you will build
- Bitstream I/O with MSB-first semantics.
- Elias gamma/delta encoders and decoders.
Key takeaways
- Bit order is a contract, not an implementation detail.
- Universal codes are simple but not optimal.
- Tests must include partial-byte boundaries.