Go Binary Protocols

Lesson, slides, and applied problem sets.

Lesson

Go Binary Protocols: Framing, Varints, Zero‑Copy Parsing

This topic is about building fast, allocation‑free binary protocols in Go: how you frame messages, parse them incrementally, and keep memory use low while staying correct.

1) Framing is everything

Binary protocols need an unambiguous way to split a byte stream into messages:

  • Length‑prefix (varint or fixed width)
  • Delimiter (e.g., \n)
  • Out‑of‑band (size known via transport)

Length‑prefixing is common because it works over streaming transports and lets a reader skip a message without parsing its body.
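A minimal sketch of length‑prefix framing, assuming a fixed 4‑byte big‑endian length in front of each payload. The helper names appendFrame and countFrames are hypothetical, chosen for this illustration; countFrames shows the skipping property by jumping over each payload without reading it.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// appendFrame appends a 4-byte big-endian length followed by payload to dst.
// (Hypothetical helper; the 4-byte width is an assumption of this sketch.)
func appendFrame(dst, payload []byte) []byte {
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], uint32(len(payload)))
	dst = append(dst, hdr[:]...)
	return append(dst, payload...)
}

// countFrames walks a buffer and counts the complete frames in it,
// skipping each payload by its declared length.
func countFrames(buf []byte) int {
	count := 0
	for len(buf) >= 4 {
		size := binary.BigEndian.Uint32(buf)
		if uint64(len(buf)-4) < uint64(size) {
			break // trailing partial frame: stop, do not count it
		}
		buf = buf[4+int(size):]
		count++
	}
	return count
}

func main() {
	var stream []byte
	stream = appendFrame(stream, []byte("hello"))
	stream = appendFrame(stream, []byte("binary world"))
	fmt.Println("bytes:", len(stream), "frames:", countFrames(stream))
}
```

A fixed‑width prefix keeps the header size constant at the cost of a few wasted bytes on small messages.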

2) Varints: compact length encoding

Varints store small numbers in fewer bytes. Typical encoding:

  • 7 data bits per byte
  • top bit (0x80) means “more bytes follow”

Benefits:

  • small messages are cheap to encode/decode
  • good for workloads where most values are small but a few are large

Risks:

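  • malformed sequences (more continuation bytes than the integer width allows; a 64‑bit value needs at most 10 bytes)
  • overflow when the decoded value does not fit in 64 bits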
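The encoding and both risks can be seen with the standard library's encoding/binary package, whose Uvarint uses the 7‑bits‑per‑byte scheme described above. This is a sketch: the wrapper readUvarint and its error values are hypothetical names, and it only translates the return convention of binary.Uvarint (n == 0 means the buffer ended too early, n < 0 means overflow) into errors.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

var (
	errTruncated = errors.New("varint: truncated")
	errOverflow  = errors.New("varint: overflows 64 bits")
)

// readUvarint decodes one unsigned varint from the front of buf and
// reports how many bytes it occupied.
func readUvarint(buf []byte) (uint64, int, error) {
	value, n := binary.Uvarint(buf)
	switch {
	case n == 0:
		return 0, 0, errTruncated // ran out of bytes before the final byte
	case n < 0:
		return 0, 0, errOverflow // value does not fit in 64 bits
	}
	return value, n, nil
}

func main() {
	var scratch [binary.MaxVarintLen64]byte
	for _, v := range []uint64{1, 127, 128, 300, 1 << 40} {
		n := binary.PutUvarint(scratch[:], v)
		got, used, err := readUvarint(scratch[:n])
		fmt.Printf("%d encodes as % x; decoded %d from %d bytes, err=%v\n",
			v, scratch[:n], got, used, err)
	}

	// Ten bytes whose final byte carries more than the one bit that is left.
	bad := []byte{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x02}
	_, _, err := readUvarint(bad)
	fmt.Println("malformed input:", err)
}
```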

3) Incremental parsing (stream‑safe)

Real systems read from sockets in chunks. Your decoder must handle:

  • partial length (not enough bytes to decode the varint)
  • partial payload (length decoded but body not fully available)

In either case, return a distinct “need more data” error and report zero bytes consumed, so the caller can append the next read and retry from the same offset.
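One way to sketch this, assuming a varint length prefix and a caller that accumulates reads in a pending buffer. The function nextFrame and the error values are hypothetical names for this example, and the chunk boundaries in main are chosen to hit both partial cases. A maximum frame size is deliberately left out here to keep the sketch short.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

var (
	errNeedMore  = errors.New("need more data")
	errBadLength = errors.New("malformed length prefix")
)

// nextFrame tries to decode one varint-length-prefixed frame from buf.
// On success it returns the payload and the total bytes consumed.
// If buf holds only part of a frame it returns errNeedMore and consumes nothing.
func nextFrame(buf []byte) (payload []byte, consumed int, err error) {
	size, n := binary.Uvarint(buf)
	if n == 0 {
		return nil, 0, errNeedMore // partial length
	}
	if n < 0 {
		return nil, 0, errBadLength
	}
	if uint64(len(buf)-n) < size {
		return nil, 0, errNeedMore // partial payload
	}
	end := n + int(size) // safe: size <= len(buf)-n
	return buf[n:end], end, nil
}

func main() {
	// Two frames, "hi" and "gopher", delivered in awkward chunks.
	wire := []byte{2, 'h', 'i', 6, 'g', 'o', 'p', 'h', 'e', 'r'}
	chunks := [][]byte{wire[:1], wire[1:5], wire[5:]}

	var pending []byte
	for _, c := range chunks {
		pending = append(pending, c...)
		for {
			payload, used, err := nextFrame(pending)
			if errors.Is(err, errNeedMore) {
				break // wait for the next chunk
			}
			if err != nil {
				fmt.Println("error:", err)
				return
			}
			fmt.Printf("frame %q\n", payload)
			pending = pending[used:]
		}
	}
}
```

Because nextFrame never consumes a partial frame, retrying after more bytes arrive needs no extra decoder state.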

4) Zero‑copy parsing

To avoid allocations, return slices into the input buffer rather than copying:

  • payload := buf[n:n+size] (avoid naming the variable len, which shadows the builtin)
  • no string() conversions on hot paths (each conversion copies the bytes)

If you must retain data after the buffer is reused, copy explicitly and measure the cost.
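The aliasing hazard is easy to demonstrate. In this sketch, which assumes a 1‑byte length prefix for brevity, payloadView and retain are hypothetical helpers: the first returns a slice that shares memory with the read buffer, the second makes an explicit copy that survives buffer reuse.

```go
package main

import "fmt"

// payloadView returns the payload of a frame with a 1-byte length prefix
// without copying. The result aliases buf. The three-index slice caps the
// capacity so an append on the view cannot overwrite bytes that follow it.
func payloadView(buf []byte) []byte {
	if len(buf) == 0 {
		return nil
	}
	size := int(buf[0])
	if len(buf)-1 < size {
		return nil // incomplete frame
	}
	end := 1 + size
	return buf[1:end:end]
}

// retain makes an independent copy for data that must outlive the buffer.
func retain(p []byte) []byte {
	return append([]byte(nil), p...)
}

func main() {
	buf := []byte{3, 'a', 'b', 'c'}

	view := payloadView(buf) // zero-copy
	kept := retain(view)     // explicit copy

	copy(buf[1:], "xyz") // simulate the read buffer being reused

	fmt.Printf("view=%s kept=%s\n", view, kept) // view=xyz kept=abc
}
```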

5) Safety and limits

Always enforce a maximum frame size:

  • protects memory
  • prevents DoS from malicious sizes

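If the decoded length exceeds the limit, reject the frame as soon as the length is known, before buffering or allocating anything for the payload.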
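A sketch of the check, assuming a varint length prefix. The 1 MiB value of maxFrameSize is an arbitrary limit for this example, and frameLength and its errors are hypothetical names. The point is that the oversized frame is rejected from the header alone, with no payload bytes present.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// maxFrameSize is a hypothetical limit; pick one that fits your protocol.
const maxFrameSize = 1 << 20 // 1 MiB

var (
	errNeedMore      = errors.New("need more data")
	errBadLength     = errors.New("malformed length prefix")
	errFrameTooLarge = errors.New("frame exceeds maximum size")
)

// frameLength decodes the varint length prefix and validates it against
// limit. It returns the payload size and the number of prefix bytes.
// limit is assumed to fit in an int.
func frameLength(buf []byte, limit uint64) (int, int, error) {
	size, n := binary.Uvarint(buf)
	switch {
	case n == 0:
		return 0, 0, errNeedMore
	case n < 0:
		return 0, 0, errBadLength
	case size > limit:
		return 0, 0, errFrameTooLarge // reject before reading any payload
	}
	return int(size), n, nil
}

func main() {
	// A header claiming a 4 GiB payload; no payload bytes have arrived.
	var hdr [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(hdr[:], 1<<32)

	_, _, err := frameLength(hdr[:n], maxFrameSize)
	fmt.Println(err)
}
```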

6) Checksums and versioning

Production protocols usually add:

  • a version byte or header
  • an integrity check (e.g., CRC32 or xxHash)

These are cheap compared to reprocessing corrupted streams.
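As an illustration, here is a sketch of a frame body laid out as version byte, payload, then a big‑endian CRC32 (IEEE) over the version and payload. The layout, the version number, and the names sealFrame and openFrame are assumptions made for this example, not a standard format; it uses hash/crc32 from the standard library.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

const protoVersion = 1 // hypothetical version number

var (
	errShort    = errors.New("frame too short")
	errVersion  = errors.New("unsupported version")
	errChecksum = errors.New("checksum mismatch")
)

// sealFrame builds: version | payload | CRC32(version + payload).
func sealFrame(payload []byte) []byte {
	out := make([]byte, 0, 1+len(payload)+4)
	out = append(out, protoVersion)
	out = append(out, payload...)
	var tail [4]byte
	binary.BigEndian.PutUint32(tail[:], crc32.ChecksumIEEE(out))
	return append(out, tail[:]...)
}

// openFrame validates the version and checksum, then returns the payload
// as a slice of frame (no copy).
func openFrame(frame []byte) ([]byte, error) {
	if len(frame) < 5 {
		return nil, errShort
	}
	if frame[0] != protoVersion {
		return nil, errVersion
	}
	body, tail := frame[:len(frame)-4], frame[len(frame)-4:]
	if crc32.ChecksumIEEE(body) != binary.BigEndian.Uint32(tail) {
		return nil, errChecksum
	}
	return body[1:], nil
}

func main() {
	frame := sealFrame([]byte("ping"))
	payload, err := openFrame(frame)
	fmt.Printf("payload=%q err=%v\n", payload, err)

	frame[2] ^= 0x01 // flip one bit in the payload
	_, err = openFrame(frame)
	fmt.Println("after corruption:", err)
}
```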

7) Go performance notes

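  • Avoid binary.Read and binary.Write on hot paths; they go through io.Reader/io.Writer and can use reflection. The byte‑slice helpers in encoding/binary (binary.BigEndian.Uint32, binary.Uvarint, and similar) parse directly from bytes and do not allocate.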
  • Preallocate buffers for reads and reuse them.
  • Measure allocations with go test -bench . -benchmem and keep them at zero allocs/op for parsers.
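In a real project the measurement would live in a _test.go benchmark run with -benchmem. The sketch below swaps in testing.AllocsPerRun so that it runs as a standalone program; parseView (a parser for a 1‑byte length prefix) and measure are hypothetical helpers. It contrasts a slice view with a string conversion whose result is retained.

```go
package main

import (
	"fmt"
	"testing"
)

// Package-level sinks keep the compiler from optimizing the work away
// and force the converted string to escape.
var (
	sinkBytes  []byte
	sinkString string
)

// parseView returns the payload of a frame with a 1-byte length prefix
// as a slice of buf (no copy). Hypothetical helper for this sketch.
func parseView(buf []byte) []byte {
	if len(buf) == 0 || len(buf)-1 < int(buf[0]) {
		return nil
	}
	return buf[1 : 1+int(buf[0])]
}

// measure reports average allocations per call for both approaches.
func measure() (viewAllocs, stringAllocs float64) {
	frame := []byte{6, 'g', 'o', 'p', 'h', 'e', 'r'}

	viewAllocs = testing.AllocsPerRun(1000, func() {
		sinkBytes = parseView(frame)
	})
	stringAllocs = testing.AllocsPerRun(1000, func() {
		sinkString = string(parseView(frame))
	})
	return viewAllocs, stringAllocs
}

func main() {
	v, s := measure()
	fmt.Printf("slice view: %.0f allocs/op, string conversion: %.0f allocs/op\n", v, s)
}
```

The same comparison as a benchmark would call b.ReportAllocs() inside a func BenchmarkXxx(b *testing.B) and loop b.N times.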
