Logs & Streaming
Lesson, slides, and applied problem sets.
Why logs
Logs are the backbone of streaming systems: durable, ordered, replayable. A log gives you a single source of truth that many consumers can read at their own pace.
Offsets and commits
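Consumers process records and periodically commit offsets so the system knows which records have been safely processed. When records finish out of order, advance the commit only through the contiguous run of processed offsets: committing past a gap would cause the unprocessed record in that gap to be skipped after a restart.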
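As a minimal sketch of contiguous commit advancement, the function below walks forward from the current commit while the next offset has been processed. The name advance_commit and the convention that the committed offset is the next offset still to be processed are assumptions made for illustration, not part of the lesson's starter code.

```python
from __future__ import annotations


def advance_commit(committed: int, processed: set[int]) -> int:
    """Return the new commit offset.

    Assumption: `committed` is the next offset that still needs
    processing, so every offset below it is already done.
    """
    next_offset = committed
    # Move forward only while there is no gap in the processed offsets.
    while next_offset in processed:
        next_offset += 1
    return next_offset


# Offsets 5 and 6 are done, 7 is still in flight, 8 finished early.
print(advance_commit(5, {5, 6, 8}))  # 7
```

Offset 8 stays uncommitted until 7 completes, so a restart re-reads from 7 rather than skipping it.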
Exactly-once processing
Exactly-once semantics rest on three mechanisms: idempotent writes, transactions, and fencing. Idempotent writes let the system ignore duplicate records, transactions publish a group of records atomically, and fencing rejects writes from stale producers.
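The three mechanisms can be illustrated together with a toy in-memory log. This is a sketch under stated assumptions, not how any particular broker implements them: ToyLog, FencedError, register, and commit_batch are hypothetical names, each batch carries a per-producer sequence number that persists across epochs, and "atomic" here only means the batch is appended in one step after all checks pass.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class FencedError(Exception):
    """Raised when a producer writes with an epoch that is not the latest."""


@dataclass
class ToyLog:
    records: list[str] = field(default_factory=list)
    epochs: dict[str, int] = field(default_factory=dict)    # producer id -> latest epoch
    last_seq: dict[str, int] = field(default_factory=dict)  # producer id -> last accepted sequence

    def register(self, producer_id: str) -> int:
        """Start a new producer session; any earlier session becomes stale."""
        epoch = self.epochs.get(producer_id, -1) + 1
        self.epochs[producer_id] = epoch
        self.last_seq.setdefault(producer_id, -1)
        return epoch

    def commit_batch(self, producer_id: str, epoch: int, seq: int, batch: list[str]) -> bool:
        """Append a batch all at once; return False if it was a duplicate."""
        # Fencing: only the most recently registered epoch may write.
        if self.epochs.get(producer_id) != epoch:
            raise FencedError(f"{producer_id} epoch {epoch} is stale")
        # Idempotence: a sequence number seen before is ignored.
        if seq <= self.last_seq[producer_id]:
            return False
        # Atomic publish (toy version): every check passed, append the whole batch.
        self.records.extend(batch)
        self.last_seq[producer_id] = seq
        return True
```

A retry of the same sequence number returns False and leaves the log unchanged, while a producer still holding an old epoch raises FencedError once a newer session has registered.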
Consumer groups
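Partitions are divided among the consumers in a group, typically so that each partition is read by one consumer at a time. A good assignment balances load across consumers and moves as few partitions as possible when the group rebalances.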
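Round-robin, the strategy named in the module items, is one simple way to balance load. The sketch below assumes partitions are integer ids and consumers are string ids, and sorts both so every group member computes the same result; assign_round_robin is a hypothetical name chosen for illustration.

```python
from __future__ import annotations


def assign_round_robin(partitions: list[int], consumers: list[str]) -> dict[str, list[int]]:
    """Deal partitions out to consumers one at a time, in sorted order."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    members = sorted(consumers)
    assignment: dict[str, list[int]] = {member: [] for member in members}
    for index, partition in enumerate(sorted(partitions)):
        # Partition i goes to consumer i mod N.
        assignment[members[index % len(members)]].append(partition)
    return assignment


print(assign_round_robin([0, 1, 2, 3, 4], ["c2", "c1"]))
# {'c1': [0, 2, 4], 'c2': [1, 3]}
```

Round-robin keeps partition counts within one of each other, but it does not try to preserve earlier assignments, so adding or removing a consumer can move many partitions.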
What you will build
- Commit offset advancement
- Consumer group partition assignment
- Exactly-once transactional processing
- Consumer lag + watermark metrics
- Windowed aggregation with watermarks
- Log compaction by key
- Idempotent producer acceptance
- Rebalance planning (partition moves)
Module Items
Offset Commit Advancement
Advance a commit offset using contiguous processed offsets.
Consumer Group Partition Assignment
Assign partitions to consumers in round-robin order.
Consumer Rebalance Plan
Streaming Metrics: Lag & Watermark
Windowed Aggregation with Watermarks
Idempotent Producer
Log Compaction
Exactly-Once Streaming Transactions