Autograd: Automatic Differentiation (Scalar)

1 / 6

Value node

Value stores:

  • data (float)
  • grad (accumulator)
  • _prev (parents), _op (debug), _backward (local chain rule)

Forward pass builds a DAG of Values.
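
A minimal sketch of such a node, micrograd-style. The field names mirror the list above; the constructor signature (the `_children` tuple and `_op` string) is one reasonable choice, not a fixed API:

  class Value:
      def __init__(self, data, _children=(), _op=''):
          self.data = data                 # scalar payload (float)
          self.grad = 0.0                  # accumulator for d(output)/d(this)
          self._prev = set(_children)      # parent Values that produced this node
          self._op = _op                   # op label, kept for debugging
          self._backward = lambda: None    # local chain-rule step, set by each op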

2 / 6

Local derivatives

Example:

  • a + b: a.grad += 1 * out.grad, b.grad += 1 * out.grad
  • a * b: a.grad += b.data * out.grad, b.grad += a.data * out.grad

Use += because a node can be used multiple times.
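
As a sketch, the two rules above become operator overloads on the Value class from the previous slide; the `+=` lines are where accumulation happens:

  class Value:
      def __init__(self, data, _children=(), _op=''):
          self.data, self.grad = data, 0.0
          self._prev, self._op = set(_children), _op
          self._backward = lambda: None

      def __add__(self, other):
          out = Value(self.data + other.data, (self, other), '+')
          def _backward():
              self.grad += 1.0 * out.grad         # d(a+b)/da = 1
              other.grad += 1.0 * out.grad        # d(a+b)/db = 1
          out._backward = _backward
          return out

      def __mul__(self, other):
          out = Value(self.data * other.data, (self, other), '*')
          def _backward():
              self.grad += other.data * out.grad  # d(a*b)/da = b
              other.grad += self.data * out.grad  # d(a*b)/db = a
          out._backward = _backward
          return out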

3 / 6

Backward algorithm

  1. DFS to build topological order
  2. Set self.grad = 1.0
  3. Traverse in reverse and call _backward (a node's grad is then complete before it propagates to its parents)
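
A sketch of those three steps as a backward() method, intended to live on the Value class from the previous slide (e.g., defined inside the class body):

  def backward(self):
      # 1. DFS from the output to build a topological order of the DAG
      topo, visited = [], set()
      def build(v):
          if v not in visited:
              visited.add(v)
              for parent in v._prev:
                  build(parent)
              topo.append(v)
      build(self)
      # 2. Seed the output gradient: df/df = 1
      self.grad = 1.0
      # 3. Reverse topological order guarantees each node's grad is
      #    fully accumulated before its local rule fires
      for node in reversed(topo):
          node._backward()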

4 / 6

Gradient accumulation

Gradients accumulate by design:

  • a + a should give a.grad = 2 (each use contributes 1 * out.grad)
  • Zero grads before each new backward pass, or stale gradients compound
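
A quick illustration, assuming the Value class and backward() sketched on the previous slides:

  a = Value(3.0)
  f = a + a        # a is used twice in the same graph
  f.backward()
  print(a.grad)    # 2.0: each use adds 1 * f.grad into the accumulator

  a.grad = 0.0     # zero before any new backward pass, or grads double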

5 / 6

Sanity check

f = (a * b + c) ** 2

  • df/da = 2(ab + c)*b
  • df/db = 2(ab + c)*a
  • df/dc = 2(ab + c)
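
A numeric check against the hand-derived formulas, assuming the Value class and backward() from the earlier slides. The inputs 2, -3, 4 are arbitrary picks, and since those sketches define only + and *, the square is written as a product:

  a, b, c = Value(2.0), Value(-3.0), Value(4.0)
  inner = a * b + c
  f = inner * inner    # (a*b + c) ** 2, via * only
  f.backward()

  s = a.data * b.data + c.data                # ab + c
  assert abs(a.grad - 2 * s * b.data) < 1e-9  # df/da = 2(ab + c)b
  assert abs(b.grad - 2 * s * a.data) < 1e-9  # df/db = 2(ab + c)a
  assert abs(c.grad - 2 * s) < 1e-9           # df/dc = 2(ab + c)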

6 / 6