Autograd: Scalar Reverse-Mode
Lesson, slides, and applied problem sets.
Goal
Build a tiny scalar autograd engine. The forward pass builds a DAG of Value nodes; the backward pass applies the chain rule to compute gradients.
Prerequisites: ML Foundations pack.
1) The Value node
Each Value stores:
- data: float
- grad: float accumulator for dL/d(this)
- _prev: set of parent Values
- _op: debug label
- _backward: closure to push gradients to parents
Everything else (vectors, matrices) is just lists of Value.
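A minimal sketch of the node, matching the constructor used in the snippets below (the default arguments and __repr__ are assumptions, not requirements):

class Value:
    """One scalar node in the computation DAG."""

    def __init__(self, data, _children=(), _op=""):
        self.data = data                 # forward value
        self.grad = 0.0                  # accumulator for dL/d(this)
        self._prev = set(_children)      # parent Values that produced this node
        self._op = _op                   # debug label, e.g. "+" or "tanh"
        self._backward = lambda: None    # set per-op; pushes out.grad to parents

    def __repr__(self):
        return f"Value(data={self.data}, grad={self.grad})"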
2) Local derivatives on the forward pass
Every operation returns a new Value and defines how gradients flow:
# a + b
out = Value(a.data + b.data, (a, b), "+")
def _backward():
    a.grad += 1.0 * out.grad
    b.grad += 1.0 * out.grad
out._backward = _backward
# a * b
out = Value(a.data * b.data, (a, b), "*")
def _backward():
    a.grad += b.data * out.grad
    b.grad += a.data * out.grad
out._backward = _backward
Use += rather than = because a node can feed into several downstream operations, and each of them contributes to its gradient.
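The same pattern covers the nonlinearities. A sketch for tanh, whose local derivative is 1 - tanh(x)^2 (written here as a method on Value, with math from the standard library):

import math

# Inside class Value:
def tanh(self):
    t = math.tanh(self.data)        # compute once, reuse in the closure
    out = Value(t, (self,), "tanh")
    def _backward():
        # d/dx tanh(x) = 1 - tanh(x)**2
        self.grad += (1.0 - t * t) * out.grad
    out._backward = _backward
    return out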
3) Backward pass (reverse topological order)
def backward(self):
    topo = []
    visited = set()
    def build(v):
        if v not in visited:
            visited.add(v)
            for child in v._prev:
                build(child)
            topo.append(v)
    build(self)
    self.grad = 1.0
    for node in reversed(topo):
        node._backward()
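A quick sanity check once backward() is in place: build a small expression, run the backward pass, and compare against a finite difference. A sketch, assuming + and * are wired up as operator overloads on Value:

a = Value(2.0)
b = Value(-3.0)
c = a * b + a          # dc/da = b + 1 = -2, dc/db = a = 2
c.backward()
print(a.grad, b.grad)  # expect -2.0 and 2.0

# Finite-difference check on a
eps = 1e-6
f = lambda x: x * -3.0 + x
print((f(2.0 + eps) - f(2.0 - eps)) / (2 * eps))  # ~ -2.0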
4) Supported ops
- Arithmetic: +, *, **, - (unary and binary), /
- Reverse ops: __radd__, __rmul__, __rsub__, __rtruediv__
- Nonlinear: tanh, relu, exp, log (only defined for data > 0)
All ops accept Value or Python numbers.
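One common way to get there (an assumed design, not the only option) is to coerce plain numbers at the top of each primitive op and to define the remaining operators in terms of the primitives, so they need no new derivative rules:

# Inside class Value:
def __add__(self, other):
    other = other if isinstance(other, Value) else Value(other)  # accept plain numbers
    out = Value(self.data + other.data, (self, other), "+")
    def _backward():
        self.grad += out.grad
        other.grad += out.grad
    out._backward = _backward
    return out

# Derived and reverse ops reuse +, * and ** instead of new derivatives.
def __neg__(self):             return self * -1.0
def __sub__(self, other):      return self + (-other)
def __truediv__(self, other):  return self * other**-1
def __radd__(self, other):     return self + other
def __rmul__(self, other):     return self * other
def __rsub__(self, other):     return (-self) + other
def __rtruediv__(self, other): return self**-1 * other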
5) Gradient accumulation
Gradients accumulate by default:
for p in params:
    p.grad = 0.0
loss.backward()
This is intentional: it lets you accumulate gradients over several minibatches before a single update, as sketched below.
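A typical loop then zeros once per update and may call backward() several times before stepping. A sketch; model, loss_fn, minibatches, params, lr, and num_steps are hypothetical placeholders:

for step in range(num_steps):
    # Zero the accumulators before this update's backward passes.
    for p in params:
        p.grad = 0.0

    # Accumulate gradients over one or more minibatches.
    for batch in minibatches:
        loss = loss_fn(model, batch)
        loss.backward()        # adds into p.grad for every parameter

    # Gradient descent step.
    for p in params:
        p.data -= lr * p.grad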
Key takeaways
- Autograd is a DAG + local derivatives.
- Reverse topological order ensures correctness.
- Gradients accumulate; you must zero them between steps.
Next: build Module, Linear, and Sequential to compose models.