Autograd Backward: Reverse-Mode Gradients

medium · autograd, backpropagation, topological-sort

Extend the Value class to compute gradients via reverse-mode autodiff.

What you are building

You will implement:

  • Value.backward() with reverse topological traversal
  • Activation ops: tanh, relu, exp, log

Methods to implement

1) Value.backward(self)

Compute gradients for all nodes reachable from self.

Requirements:

  • Build a topological ordering of nodes via DFS
  • Set self.grad = 1.0 (dL/dL)
  • Traverse nodes in reverse topological order and call node._backward()
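
Putting these requirements together, here is a minimal sketch of backward(), assuming each Value keeps its inputs in a _prev collection and carries a per-node _backward closure (micrograd-style names; adapt to your own attributes):

def backward(self):
    # 1. Build a topological ordering of all reachable nodes via DFS.
    topo, visited = [], set()

    def build(v):
        if v not in visited:
            visited.add(v)
            for child in v._prev:   # _prev: the node's inputs (assumed attribute)
                build(child)
            topo.append(v)

    build(self)

    # 2. Seed the output gradient: dL/dL = 1.
    self.grad = 1.0

    # 3. Walk the graph in reverse topological order, propagating gradients.
    for node in reversed(topo):
        node._backward()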

2) Value.tanh(self)

  • tanh(x) = (e^(2x) - 1) / (e^(2x) + 1)
  • Derivative: 1 - tanh(x)^2
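
A sketch of tanh as a Value method. It uses math.tanh, which is numerically equivalent to the formula above, and assumes a Value(data, children) constructor plus a .data attribute (both assumptions; match your own signatures):

import math  # module-level import

def tanh(self):
    t = math.tanh(self.data)    # forward value
    out = Value(t, (self,))     # assumed constructor: (data, children)

    def _backward():
        # d/dx tanh(x) = 1 - tanh(x)^2; chain rule multiplies by out.grad
        self.grad += (1.0 - t * t) * out.grad

    out._backward = _backward
    return out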

3) Value.relu(self)

  • relu(x) = max(0, x)
  • Derivative: 1 if x > 0, else 0
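
A sketch of relu under the same assumed constructor:

def relu(self):
    out = Value(self.data if self.data > 0 else 0.0, (self,))

    def _backward():
        # Gradient flows through only where the input was positive.
        self.grad += (1.0 if self.data > 0 else 0.0) * out.grad

    out._backward = _backward
    return out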

4) Value.exp(self)

  • exp(x) = e^x
  • Derivative: exp(x)
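
A sketch of exp; note that the derivative e^x is exactly the forward result, so out.data can be reused:

import math

def exp(self):
    out = Value(math.exp(self.data), (self,))

    def _backward():
        # d/dx e^x = e^x, which is the forward value out.data
        self.grad += out.data * out.grad

    out._backward = _backward
    return out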

5) Value.log(self)

  • log(x) = ln(x)
  • Derivative: 1/x
  • Raise ValueError if x <= 0
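
A sketch of log with the domain check up front, again under the assumed constructor:

import math

def log(self):
    if self.data <= 0:
        raise ValueError("log requires a positive input")
    out = Value(math.log(self.data), (self,))

    def _backward():
        # d/dx ln(x) = 1/x
        self.grad += (1.0 / self.data) * out.grad

    out._backward = _backward
    return out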

Example

a = Value(2.0)
b = Value(-3.0)
c = Value(10.0)

f = (a * b + c) ** 2
f.backward()

assert a.grad == -24.0
assert b.grad == 16.0
assert c.grad == 8.0

Hints

  • Use += for gradient accumulation; a node that feeds several downstream nodes receives a contribution from each path.
  • Gradients accumulate across calls, so zero them manually before running backward() repeatedly.
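
For example (illustrative; assumes addition on Value is already implemented):

a = Value(3.0)
b = a + a              # a reaches b through two paths
b.backward()
assert a.grad == 2.0   # with +=, both paths contribute 1.0 each

# Reset before running backward again, otherwise grads keep accumulating
for v in (a, b):
    v.grad = 0.0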