Neural Network Abstractions
Lesson, slides, and applied problem sets.
Goal
Create a small PyTorch-like API for composing models:
- `Module` base class
- `Linear` layer
- `Sequential` container
- Activation modules
Prerequisites: Autograd module.
1) Module contract
A Module should:
- Be callable: `__call__` delegates to `forward`
- Track parameters automatically
- Track nested modules recursively
- Support train/eval flags
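The sections below focus on parameter registration, so the callable behavior and the train/eval flag are worth sketching separately. This is a minimal sketch assuming PyTorch-style method names (`forward`, `train`, `eval`); the `Double` example module is hypothetical:

```python
class Module:
    def __init__(self):
        self.training = True  # train/eval flag

    def forward(self, *args):
        # Subclasses override this with the actual computation.
        raise NotImplementedError

    def __call__(self, *args):
        # Calling the module delegates to forward().
        return self.forward(*args)

    def modules(self):
        # Subclasses with nested modules would yield them here.
        return []

    def train(self, mode=True):
        # Set the flag on this module and all nested modules.
        self.training = mode
        for m in self.modules():
            m.train(mode)
        return self

    def eval(self):
        return self.train(False)


class Double(Module):
    def forward(self, x):
        return 2 * x
```

Dropout and batch-norm style layers are the usual consumers of `training`; everything else ignores it.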
2) Auto-registration via __setattr__
```python
class Module:
    def __init__(self):
        # Use object.__setattr__ here to avoid recursing into our
        # own __setattr__ before the registries exist.
        object.__setattr__(self, "_parameters", {})
        object.__setattr__(self, "_modules", {})
        object.__setattr__(self, "training", True)

    def __setattr__(self, name, value):
        # Intercept every attribute assignment and register
        # parameters and submodules automatically.
        if isinstance(value, Value):
            self._parameters[name] = value
        elif isinstance(value, Module):
            self._modules[name] = value
        elif isinstance(value, (list, tuple)):
            # Helper that scans containers for Values/Modules.
            self._register_list(name, value)
        object.__setattr__(self, name, value)
```
3) Parameters API
```python
    def parameters(self):
        # Own parameters plus those of all nested modules.
        params = list(self._parameters.values())
        for m in self._modules.values():
            params.extend(m.parameters())
        return params

    def named_parameters(self, prefix=""):
        # Yield (dotted_name, parameter) pairs, e.g. "layer1.weight".
        for name, p in self._parameters.items():
            full = f"{prefix}{name}" if prefix else name
            yield full, p
        for name, m in self._modules.items():
            sub = f"{prefix}{name}." if prefix else f"{name}."
            yield from m.named_parameters(sub)
```
4) Linear layer
Shapes:
- input `x`: length `in_features`
- weight: `(out_features, in_features)`
- bias: `(out_features,)` or `None`
Forward:
y_i = sum_j weight[i][j] * x[j] + bias[i]
Initialization: uniform in `[-k, k]` with `k = 1/sqrt(in_features)` (PyTorch's default for `Linear`; note this differs from Xavier uniform, which also accounts for `out_features`).
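Putting shapes, forward pass, and init together, here is a sketch with plain floats; the course version would wrap each weight in a `Value` so gradients flow, and the zero bias init is a simplifying assumption:

```python
import math
import random


class Linear:
    def __init__(self, in_features, out_features, bias=True):
        k = 1.0 / math.sqrt(in_features)  # uniform bound
        # weight has shape (out_features, in_features)
        self.weight = [
            [random.uniform(-k, k) for _ in range(in_features)]
            for _ in range(out_features)
        ]
        # bias has shape (out_features,), or None
        self.bias = [0.0] * out_features if bias else None

    def __call__(self, x):
        # y_i = sum_j weight[i][j] * x[j] + bias[i]
        out = []
        for i, row in enumerate(self.weight):
            s = sum(w * xj for w, xj in zip(row, x))
            if self.bias is not None:
                s += self.bias[i]
            out.append(s)
        return out
```

Usage: `Linear(3, 2)([1.0, 2.0, 3.0])` returns a list of length 2.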
5) Sequential + activations
Sequential applies modules in order. Activations are modules with no params:
`Tanh`, `ReLU`, `Sigmoid`
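A minimal sketch of the container and the three activations, again on plain float lists rather than `Value`s (registration via the `Module` base class is omitted here):

```python
import math


class Sequential:
    def __init__(self, *modules):
        self.mods = list(modules)

    def __call__(self, x):
        # Apply each sub-module in order, feeding outputs forward.
        for m in self.mods:
            x = m(x)
        return x


class Tanh:
    def __call__(self, x):
        return [math.tanh(v) for v in x]


class ReLU:
    def __call__(self, x):
        return [max(0.0, v) for v in x]


class Sigmoid:
    def __call__(self, x):
        return [1.0 / (1.0 + math.exp(-v)) for v in x]
```

An MLP is then just `Sequential(Linear(...), Tanh(), Linear(...))`: alternating affine maps and elementwise nonlinearities.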
Key takeaways
- `Module` centralizes parameter tracking and nesting.
- `__setattr__` is the hook that makes it automatic.
- `Linear` + activations + `Sequential` are enough to build MLPs.
Next: tokenization and batching for language modeling.