Neural Network Abstractions: Module, Linear, Sequential
Goal
Replace ad-hoc classes with a small framework that matches PyTorch's mental model:
- Module base class with automatic parameter tracking
- Linear layer and common activations
- Sequential container for composition
Prerequisites: Autograd module.
1) The Module contract
A Module should:
- Be callable via __call__ -> forward (a minimal sketch follows the state list below)
- Track parameters (Values) automatically
- Track nested modules recursively
- Support train/eval mode flags
Minimal state:
- _parameters: dict[str, Value]
- _modules: dict[str, Module]
- training: bool
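The callable behavior is a thin delegation from __call__ to forward; a minimal sketch (subclasses override forward, whose exact signature is up to you):

class Module:
    def __call__(self, *args, **kwargs):
        # Calling the module runs its forward pass, mirroring PyTorch.
        return self.forward(*args, **kwargs)

    def forward(self, *args, **kwargs):
        raise NotImplementedError("subclasses must implement forward")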
2) Auto-registration via __setattr__
When a user assigns attributes, intercept and register:
class Module:
    def __init__(self):
        # Bypass __setattr__ while the registries do not exist yet.
        object.__setattr__(self, "_parameters", {})
        object.__setattr__(self, "_modules", {})
        object.__setattr__(self, "training", True)

    def __setattr__(self, name, value):
        if isinstance(value, Value):
            self._parameters[name] = value
        elif isinstance(value, Module):
            self._modules[name] = value
        elif isinstance(value, (list, tuple)):
            self._register_list(name, value)
        object.__setattr__(self, name, value)
Nested lists/tuples should register recursively (e.g., list of layers).
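One possible sketch of _register_list (the dotted naming scheme below is an assumption, not part of the lesson's spec):

    def _register_list(self, name, values):
        # Walk (possibly nested) lists/tuples and register any Value or Module
        # found inside, e.g. a list of layers or a weight matrix of Values.
        for i, item in enumerate(values):
            key = f"{name}.{i}"
            if isinstance(item, Value):
                self._parameters[key] = item
            elif isinstance(item, Module):
                self._modules[key] = item
            elif isinstance(item, (list, tuple)):
                self._register_list(key, item)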
3) Parameter traversal
    def parameters(self):
        # Own parameters plus everything owned by nested modules.
        params = list(self._parameters.values())
        for module in self._modules.values():
            params.extend(module.parameters())
        return params

    def named_parameters(self, prefix=""):
        # Yield (dotted_name, Value) pairs, recursing into sub-modules.
        for name, param in self._parameters.items():
            full = f"{prefix}{name}" if prefix else name
            yield full, param
        for name, module in self._modules.items():
            sub = f"{prefix}{name}." if prefix else f"{name}."
            yield from module.named_parameters(sub)
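As a quick sanity check, a module that nests another module should report dotted names. The toy classes below are illustrative only; Value is the scalar from the autograd module, and the constructor shown is assumed:

class Neuron(Module):
    def __init__(self):
        super().__init__()
        self.w = Value(0.3)
        self.b = Value(0.0)

class Wrapper(Module):
    def __init__(self):
        super().__init__()
        self.scale = Value(1.0)
        self.neuron = Neuron()

m = Wrapper()
print([name for name, _ in m.named_parameters()])  # ['scale', 'neuron.w', 'neuron.b']
print(len(m.parameters()))                          # 3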
4) Linear layer (fully connected)
Shapes:
- input x: length in_features
- weight: (out_features, in_features)
- bias: (out_features,) or None
Forward:
y_i = sum_j weight[i][j] * x[j] + bias[i]
Initialization: draw weights uniformly from [-k, k] with k = 1/sqrt(in_features) (the same bound PyTorch uses for its default Linear initialization).
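A minimal Linear sketch under these conventions. Both weight and bias are drawn from [-k, k] here, and the Value constructor plus its arithmetic overloads are assumed from the autograd module:

import random

class Linear(Module):
    def __init__(self, in_features, out_features, bias=True):
        super().__init__()
        k = 1.0 / in_features ** 0.5
        # weight has shape (out_features, in_features); __setattr__ registers
        # the nested list of Values via _register_list.
        self.weight = [
            [Value(random.uniform(-k, k)) for _ in range(in_features)]
            for _ in range(out_features)
        ]
        self.bias = (
            [Value(random.uniform(-k, k)) for _ in range(out_features)]
            if bias else None
        )

    def forward(self, x):
        out = []
        for i in range(len(self.weight)):
            # y_i = sum_j weight[i][j] * x[j] + bias[i]
            y = Value(0.0)
            for w, xj in zip(self.weight[i], x):
                y = y + w * xj
            if self.bias is not None:
                y = y + self.bias[i]
            out.append(y)
        return out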
5) Sequential container
Sequential stores modules in order and feeds outputs through each:
class Sequential(Module):
    def __init__(self, *modules):
        super().__init__()
        # Register each sub-module under a stable name via __setattr__.
        for i, m in enumerate(modules):
            setattr(self, f"layer_{i}", m)
        # Keep an ordered list for forward(); bypass __setattr__ so the same
        # modules are not registered a second time.
        object.__setattr__(self, "module_list", list(modules))

    def forward(self, x):
        for m in self.module_list:
            x = m(x)
        return x
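Putting the pieces together, a small MLP could be composed like this (Tanh is defined in the next section; the layer sizes are arbitrary):

model = Sequential(
    Linear(3, 4),
    Tanh(),
    Linear(4, 1),
)
x = [Value(0.5), Value(-1.0), Value(2.0)]
y = model(x)                     # list containing a single Value
print(len(model.parameters()))   # (3*4 + 4) + (4*1 + 1) = 21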
6) Activation modules
Each activation maps a list of Value to a list of Value:
- Tanh: xi.tanh()
- ReLU: xi.relu()
- Sigmoid: 1 / (1 + exp(-xi))
These have no parameters but still behave like Modules.
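Sketches of the three activations as parameter-free Modules. Sigmoid assumes the Value class provides exp(), negation, addition, and division:

class Tanh(Module):
    def forward(self, x):
        # Element-wise tanh over a list of Values.
        return [xi.tanh() for xi in x]

class ReLU(Module):
    def forward(self, x):
        return [xi.relu() for xi in x]

class Sigmoid(Module):
    def forward(self, x):
        # sigmoid(x) = 1 / (1 + exp(-x)), applied element-wise.
        return [Value(1.0) / (Value(1.0) + (-xi).exp()) for xi in x]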
7) Train/Eval mode
    def train(self, mode=True):
        self.training = mode
        # Propagate the flag to all nested modules.
        for m in self._modules.values():
            m.train(mode)
        return self

    def eval(self):
        return self.train(False)
Used by layers like dropout/batchnorm (not implemented here, but the flag is part of the API).
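A quick check that the flag propagates through nesting (expected output shown in comments; layer_0 and layer_1 are the names Sequential assigned above):

net = Sequential(Linear(2, 2), Tanh())
net.eval()
print(net.training, net.layer_0.training, net.layer_1.training)  # False False False
net.train()
print(net.training)  # True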
Key takeaways
- Module centralizes parameter tracking and nesting.
- __setattr__ is the hook that makes it automatic.
- Linear + activations + Sequential are enough for many models.
- Train/eval mode is an API contract, even if unused now.
Next: embeddings and positional encodings for sequence models.
Module Items
- Neural Network Module: The Foundation of DL Abstractions. Implement the Module base class with automatic parameter tracking.
- Linear Layer and Sequential: Building Architectures. Implement Linear, Sequential, and activation modules.
- Neural Network Abstractions Checkpoint. Test your understanding of Module, Linear, and Sequential.