Multiclass Classification with Softmax
Extend binary classification to multiple classes using softmax.
Background
Softmax converts a vector of raw scores (logits) into a probability distribution: each output lies between 0 and 1, and the outputs sum to 1. Paired with cross-entropy loss, it is the standard approach for multiclass classification.
Functions to implement
1. softmax(logits)
Convert logits to probabilities.
- Input: List of scores for each class
- Output: Probabilities that sum to 1
- Use the numerically stable version: subtract the maximum logit before exponentiating, so exp never overflows (see the sketch below)
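A minimal sketch of the stable formulation, assuming NumPy is available (the tests may equally accept pure-Python lists):

import numpy as np

def softmax(logits):
    z = np.asarray(logits, dtype=float)
    # Shifting by the max leaves the result unchanged but keeps exp() from overflowing
    exp_z = np.exp(z - np.max(z))
    return exp_z / np.sum(exp_z)

softmax([2.0, 1.0, 0.1])  # ~ [0.659, 0.242, 0.099]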
2. cross_entropy_loss(y_true, y_pred)
Compute categorical cross-entropy loss.
- y_true: One-hot encoded labels (batch_size, num_classes)
- y_pred: Predicted probabilities (batch_size, num_classes)
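One way to write this, assuming the loss is averaged over the batch (the tests might instead expect a per-batch sum):

import numpy as np

def cross_entropy_loss(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    # Clip predictions away from 0 so log() never sees an exact zero
    y_pred = np.clip(np.asarray(y_pred, dtype=float), 1e-12, 1.0)
    # Per-sample loss: -sum over classes of y_true * log(y_pred); then average over the batch
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))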
3. softmax_gradient(y_true, y_pred)
Compute gradient of cross-entropy loss w.r.t. logits.
- Simplified: dL/dz = y_pred - y_true
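Because the exercise pairs softmax with cross-entropy, the gradient with respect to the logits collapses to the simple difference above. A sketch, assuming the loss is a batch mean (drop the division by batch size if your loss sums instead):

import numpy as np

def softmax_gradient(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # dL/dz for mean cross-entropy over softmax outputs
    return (y_pred - y_true) / y_true.shape[0]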
4. train_softmax_classifier(X, y, num_classes, epochs, lr)
Train a softmax classifier using gradient descent.
- y should be class indices (0 to num_classes-1)
- Return weights W and bias b
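A sketch of the full training loop tying the pieces together, assuming a linear model (logits = X @ W + b), zero-initialized parameters, and full-batch gradient descent; all three are implementation choices the tests may or may not pin down:

import numpy as np

def train_softmax_classifier(X, y, num_classes, epochs, lr):
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    # One-hot encode the integer class labels
    y_onehot = np.eye(num_classes)[np.asarray(y)]
    W = np.zeros((n_features, num_classes))
    b = np.zeros(num_classes)
    for _ in range(epochs):
        # Forward pass: row-wise stable softmax over the logits
        logits = X @ W + b
        shifted = np.exp(logits - logits.max(axis=1, keepdims=True))
        probs = shifted / shifted.sum(axis=1, keepdims=True)
        # Backward pass: dL/dlogits = probs - y_onehot, averaged over the batch
        grad = (probs - y_onehot) / n_samples
        W -= lr * (X.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b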
Examples
softmax([2.0, 1.0, 0.1]) # [0.659, 0.242, 0.099]
# 3 samples, 3 classes
X = [[0, 0], [0, 1], [1, 1]]
y = [0, 1, 2] # Class labels
W, b = train_softmax_classifier(X, y, 3, 1000, 0.1)
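To sanity-check the trained parameters, a hypothetical prediction step; taking the argmax of the logits is enough, since softmax preserves their ordering:

import numpy as np

logits = np.asarray(X, dtype=float) @ W + b
predictions = np.argmax(logits, axis=1)  # softmax is monotonic, so argmax over logits suffices
print(predictions)  # should approach [0, 1, 2] with enough epochs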
Run the tests to check your implementations.