PCA Implementation

hard · pca, dimensionality-reduction, eigenvalues

Implement Principal Component Analysis from scratch.

Background

PCA finds the directions (principal components) of maximum variance in data. Steps:

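  1. Center the data (subtract the mean of each feature)
  2. Compute the covariance matrix of the centered data
  3. Find the eigenvectors and eigenvalues of the covariance matrix
  4. Project the centered data onto the k eigenvectors with the largest eigenvalues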
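For orientation, the four steps above can be sketched with NumPy's built-in eigensolver. This is only a reference to compare a from-scratch solution against, not the required implementation: the name pca_reference is hypothetical, and the 1/n normalization and the return order (transformed data, components, mean) are taken from the covariance formula and the example in this statement.

```python
import numpy as np

def pca_reference(X, k):
    """Hypothetical reference PCA built on np.linalg.eigh (not from scratch)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)                      # step 1: per-feature mean
    X_centered = X - mean
    cov = (X_centered.T @ X_centered) / X.shape[0]   # step 2: 1/n covariance
    # step 3: eigh returns eigenvalues in ascending order, eigenvectors as columns
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1][:k]  # indices of the k largest eigenvalues
    components = eigenvectors[:, order].T      # shape (k, n_features)
    # step 4: project the centered data onto the components
    return X_centered @ components.T, components, mean

X_t, components, mean = pca_reference([[1, 2], [3, 4], [5, 6]], k=1)
print(X_t.shape, components.shape, mean)
```

The sign of each component is arbitrary, so a comparison against another implementation should allow a flip of sign per component.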

Functions to implement

1. center_data(X)

Center the data by subtracting the mean of each feature.
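A minimal sketch of the centering step, assuming X has one sample per row and one feature per column. The statement does not say what center_data returns; returning the mean alongside the centered data is an assumption made here so that it can be reused later by transform.

```python
import numpy as np

def center_data(X):
    """Subtract each column's mean; returns (X_centered, mean) by assumption."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)      # one mean per feature (column)
    return X - mean, mean      # broadcasting subtracts the mean from every row

X_centered, mean = center_data([[1, 2], [3, 4], [5, 6]])
print(mean)        # per-feature means
print(X_centered)  # every column now sums to zero
```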

2. covariance_matrix(X)

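Compute the covariance matrix of centered data X with shape (n, d), where n is the number of samples and d the number of features.

  • Cov = (1/n) * X.T @ X, a (d, d) matrix
  • This is the population covariance; np.cov divides by n - 1 unless called with bias=True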
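The formula above translates almost directly into NumPy. The sketch below assumes its input has already been centered and uses the 1/n normalization given in this statement.

```python
import numpy as np

def covariance_matrix(X_centered):
    """Population covariance (1/n) of data that is already centered."""
    X_centered = np.asarray(X_centered, dtype=float)
    n = X_centered.shape[0]                 # number of samples (rows)
    return (X_centered.T @ X_centered) / n  # shape (n_features, n_features)

X_centered = np.array([[-2.0, -2.0], [0.0, 0.0], [2.0, 2.0]])
cov = covariance_matrix(X_centered)
print(cov)
```

As a cross-check, np.cov(X_centered, rowvar=False, bias=True) should give the same matrix, since bias=True selects the 1/n normalization.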

3. power_iteration(A, num_iterations)

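Find the dominant eigenvector of A (the one whose eigenvalue has the largest magnitude) using power iteration.

  • Start with a random nonzero vector v
  • Repeat num_iterations times: v = A @ v, then normalize v to unit length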
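A minimal sketch of power iteration under a few assumptions that go beyond the statement: A is a symmetric matrix such as a covariance matrix, the function returns the eigenvalue estimate together with the eigenvector, and the seed parameter is a hypothetical addition that makes the random start reproducible.

```python
import numpy as np

def power_iteration(A, num_iterations=100, seed=0):
    """Estimate the dominant eigenpair of A; returns (eigenvalue, eigenvector)."""
    A = np.asarray(A, dtype=float)
    rng = np.random.default_rng(seed)      # seed is an assumption, for reproducibility
    v = rng.standard_normal(A.shape[0])    # random starting vector
    v /= np.linalg.norm(v)
    for _ in range(num_iterations):
        w = A @ v
        norm = np.linalg.norm(w)
        if norm == 0.0:                    # v lies in the null space of A; stop
            break
        v = w / norm                       # renormalize to unit length
    eigenvalue = v @ A @ v                 # Rayleigh quotient for unit-length v
    return eigenvalue, v

eigenvalue, v = power_iteration([[2.0, 0.0], [0.0, 1.0]])
print(eigenvalue, v)
```

Convergence slows as the two largest eigenvalues get closer in magnitude, and the sign of the returned vector depends on the random start.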

4. pca(X, n_components)

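Perform PCA and return the transformed data, the components, and the feature means.

  • Use deflation to find multiple components: after finding an eigenvector v with eigenvalue lam, subtract lam * outer(v, v) from the covariance matrix and run power iteration again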
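One way to combine the pieces is sketched below. It assumes Hotelling deflation (subtracting eigenvalue * outer(v, v) after each component), which the statement does not name explicitly. The helper _dominant_eigenpair and the num_iterations and seed parameters are hypothetical, and the return order follows the example in this statement.

```python
import numpy as np

def _dominant_eigenpair(A, num_iterations, rng):
    """Power iteration helper (hypothetical name); returns (eigenvalue, vector)."""
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iterations):
        w = A @ v
        norm = np.linalg.norm(w)
        if norm < 1e-12:          # matrix has no variance left in this direction
            break
        v = w / norm
    return v @ A @ v, v

def pca(X, n_components, num_iterations=1000, seed=0):
    """Returns (X_transformed, components, mean); components has shape (k, d)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    X_centered = X - mean
    cov = (X_centered.T @ X_centered) / X.shape[0]
    rng = np.random.default_rng(seed)
    components = []
    for _ in range(n_components):
        eigenvalue, v = _dominant_eigenpair(cov, num_iterations, rng)
        components.append(v)
        # Deflation: remove the variance explained by v so that the next
        # power iteration converges to the next-largest eigenvector.
        cov = cov - eigenvalue * np.outer(v, v)
    components = np.array(components)
    return X_centered @ components.T, components, mean

X_t, components, mean = pca([[1, 2], [3, 4], [5, 6]], n_components=1)
print(X_t.shape, components.shape)
```

If more components are requested than the data has directions of nonzero variance, the later vectors returned by this sketch are essentially arbitrary, so a fuller implementation might orthogonalize them or stop early.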

5. transform(X, components, mean)

Project new data onto the principal components: subtract the training mean from X, then multiply by components.T.
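A minimal sketch of the projection, assuming components has shape (k, n_features) as in the example in this statement and mean is the per-feature mean computed from the training data.

```python
import numpy as np

def transform(X, components, mean):
    """Center new data with the training mean, then project onto the components."""
    X = np.asarray(X, dtype=float)
    mean = np.asarray(mean, dtype=float)
    components = np.asarray(components, dtype=float)
    return (X - mean) @ components.T   # shape (n_samples, k)

# Illustrative values: one component along the diagonal, training mean (3, 4)
components = np.array([[1.0, 1.0]]) / np.sqrt(2.0)
projected = transform([[7, 8]], components, mean=[3, 4])
print(projected)
```

Using the training mean rather than the mean of the new data keeps new points in the same coordinate system as the data the components were fitted on.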

Examples

X = [[1, 2], [3, 4], [5, 6]]
X_transformed, components, mean = pca(X, n_components=1)
# X_transformed has shape (3, 1)
# components has shape (1, 2) - the principal component directions