PCA Implementation
Implement Principal Component Analysis from scratch.
Background
PCA finds the directions (principal components) of maximum variance in data. Steps:
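- Center the data (subtract the mean of each feature)
- Compute the covariance matrix of the centered data
- Find the eigenvectors and eigenvalues of the covariance matrix
- Project the data onto the top k eigenvectors (those with the largest eigenvalues)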
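As a point of comparison, the four steps above can be sketched with NumPy's built-in eigensolver standing in for the from-scratch eigenvector search. This is only a reference sketch, not the exercise solution: the name `pca_reference` is hypothetical, and the use of `np.linalg.eigh` is a swapped-in technique that is handy for cross-checking a hand-written version.

```python
import numpy as np

def pca_reference(X, k):
    """Reference PCA using np.linalg.eigh (hypothetical helper, for cross-checks only)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)                 # step 1: per-feature mean
    Xc = X - mean                         # centered data, one sample per row
    cov = (Xc.T @ Xc) / Xc.shape[0]       # step 2: covariance with 1/n scaling
    # step 3: eigh returns eigenvalues in ascending order, eigenvectors as columns
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1][:k]   # indices of the k largest eigenvalues
    components = eigenvectors[:, order].T       # shape (k, n_features)
    return Xc @ components.T, components, mean  # step 4: projection

Z, comps, mu = pca_reference([[1, 2], [3, 4], [5, 6]], 1)
# Z has shape (3, 1); comps has shape (1, 2)
```

The sign of each component is arbitrary, so a from-scratch result may differ from this one by a factor of -1 per component.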
Functions to implement
1. center_data(X)
Center the data by subtracting the mean of each feature.
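A minimal sketch of centering, assuming X holds one sample per row and that the function returns both the centered data and the mean. The return shape is an assumption; the description above does not specify it.

```python
import numpy as np

def center_data(X):
    """Subtract each column's mean; returning the mean as well is an assumption."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)      # one mean per feature (column)
    return X - mean, mean      # broadcasting subtracts the mean from every row

X_centered, mean = center_data([[1, 2], [3, 4], [5, 6]])
# mean is [3., 4.]
# X_centered is [[-2., -2.], [0., 0.], [2., 2.]]
```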
2. covariance_matrix(X)
Compute the covariance matrix of centered data.
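- Cov = (1/n) * X.T @ X, where X is the centered data with one sample per row and n is the number of samples (this is the population form, 1/n rather than 1/(n-1))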
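The formula above translates almost directly into NumPy. The sketch below assumes the input is already centered and has one sample per row.

```python
import numpy as np

def covariance_matrix(X):
    """Covariance of already-centered data X (rows are samples), scaled by 1/n."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]             # number of samples
    return (X.T @ X) / n       # shape (n_features, n_features)

X_centered = np.array([[-2.0, -2.0], [0.0, 0.0], [2.0, 2.0]])
cov = covariance_matrix(X_centered)
# every entry of cov is 8/3 for this input
```

For a quick sanity check, `np.cov(X_centered, rowvar=False, bias=True)` uses the same 1/n normalization and should agree.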
3. power_iteration(A, num_iterations)
Find the dominant eigenvector using power iteration.
- Start with a random nonzero vector v
- Repeat num_iterations times: v = A @ v, then normalize v to unit length
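A minimal sketch of the loop described above. The default for `num_iterations`, the `seed` parameter, and the early exit on a zero vector are assumptions added for reproducibility and safety; they are not part of the stated signature.

```python
import numpy as np

def power_iteration(A, num_iterations=100, seed=0):
    """Approximate the dominant eigenvector of a square matrix A.

    `seed` is a hypothetical extra parameter so the random start is repeatable.
    """
    A = np.asarray(A, dtype=float)
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])   # random starting vector
    v /= np.linalg.norm(v)
    for _ in range(num_iterations):
        w = A @ v
        norm = np.linalg.norm(w)
        if norm == 0.0:                   # A maps v to zero; nothing left to normalize
            break
        v = w / norm                      # keep v at unit length
    return v

A = np.array([[2.0, 1.0], [1.0, 2.0]])    # eigenvalues are 3 and 1
v = power_iteration(A, num_iterations=100)
eigenvalue = v @ A @ v                    # Rayleigh quotient estimates the eigenvalue
```

Convergence speed depends on the ratio between the two largest eigenvalue magnitudes, so matrices with nearly equal top eigenvalues may need more iterations.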
4. pca(X, n_components)
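Perform PCA and return the transformed data, the principal components, and the mean.
- Use deflation to find multiple components: after finding an eigenvector v with eigenvalue lam, subtract lam * outer(v, v) from the covariance matrix and run power iteration again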
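One way to combine the pieces is sketched below, with the centering, covariance, and power iteration steps inlined so the block runs on its own. It assumes Hotelling-style deflation (subtracting `eigenvalue * outer(v, v)` after each component); `num_iterations` and `seed` are hypothetical extra parameters.

```python
import numpy as np

def pca(X, n_components, num_iterations=1000, seed=0):
    """PCA via power iteration plus deflation (a sketch, not a tuned implementation)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean                               # centered data
    cov = (Xc.T @ Xc) / Xc.shape[0]             # 1/n covariance
    rng = np.random.default_rng(seed)
    components = []
    for _ in range(n_components):
        v = rng.standard_normal(cov.shape[0])   # random start for this component
        v /= np.linalg.norm(v)
        for _ in range(num_iterations):
            w = cov @ v
            norm = np.linalg.norm(w)
            if norm < 1e-12:                    # no variance left in this direction
                break
            v = w / norm
        eigenvalue = v @ cov @ v                # Rayleigh quotient
        components.append(v)
        # Deflation: remove the found direction so the next run finds the next one
        cov = cov - eigenvalue * np.outer(v, v)
    components = np.array(components)           # shape (n_components, n_features)
    return Xc @ components.T, components, mean

X_transformed, components, mean = pca([[1, 2], [3, 4], [5, 6]], n_components=1)
# X_transformed has shape (3, 1); components has shape (1, 2)
```

Because each deflation step reuses an approximate eigenvector, errors can accumulate in later components; that is usually acceptable for a handful of components.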
5. transform(X, components, mean)
Project new data onto principal components.
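A short sketch of the projection, assuming `components` has shape (n_components, n_features) as in the example shapes, and that `mean` is the per-feature mean of the training data.

```python
import numpy as np

def transform(X, components, mean):
    """Center new data with the training mean, then project onto the components."""
    X = np.asarray(X, dtype=float)
    components = np.asarray(components, dtype=float)
    return (X - mean) @ components.T      # shape (n_samples, n_components)

components = np.array([[1.0, 1.0]]) / np.sqrt(2.0)   # one unit-length direction
mean = np.array([3.0, 4.0])
X_new = transform([[4, 5], [3, 4]], components, mean)
# [4, 5] - mean = [1, 1], which projects to sqrt(2); [3, 4] equals the mean and projects to 0
```

Using the training mean rather than the mean of the new data keeps new points in the same coordinate system as the original projection.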
Examples
X = [[1, 2], [3, 4], [5, 6]]
X_transformed, components, mean = pca(X, n_components=1)
# X_transformed has shape (3, 1)
# components has shape (1, 2) - the principal component directions