Feature Engineering

Why Scale Features?

  • Gradient descent converges faster
  • Distance metrics weigh all features comparably
  • Regularization penalizes all weights on the same scale

Min-Max Normalization

  • x' = (x - min) / (max - min)
  • Scales to [0, 1]
  • Use for bounded inputs; sensitive to outliers
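
A minimal sketch of the formula above in plain Python. The function name `min_max_scale` and the handling of a constant feature (mapping it to 0.0) are assumptions made for illustration, not part of the slides.

```python
def min_max_scale(values):
    """Rescale numbers to [0, 1] using x' = (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: the formula would divide by zero.
        # Mapping everything to 0.0 is an assumed convention.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


scaled = min_max_scale([10, 20, 30, 50])
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```

In practice, min and max would usually be computed on the training set only and reused for new data.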

Standardization

  • z = (x - mean) / std
  • Mean=0, Std=1
  • Use when features have different units or scales
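
The z-score formula above can be sketched with the standard library. The name `standardize` is hypothetical, and the use of the population standard deviation is an assumption (the slides do not say which variant is meant).

```python
import statistics


def standardize(values):
    """Return z-scores: z = (x - mean) / std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std (assumed choice)
    if std == 0:
        # Constant feature: no spread to scale by (assumed convention).
        return [0.0 for _ in values]
    return [(x - mean) / std for x in values]


z = standardize([2, 4, 4, 4, 5, 5, 7, 9])
# The input has mean 5 and population std 2, so the first value maps to -1.5.
print(z[0])
```

After the transform, the values have mean 0 and standard deviation 1, as the slide states.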

One-Hot Encoding

  • Binary column per category
  • [0, 1, 0] for "blue" (second of three categories)
  • No false ordering
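
A small sketch of one-hot encoding in plain Python. The helper `one_hot` and the category names "red" and "green" are hypothetical; only "blue" and its vector appear in the slides.

```python
def one_hot(values, categories):
    """Encode each value as a binary vector with one column per category."""
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # raises KeyError for an unseen category
        rows.append(row)
    return rows


categories = ["red", "blue", "green"]  # assumed column order
encoded = one_hot(["blue", "green", "blue"], categories)
print(encoded[0])  # [0, 1, 0]
```

Each row contains exactly one 1, so the encoding carries no implied order between categories.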

Missing Values

  • Drop rows (loses data)
  • Impute with mean/median (numeric) or mode (categorical)
  • Add "was missing" indicator
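
The three options above might look like this for a single numeric column, with `None` standing in for a missing value. The function name `impute_with_indicator` and the sample data are assumptions for illustration.

```python
import statistics


def impute_with_indicator(values):
    """Fill None with the mean of observed values and flag imputed entries."""
    observed = [v for v in values if v is not None]
    fill = statistics.fmean(observed)  # raises if every value is missing
    filled = [fill if v is None else v for v in values]
    was_missing = [1 if v is None else 0 for v in values]
    return filled, was_missing


ages = [20.0, None, 40.0, None]  # hypothetical column

# Option 1: drop rows (loses half the data here).
dropped = [v for v in ages if v is not None]

# Options 2 and 3: mean imputation plus a "was missing" indicator column.
filled, was_missing = impute_with_indicator(ages)
print(filled)       # [20.0, 30.0, 40.0, 30.0]
print(was_missing)  # [0, 1, 0, 1]
```

The indicator lets a model learn from the fact that a value was absent, which imputation alone hides.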

Feature Creation

  • Polynomial and interaction terms: x², x₁×x₂
  • Log transform for right-skewed, positive data
  • Domain-specific combinations
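
A sketch of each kind of derived feature. All function names are hypothetical, `log1p` (log of 1 + x) is an assumed choice that tolerates zeros, and the price-per-area ratio is just one invented example of a domain-specific combination.

```python
import math


def polynomial_features(x1, x2):
    """Degree-2 expansion: original terms, squares, and the interaction."""
    return [x1, x2, x1 ** 2, x2 ** 2, x1 * x2]


def log_transform(x):
    """Compress a right-skewed, non-negative value with log(1 + x)."""
    return math.log1p(x)


def price_per_area(price, area):
    """Hypothetical domain-specific ratio feature."""
    return price / area


print(polynomial_features(2, 3))          # [2, 3, 4, 9, 6]
print(log_transform(0))                   # 0.0
print(price_per_area(300000.0, 1500.0))   # 200.0
```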

Feature Selection

  • Filter: correlation, mutual info
  • Wrapper: train on feature subsets, keep the best
  • Embedded: L1 regularization (drives some weights to zero)
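
As an illustration of the filter approach only, the sketch below keeps features whose absolute Pearson correlation with the target meets a threshold. The names `pearson` and `filter_select`, the 0.5 threshold, and the toy data are all assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def filter_select(features, target, threshold=0.5):
    """Keep feature names whose |correlation| with the target >= threshold."""
    return [
        name
        for name, column in features.items()
        if abs(pearson(column, target)) >= threshold
    ]


features = {
    "size": [1, 2, 3, 4, 5],      # moves with the target
    "noise": [2, 1, 2, 1, 2],     # unrelated to the target
    "constant": [3, 3, 3, 3, 3],  # carries no information
}
target = [2, 4, 6, 8, 10]

selected = filter_select(features, target)
print(selected)  # ['size']
```

Filter methods score each feature independently of any model, which makes them cheap but blind to feature interactions. Wrapper and embedded methods address that at a higher training cost.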