Model Evaluation

Train/Test Split

  • Never evaluate on training data
  • Typical split: 80% train, 20% test
  • Overfitting hides in training metrics; only held-out data exposes it
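
A minimal sketch of a held-out split, assuming scikit-learn; the synthetic dataset and variable names (X, y) are placeholders:

  # Sketch: hold out 20% of the data for evaluation (assumes scikit-learn).
  from sklearn.datasets import make_classification
  from sklearn.model_selection import train_test_split

  # Synthetic data stands in for a real dataset.
  X, y = make_classification(n_samples=1000, random_state=42)

  # 80% train / 20% test; stratify keeps class proportions similar in both splits.
  X_train, X_test, y_train, y_test = train_test_split(
      X, y, test_size=0.2, stratify=y, random_state=42
  )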

Confusion Matrix

  • Four counts: TN, FP, FN, TP
  • Shows error types
  • Foundation for other metrics
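
A small sketch of pulling the four counts out of a binary confusion matrix with scikit-learn; the labels and predictions are made up for illustration:

  # Sketch: extract TN, FP, FN, TP from a binary confusion matrix.
  from sklearn.metrics import confusion_matrix

  # Hypothetical true labels and predictions.
  y_true = [0, 0, 1, 1, 1, 0, 1, 0]
  y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

  # ravel() flattens the 2x2 matrix in the order TN, FP, FN, TP.
  tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
  print(tn, fp, fn, tp)  # 3 1 1 3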

Precision

  • TP / (TP + FP)
  • Of predicted positives, how many correct?
  • Prioritize when false positives are costly
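
A worked toy example of the precision formula, with hypothetical labels and scikit-learn as an assumed helper:

  # Sketch: precision = TP / (TP + FP), here with toy counts.
  from sklearn.metrics import precision_score

  y_true = [1, 1, 1, 0, 0, 0]
  y_pred = [1, 1, 0, 1, 0, 0]   # TP=2, FP=1, FN=1

  print(precision_score(y_true, y_pred))  # 2 / (2 + 1) = 0.666...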

Recall

  • TP / (TP + FN)
  • Of actual positives, how many found?
  • Prioritize when false negatives are costly
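
The same toy labels illustrate recall; again this is only a sketch assuming scikit-learn:

  # Sketch: recall = TP / (TP + FN), reusing the hypothetical labels above.
  from sklearn.metrics import recall_score

  y_true = [1, 1, 1, 0, 0, 0]
  y_pred = [1, 1, 0, 1, 0, 0]   # TP=2, FN=1

  print(recall_score(y_true, y_pred))  # 2 / (2 + 1) = 0.666...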

F1 Score

  • 2 × P × R / (P + R)
  • Harmonic mean of precision and recall
  • Good for imbalanced data
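
A quick check that the harmonic-mean formula matches scikit-learn's f1_score, using the same hypothetical labels:

  # Sketch: F1 is the harmonic mean of precision and recall.
  from sklearn.metrics import f1_score, precision_score, recall_score

  y_true = [1, 1, 1, 0, 0, 0]
  y_pred = [1, 1, 0, 1, 0, 0]

  p = precision_score(y_true, y_pred)   # 0.666...
  r = recall_score(y_true, y_pred)      # 0.666...
  print(2 * p * r / (p + r))            # 0.666...
  print(f1_score(y_true, y_pred))       # same value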

ROC and AUC

  • TPR vs FPR at various thresholds
  • AUC: single number summary
  • 1.0 = perfect, 0.5 = random
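
A sketch of computing the ROC curve points and AUC from predicted probabilities; labels and scores are invented for illustration (assumes scikit-learn):

  # Sketch: ROC curve and AUC from positive-class scores.
  from sklearn.metrics import roc_curve, roc_auc_score

  # Hypothetical labels and classifier scores for the positive class.
  y_true  = [0, 0, 1, 1, 0, 1, 0, 1]
  y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]

  fpr, tpr, thresholds = roc_curve(y_true, y_score)  # points along the curve
  print(roc_auc_score(y_true, y_score))              # 0.875 for these toy scores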

Cross-Validation

  • K-fold: train and evaluate K times, holding out a different fold each time
  • Average scores
  • More robust than single split
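
A minimal 5-fold cross-validation sketch; the logistic regression model and synthetic data are placeholder choices, assuming scikit-learn:

  # Sketch: 5-fold cross-validation; model and data are stand-ins.
  from sklearn.datasets import make_classification
  from sklearn.linear_model import LogisticRegression
  from sklearn.model_selection import cross_val_score

  X, y = make_classification(n_samples=500, random_state=0)
  model = LogisticRegression(max_iter=1000)

  # Each fold is held out once; the remaining folds train the model.
  scores = cross_val_score(model, X, y, cv=5, scoring="f1")
  print(scores.mean(), scores.std())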