Statistics From Scratch

easy · statistics, mean, variance, std

Implement fundamental statistical functions from scratch. These are essential for understanding data and building ML models.

Functions to implement

1. mean(data)

Compute the arithmetic mean (average).

  • Input: A list of numbers
  • Output: The sum divided by the count
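One possible sketch of mean, using only built-ins. Raising ValueError on an empty list is an assumption; the problem statement does not say how empty input should be handled.

```python
def mean(data):
    """Arithmetic mean: the sum of the values divided by their count."""
    if not data:
        # Assumption: empty input is treated as an error.
        raise ValueError("mean requires at least one value")
    return sum(data) / len(data)


print(mean([1, 2, 3, 4, 5]))  # 3.0
```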

2. variance(data, population=True)

Compute the variance.

  • Input: A list of numbers, and a flag selecting population variance (the default) or sample variance
  • Output: Average squared deviation from the mean
  • Population variance divides by n; sample variance divides by (n-1)
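The two divisors can be sketched as follows. This is a minimal version under stated assumptions: the error handling (ValueError for empty input, or for fewer than two values when computing sample variance) is not specified by the problem and is chosen here only for illustration.

```python
def variance(data, population=True):
    """Average squared deviation from the mean.

    population=True divides by n; population=False divides by n - 1.
    """
    n = len(data)
    if n == 0 or (not population and n < 2):
        # Assumption: too few values is treated as an error.
        raise ValueError("not enough values to compute variance")
    mu = sum(data) / n
    squared_deviations = sum((x - mu) ** 2 for x in data)
    divisor = n if population else n - 1
    return squared_deviations / divisor


print(variance([1, 2, 3, 4, 5]))                    # 2.0
print(variance([1, 2, 3, 4, 5], population=False))  # 2.5
```

For [1, 2, 3, 4, 5] the squared deviations sum to 10, so the population variance is 10 / 5 and the sample variance is 10 / 4.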

3. std(data, population=True)

Compute the standard deviation.

  • Input: A list of numbers, and a flag selecting population (the default) or sample standard deviation
  • Output: Square root of variance
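Since the standard deviation is the square root of the variance, std can simply delegate. The sketch below repeats a compact variance helper so that it runs on its own; the ValueError behavior is an assumption, not part of the problem statement.

```python
import math


def variance(data, population=True):
    """Average squared deviation from the mean (n or n - 1 divisor)."""
    n = len(data)
    if n == 0 or (not population and n < 2):
        # Assumption: too few values is treated as an error.
        raise ValueError("not enough values to compute variance")
    mu = sum(data) / n
    return sum((x - mu) ** 2 for x in data) / (n if population else n - 1)


def std(data, population=True):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance(data, population))


print(round(std([1, 2, 3, 4, 5]), 2))  # 1.41
```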

4. median(data)

Compute the median (middle value).

  • Input: A list of numbers
  • Output: The middle value (or average of two middle values)
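A sketch of median that sorts a copy of the input and then picks the middle element, or averages the two middle elements when the count is even. Rejecting empty input with ValueError is an assumption.

```python
def median(data):
    """Middle value of the sorted data (mean of the two middle values if even)."""
    if not data:
        # Assumption: empty input is treated as an error.
        raise ValueError("median requires at least one value")
    ordered = sorted(data)  # sorted() leaves the caller's list unchanged
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([1, 2, 3, 4, 5]))  # 3
print(median([1, 2, 3, 4]))     # 2.5
```

Sorting first matters: the input is not guaranteed to arrive in order.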

5. correlation(x, y)

Compute the Pearson correlation coefficient.

  • Input: Two lists of numbers of equal length
  • Output: A value between -1 and 1
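One way to compute the Pearson coefficient is to divide the sum of products of deviations by the square root of the product of the two sums of squared deviations. The sketch below takes that approach; treating mismatched lengths, empty input, and zero variance in either list as errors is an assumption, since the problem leaves those cases open.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    if len(x) != len(y) or not x:
        # Assumption: mismatched or empty inputs are treated as errors.
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    co_deviation = sum(a * b for a, b in zip(dx, dy))
    spread = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if spread == 0:
        # Assumption: correlation is undefined when either list is constant.
        raise ValueError("correlation is undefined for constant input")
    return co_deviation / spread


print(correlation([1, 2, 3], [1, 2, 3]))  # 1.0
print(correlation([1, 2, 3], [3, 2, 1]))  # -1.0
```

The n (or n-1) divisors of the covariance and the two standard deviations cancel, so they do not appear in the code.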

Examples

mean([1, 2, 3, 4, 5])         # 3.0
variance([1, 2, 3, 4, 5])     # 2.0 (population)
std([1, 2, 3, 4, 5])          # ~1.41
median([1, 2, 3, 4, 5])       # 3
median([1, 2, 3, 4])          # 2.5
correlation([1, 2, 3], [1, 2, 3])  # 1.0 (perfect positive)

Notes

  • Do not use NumPy or the statistics library
  • You may use math.sqrt
  • Handle edge cases appropriately, such as empty input or lists of unequal length passed to correlation