2 minute read

In machine learning, loss is the numerical heartbeat that tells us how well our model is performing. The fundamental goal of training any machine learning model is simple: minimize this loss to make our predictions as accurate as possible.

Imagine you’re trying to predict house prices. If your model predicts $300,000 but the actual sale price was $280,000, the loss quantifies that $20,000 difference. The smaller this difference across all predictions, the better your model performs.

The Core Concept: Distance, Not Direction

This is why all loss functions employ techniques to remove the sign from the error:

  • Taking the absolute value: \(|\text{actual} - \text{predicted}|\)
  • Squaring the difference: (prediction - actual)²

    Linear Regression Loss

  1. Mean Absolute Error (MAE) - L1 Loss
    • It simply averages the absolute differences between predictions and actual values.
      • \(\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |\text{actual}_i - \text{predicted}_i|\)
      • \(\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|\)
  2. Mean Squared Error (MSE) - L2 Loss
    • Mean Squared Error (MSE) is a loss function that measures the average of the squared differences between predicted and actual values.
      • \(\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (\text{actual}_i - \text{predicted}_i)^2\)
      • \(\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2\)

Calculating loss example

Let’s work through a concrete example with house price predictions to see how MAE and MSE are calculated.

Sample Data:

House Actual Price Predicted Price Error
1 $280,000 $300,000 -$20,000
2 $350,000 $330,000 $20,000
3 $420,000 $400,000 $20,000
4 $180,000 $190,000 -$10,000
4 $390,000 $400,000 -$10,000

Calculating Mean Absolute Error (MAE):

Step 1: Take the absolute value of each error:

  • \(|−20,000| = 20,000\)
  • \(|20,000| = 20,000\)
  • \(|20,000| = 20,000\)
  • \(|−10,000| = 10,000\)
  • \(|−10,000| = 10,000\)

Step 2: Calculate the mean: \(\text{MAE} = \frac{20,000 + 20,000 + 20,000 + 10,000 + 10,000}{5} = \frac{80,000}{5} = 16,000\)

Calculating Mean Squared Error (MSE):

Step 1: Square each error:

  • (−20,000)² = 400,000,000
  • (20,000)² = 400,000,000
  • (20,000)² = 400,000,000
  • (−10,000)² = 100,000,000
  • (−10,000)² = 100,000,000

Step 2: Calculate the mean: \(\text{MSE} = \frac{400,000,000 + 400,000,000 + 400,000,000 + 100,000,000 + 100,000,000}{5}\)

\(\text{MSE} = \frac{1,400,000,000}{5} = 280,000,000\)

Key Observations:

  1. Scale Difference: MAE gives us $16,000 while MSE gives us 280,000,000. MSE values are much larger because we’re squaring the errors.

  2. Sensitivity to Outliers: Notice how the $20,000 errors contribute proportionally more to MSE (400,000,000 each) compared to the $10,000 errors (100,000,000 each). In MAE, they contribute proportionally (20,000 vs 10,000).

  3. Practical Interpretation:

    • MAE tells us our predictions are off by an average of $16,000
    • MSE penalizes larger errors more heavily, making it useful when big mistakes are particularly costly

Root Mean Squared Error (RMSE):

To make MSE more interpretable, we can take its square root: \(\text{RMSE} = \sqrt{280,000,000} = 16,733\)

RMSE brings us back to the same units as our original data (dollars), making it easier to interpret than MSE.