
What Is Linear Regression (in ML terms)?

Linear regression is a supervised machine learning algorithm that tries to model the relationship between input variables (features) and an output variable (label) by fitting a straight line to the data.

General Form Equation (Simple Linear Regression)

For one input feature x, the model is:

\[y = wx + b\]

Let’s say you want to predict someone’s weight based on their height:

  • Feature: height (in cm)
  • Label: weight (in kg)

Where:

  • $y$: the label (target/output variable); in our example, weight
  • $x$: the feature (input variable); in our example, height
  • $w$: the weight (slope), i.e. how much $y$ changes when $x$ increases by one unit
  • $b$: the bias (intercept), i.e. the value of $y$ when $x = 0$

The goal is to find the values of $w$ and $b$ such that the predicted values $\hat{y}$ are as close as possible to the actual values $y$.

Linear regression form: weight = $w \cdot \text{height} + b$

Machine Learning Notation:

\[f_{w,b}(x) = wx + b\]

In machine learning, we use $f_{w,b}(x)$ because:

  • Function notation: Emphasizes that this is a function that takes x as input
  • Parameter subscripts: The subscripts {w,b} show which parameters the function depends on
  • Prediction emphasis: Makes it clear this is a prediction/estimate, not the true y value

This is the same model as $y = wx + b$; the function notation just makes the input and the parameters explicit.

So, when doing machine learning:

  • $x$ can represent any input (feature), like height, age, number of hours studied, etc.
  • $\hat{y} = f_{w,b}(x)$ is the prediction the model makes.
  • $w$ and $b$ are the parameters the model learns from the training data (see the sketch below).
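To make the notation concrete, here is a minimal Python sketch of the model $f_{w,b}(x) = wx + b$. The parameter values used in the call are purely illustrative assumptions, not values learned from any data:

```python
def predict(x, w, b):
    """Linear model f_{w,b}(x) = w * x + b."""
    return w * x + b

# Purely illustrative parameter values (not learned from any data)
print(predict(170, w=0.9, b=-95))  # -> 58.0
```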

Step 1: Sample Data

| Person | Height (cm) | Weight (kg) |
|--------|-------------|-------------|
| A      | 160         | 50          |
| B      | 165         | 55          |
| C      | 170         | 60          |
| D      | 175         | 65          |
| E      | 180         | 70          |

Here:

  • height is the feature (input).
  • weight is the label (output).

Step 2: Goal

Find the best-fit line:

\[\text{weight} = w \cdot \text{height} + b\]

Step 3: Use the Formulas

To calculate $w$ and $b$, we use the least squares method:

\[w = \frac{n\sum(x_i \cdot y_i) - \sum x_i \sum y_i}{n\sum x_i^2 - (\sum x_i)^2}\]

\[b = \frac{\sum y_i - w \sum x_i}{n}\]

Where:

  • $x_i$ = height of the i-th person
  • $y_i$ = weight of the i-th person
  • $n$ = number of data points
  • $\Sigma$ = summation (e.g., $\Sigma x_i = x_1 + x_2 + … + x_n$)
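As a minimal sketch (assuming the data is stored as plain Python lists), the two formulas above translate almost line for line into code:

```python
def least_squares_fit(xs, ys):
    """Closed-form least squares for a single feature: returns (w, b)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    w = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - w * sum_x) / n
    return w, b
```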

Step 4: Plug in the Values

Let’s calculate the required sums:

|            | Height \(x\) | Weight \(y\) | \(x \cdot y\) | \(x^2\) |
|------------|--------------|--------------|---------------|---------|
|            | 160          | 50           | 8000          | 25600   |
|            | 165          | 55           | 9075          | 27225   |
|            | 170          | 60           | 10200         | 28900   |
|            | 175          | 65           | 11375         | 30625   |
|            | 180          | 70           | 12600         | 32400   |
| \(\Sigma\) | 850          | 300          | 51250         | 144750  |
  • \(n = 5\)
  • \(\sum x_i = 850\)
  • \(\sum y_i = 300\)
  • \(\sum x_i y_i = 51250\)
  • \(\sum x_i^2 = 144750\)

Calculate slope \(w\):

\(w = \frac{5 \cdot 51250 - 850 \cdot 300}{5 \cdot 144750 - 850^2} = \frac{256250 - 255000}{723750 - 722500} = \frac{1250}{1250} = 1\)

Calculate intercept \(b\):

\(b = \frac{300 - 1 \cdot 850}{5} = \frac{-550}{5} = -110\)
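As a quick sanity check on the arithmetic (assuming NumPy is available), a degree-1 polynomial fit over the same five points should recover the same slope and intercept:

```python
import numpy as np

heights = [160, 165, 170, 175, 180]
weights = [50, 55, 60, 65, 70]

# A degree-1 polynomial fit is exactly a least-squares straight line
slope, intercept = np.polyfit(heights, weights, 1)
print(slope, intercept)  # ~1.0 and ~-110.0
```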

Final Linear Equation

\(\boxed{\text{weight} = 1 \cdot \text{height} - 110}\)

This means: for a given height (in cm), you can estimate the weight (in kg) with this formula.

Example Use

If someone’s height is 172 cm:

\(\text{weight} = 1 \cdot 172 - 110 = 62 \text{ kg}\)
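The same prediction in code, using the $w$ and $b$ fitted above:

```python
w, b = 1.0, -110.0      # parameters from the least squares fit above
height = 172
print(w * height + b)   # -> 62.0
```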