Linear Regression: A Complete Guide with Examples
What Is Linear Regression (in ML terms)?
Linear regression is a supervised machine learning algorithm that tries to model the relationship between input variables (features) and an output variable (label) by fitting a straight line to the data.
General Form Equation (Simple Linear Regression)
For one input feature $x$, the model is:
\[y = wx + b\]
Let’s say you want to predict someone’s weight based on their height:
- Feature: height (in cm)
- Label: weight (in kg)
Where:
- $y$ is the label (target/output variable): weight in this example
- $x$ is the feature (input variable): height in this example
- $w$ is the weight (slope): how much $y$ changes when $x$ increases by one unit
- $b$ is the bias (intercept): the value of $y$ when $x = 0$
The goal is to find the best values of $w$ and $b$ such that the predicted values $\hat{y}$ are as close as possible to the actual values $y$.
Linear regression form: weight = $w \cdot \text{height} + b$
Machine Learning Notation:
\[f_{w,b}(x) = wx + b\]
In machine learning, we use $f_{w,b}(x)$ because:
- Function notation: Emphasizes that this is a function that takes x as input
- Parameter subscripts: The subscripts {w,b} show which parameters the function depends on
- Prediction emphasis: Makes it clear this is a prediction/estimate, not the true y value
This matches: $y = wx + b$
So, when doing machine learning:
- $x$ can represent any input (feature), like height, age, number of hours studied, etc.
- $f_{w,b}(x)$ is the prediction $\hat{y}$ the model makes; $y$ is the actual label it is compared against.
- $w$ and $b$ are what the model learns from training data.
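As a quick illustration, here is a minimal Python sketch of $f_{w,b}(x)$; the parameter values below are arbitrary placeholders chosen for demonstration, not values learned from data.

```python
def f(w: float, b: float, x: float) -> float:
    """Linear model f_{w,b}(x) = w * x + b."""
    return w * x + b

# Placeholder parameters, chosen only for illustration (not learned from data).
w, b = 0.9, -95.0
print(f(w, b, 170.0))  # model's predicted weight (kg) for a height of 170 cm
```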
Step 1: Sample Data
| Person | Height (cm) | Weight (kg) |
|---|---|---|
| A | 160 | 50 |
| B | 165 | 55 |
| C | 170 | 60 |
| D | 175 | 65 |
| E | 180 | 70 |
Here:
- height is the feature (input).
- weight is the label (output).
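For the code sketches that follow, the same five data points can be written as two plain Python lists (the variable names are my own):

```python
# Heights (cm) and weights (kg) for persons A-E from the table above.
heights = [160, 165, 170, 175, 180]
weights = [50, 55, 60, 65, 70]
```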
Step 2: Goal
Find the best-fit line:
\[\text{weight} = w \cdot \text{height} + b\]
Step 3: Use the Formulas
To calculate $w$ and $b$, we use the least squares method:
\[w = \frac{n\sum(x_i \cdot y_i) - \sum x_i \sum y_i}{n\sum x_i^2 - (\sum x_i)^2}\]
\[b = \frac{\sum y_i - w \sum x_i}{n}\]
Where:
- $x_i$ = height of the i-th person
- $y_i$ = weight of the i-th person
- $n$ = number of data points
- $\Sigma$ = summation (e.g., $\Sigma x_i = x_1 + x_2 + … + x_n$)
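These two formulas translate directly into code. Below is a minimal Python sketch (the function name is my own) that computes $w$ and $b$ from lists of $x_i$ and $y_i$; it can be checked against the hand calculation in Step 4.

```python
def least_squares_fit(xs, ys):
    """Return slope w and intercept b from the closed-form least squares formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    w = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - w * sum_x) / n
    return w, b
```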
Step 4: Plug in the Values
Let’s calculate the required sums:
| Person | Height \((x)\) | Weight \((y)\) | \(x \cdot y\) | \(x^2\) |
|---|---|---|---|---|
| A | 160 | 50 | 8000 | 25600 |
| B | 165 | 55 | 9075 | 27225 |
| C | 170 | 60 | 10200 | 28900 |
| D | 175 | 65 | 11375 | 30625 |
| E | 180 | 70 | 12600 | 32400 |
| \(\Sigma\) | 850 | 300 | 51250 | 144750 |
- \(n = 5\)
- \(\sum x_i = 850\)
- \(\sum y_i = 300\)
- \(\sum x_i y_i = 51250\)
- \(\sum x_i^2 = 144750\)
Calculate slope \(w\):
\(w = \frac{5 \cdot 51250 - 850 \cdot 300}{5 \cdot 144750 - 850^2} = \frac{256250 - 255000}{723750 - 722500} = \frac{1250}{1250} = 1\)
Calculate intercept \(b\):
\(b = \frac{300 - 1 \cdot 850}{5} = \frac{-550}{5} = -110\)
Final Linear Equation
\(\boxed{\text{weight} = 1 \cdot \text{height} - 110}\)
This means: for a given height (in cm), you can estimate the weight (in kg) with this formula.
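As a quick cross-check, the same coefficients can be recovered programmatically, either with the `least_squares_fit` sketch above or with NumPy's `polyfit` (assuming NumPy is installed):

```python
import numpy as np

heights = [160, 165, 170, 175, 180]
weights = [50, 55, 60, 65, 70]

# A degree-1 polynomial fit returns [slope, intercept].
w, b = np.polyfit(heights, weights, deg=1)
print(w, b)  # should agree with the hand calculation: w ≈ 1, b ≈ -110
```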
Example Use
If someone’s height is 172 cm:
\(\text{weight} = 1 \cdot 172 - 110 = 62 \text{ kg}\)
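In code, the same prediction looks like this (a sketch using the parameters derived above):

```python
w, b = 1.0, -110.0  # fitted slope and intercept from the worked example

def predict_weight(height_cm: float) -> float:
    """Estimate weight (kg) from height (cm) with the fitted line."""
    return w * height_cm + b

print(predict_weight(172))  # 62.0 kg
```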