Simple Linear Regression
Core Titles
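Key headlines and terms for quick recall
- Simple Linear Regression (SLR): one predictor
- Model: $\hat{y} = \beta_0 + \beta_1 x$
- Goal: minimise Sum of Squared Errors (SSE)
- OLS: Ordinary Least Squares
- Closed-form: $\beta_1 = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$, $\beta_0 = \bar{y} - \beta_1 \bar{x}$
- $R^2$: coefficient of determination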
- Assumptions: linearity, independence, homoscedasticity, normality
Basic Idea
What it is, why it matters, how it works
What it is
Simple Linear Regression (SLR) models the relationship between one predictor $x$ and a continuous response $y$ as a straight line:
$$\hat{y} = \beta_0 + \beta_1 x$$
- $\beta_0$: intercept (predicted $y$ when $x = 0$).
- $\beta_1$: slope (change in $\hat{y}$ per unit change in $x$).
Goal — least squares
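We want the line that minimises the sum of squared residuals:
$$\mathrm{SSE} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$$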
This is Ordinary Least Squares (OLS).
Closed-form solution
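Setting $\partial\,\mathrm{SSE}/\partial \beta_0 = 0$ and $\partial\,\mathrm{SSE}/\partial \beta_1 = 0$:
$$\beta_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}, \qquad \beta_0 = \bar{y} - \beta_1 \bar{x}$$
Equivalently $\beta_1 = \mathrm{Cov}(x, y) / \mathrm{Var}(x)$, and $\beta_1 = r\, s_y / s_x$ where $r$ is the correlation and $s_x$, $s_y$ are the sample standard deviations.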
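The closed-form estimates can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation: the function names `ols_fit` and `slope_from_correlation` and the noise-free data on the line $y = 2 + 3x$ are assumptions made up for the example.

```python
from math import sqrt

def ols_fit(xs, ys):
    """Return (b0, b1) for the least-squares line y_hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx          # slope: centred cross-products over centred squares
    b0 = y_bar - b1 * x_bar   # intercept: the line passes through (x_bar, y_bar)
    return b0, b1

def slope_from_correlation(xs, ys):
    """Same slope computed as r * s_y / s_x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    r = s_xy / sqrt(s_xx * s_yy)        # Pearson correlation
    s_x = sqrt(s_xx / (n - 1))          # sample standard deviations
    s_y = sqrt(s_yy / (n - 1))
    return r * s_y / s_x

# Hypothetical noise-free data lying exactly on y = 2 + 3x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0, 5.0, 8.0, 11.0]

b0, b1 = ols_fit(xs, ys)
print(b0, b1)                           # intercept 2 and slope 3 are recovered
print(slope_from_correlation(xs, ys))   # agrees with b1 up to rounding
```

Both routes give the same slope because the $(n-1)$ factors in $r$, $s_x$ and $s_y$ cancel.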
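Goodness of fit: $R^2$
$$R^2 = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}, \qquad \mathrm{SST} = \sum_{i=1}^{n} (y_i - \bar{y})^2$$
Ranges 0 to 1; fraction of variance explained by the model.
For SLR, $R^2 = r^2$ (Pearson correlation squared).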
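As a quick numerical check of the identity between $R^2$ and the squared Pearson correlation, the sketch below computes both on a small dataset. The data values and the helper names `r_squared` and `pearson_r` are hypothetical, chosen only for illustration.

```python
from math import sqrt

def r_squared(xs, ys):
    """R^2 = 1 - SSE/SST for the least-squares line through (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    sst = sum((y - y_bar) ** 2 for y in ys)                      # total
    return 1 - sse / sst

def pearson_r(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical, slightly noisy data
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

r2 = r_squared(xs, ys)
r = pearson_r(xs, ys)
print(r2, r ** 2)  # the two values agree up to floating-point rounding
```

The identity holds only for a least-squares line with an intercept and a single predictor; with several predictors $R^2$ is no longer a single squared pairwise correlation.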
Assumptions (LINE)
- Linearity — relationship truly is linear.
- Independence — observations independent.
- Normality — residuals approximately Gaussian.
- Equal variance (homoscedasticity) — constant variance of residuals.
Violations show up in residual plots.
Residuals
$e_i = y_i - \hat{y}_i$. A residual plot of $e_i$ vs $\hat{y}_i$ should look like random noise:
- A funnel → heteroscedasticity.
- A U-shape → missed non-linearity.
- A trend → biased model.
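To make the U-shape pattern concrete, the following sketch fits a straight line to data that is actually quadratic and prints the residuals. The dataset ($y = x^2$) and the helper name `fit_line` are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y_hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical curved data: y = x^2, deliberately not linear
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

b0, b1 = fit_line(xs, ys)
fitted = [b0 + b1 * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]: positive, negative, positive
```

The residuals are positive at both ends and negative in the middle, which is the U-shape that signals a missed non-linear term.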
Worked example
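Hours studied $x = 1, 2, 3, 4, 5$; marks $y = 52, 55, 60, 65, 70$.
- $\bar{x} = 3$, $\bar{y} = 60.4$.
- $\sum (x_i - \bar{x})(y_i - \bar{y}) = 46.0$.
- $\sum (x_i - \bar{x})^2 = 10$.
- $\beta_1 = 46.0 / 10 = 4.6$.
- $\beta_0 = 60.4 - 4.6 \times 3 = 46.6$.
$\hat{y} = 46.6 + 4.6x$. So each extra hour of study adds ~4.6 marks.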
When SLR fails
- Non-linear relationships (use polynomial / non-linear regression).
- Multiple drivers (use multiple regression).
- Categorical predictors (encode).
- Heavy outliers (use robust regression).
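The outlier case is easy to demonstrate. The sketch below refits the hours/marks data after replacing the last mark with a made-up outlier (20 instead of 70); the corrupted value and the helper name `slope` are hypothetical.

```python
def slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx

hours = [1, 2, 3, 4, 5]
marks_clean = [52, 55, 60, 65, 70]
marks_outlier = [52, 55, 60, 65, 20]   # hypothetical: last mark corrupted

slope_clean = slope(hours, marks_clean)      # about 4.6
slope_outlier = slope(hours, marks_outlier)  # about -5.4

print(slope_clean, slope_outlier)
```

One bad point out of five flips the sign of the slope, because squaring the residuals gives large errors a disproportionate weight.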
Why it matters in data science
- Simplest interpretable model — strong baseline.
- Foundation for all linear methods (Logistic Regression, GLMs).
- Closed-form solution needs only a handful of sums over the data, so it fits in a single pass even on huge datasets.
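One way to see why fitting is cheap: the closed-form estimates depend only on $n$, $\sum x$, $\sum y$, $\sum xy$ and $\sum x^2$, so the data can be streamed once without being held in memory. The class below is a minimal sketch of that idea; the name `RunningOLS` and its methods are hypothetical, not taken from any library.

```python
class RunningOLS:
    """Accumulates the sums needed for the closed-form SLR fit."""

    def __init__(self):
        self.n = 0
        self.sx = 0.0    # sum of x
        self.sy = 0.0    # sum of y
        self.sxy = 0.0   # sum of x * y
        self.sxx = 0.0   # sum of x^2

    def update(self, x, y):
        """Fold one observation into the running sums."""
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxy += x * y
        self.sxx += x * x

    def coefficients(self):
        """Return (b0, b1) from the sums seen so far."""
        denom = self.n * self.sxx - self.sx ** 2
        if denom == 0:
            raise ValueError("need at least two distinct x values")
        b1 = (self.n * self.sxy - self.sx * self.sy) / denom
        b0 = (self.sy - b1 * self.sx) / self.n
        return b0, b1

# Stream the hours/marks data one observation at a time
model = RunningOLS()
for x, y in zip([1, 2, 3, 4, 5], [52, 55, 60, 65, 70]):
    model.update(x, y)

b0, b1 = model.coefficients()
print(b0, b1)  # about 46.6 and 4.6
```

The raw-sum form is algebraically identical to the centred formula but can lose precision when values are large relative to their spread; a centred or incremental-mean update would be the more careful choice in that case.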
Mind Map
Visual structure of the concept
SIMPLE LINEAR REGRESSION
├── Model: ŷ = β₀ + β₁ x
├── Method: OLS (minimise SSE)
├── Closed-form
│   ├── β₁ = Σ(x−x̄)(y−ȳ) / Σ(x−x̄)²
│   └── β₀ = ȳ − β₁ x̄
├── R² = 1 − SSE/SST
├── Assumptions (LINE)
│   ├── Linearity
│   ├── Independence
│   ├── Normality of residuals
│   └── Equal variance (homoscedasticity)
└── Diagnostics
    ├── Residual plot
    │   ├── U-shape → non-linear
    │   ├── Funnel → heteroscedasticity
    │   └── Trend → biased
    └── Q-Q plot for normality
Exam Q&A
Part A (2 marks) and Part B (20 marks) style questions
Part A (2 marks each)
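Q1. Write the equation of simple linear regression. $\hat{y} = \beta_0 + \beta_1 x$, where $\beta_0$ is the intercept and $\beta_1$ is the slope.
Q2. What is the least-squares criterion? Choose $\beta_0, \beta_1$ to minimise the sum of squared residuals $\sum_i (y_i - \beta_0 - \beta_1 x_i)^2$.
Q3. What does $R^2$ represent? The fraction of variance in $y$ explained by the regression model: $R^2 = 1 - \mathrm{SSE}/\mathrm{SST}$. Ranges 0 to 1; for SLR $R^2 = r^2$, where $r$ is the Pearson correlation.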
Part B (20 marks)
Q. Explain Simple Linear Regression. Derive the OLS estimates of $\beta_0$ and $\beta_1$. Apply to the data: hours studied $x = 1, 2, 3, 4, 5$, marks $y = 52, 55, 60, 65, 70$. Discuss assumptions and interpretation of $R^2$.
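Model. $\hat{y}_i = \beta_0 + \beta_1 x_i$, with residual $e_i = y_i - \hat{y}_i$.
Goal. Minimise $\mathrm{SSE} = \sum_i (y_i - \beta_0 - \beta_1 x_i)^2$.
Derivation of OLS estimates.
Take partial derivatives and set to zero:
$$\frac{\partial\,\mathrm{SSE}}{\partial \beta_0} = -2 \sum_i (y_i - \beta_0 - \beta_1 x_i) = 0$$
$$\frac{\partial\,\mathrm{SSE}}{\partial \beta_1} = -2 \sum_i x_i (y_i - \beta_0 - \beta_1 x_i) = 0$$
From the first equation: $\beta_0 = \bar{y} - \beta_1 \bar{x}$.
Substitute into the second:
$$\sum_i x_i (y_i - \bar{y}) = \beta_1 \sum_i x_i (x_i - \bar{x})$$
Both sides equal centred sums, because $\sum_i \bar{x}(y_i - \bar{y}) = 0$ and $\sum_i \bar{x}(x_i - \bar{x}) = 0$, so:
$$\beta_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$$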
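The derivation can be sanity-checked numerically: at the OLS solution both partial derivatives of the SSE should vanish, and nudging either coefficient should increase the SSE. The sketch below does this for the hours/marks data from the question; the function names `sse` and `gradient` are hypothetical.

```python
def sse(b0, b1, xs, ys):
    """Sum of squared errors for the line y_hat = b0 + b1 * x."""
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))

def gradient(b0, b1, xs, ys):
    """Partial derivatives of SSE with respect to b0 and b1."""
    d_b0 = -2 * sum(y - b0 - b1 * x for x, y in zip(xs, ys))
    d_b1 = -2 * sum(x * (y - b0 - b1 * x) for x, y in zip(xs, ys))
    return d_b0, d_b1

xs = [1, 2, 3, 4, 5]
ys = [52, 55, 60, 65, 70]

# Closed-form estimates from the centred sums
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
b0 = y_bar - b1 * x_bar

g0, g1 = gradient(b0, b1, xs, ys)
best = sse(b0, b1, xs, ys)

print(g0, g1)                            # both zero up to rounding
print(best < sse(b0 + 0.1, b1, xs, ys))  # True: moving the intercept hurts
print(best < sse(b0, b1 + 0.1, xs, ys))  # True: moving the slope hurts
```

Because the SSE is a convex quadratic in $(\beta_0, \beta_1)$, the point where the gradient vanishes is the global minimum and not just a stationary point.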
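Application to the given data.
| $x$ | $y$ | $x - \bar{x}$ | $y - \bar{y}$ | $(x - \bar{x})(y - \bar{y})$ | $(x - \bar{x})^2$ |
|---|---|---|---|---|---|
| 1 | 52 | −2 | −8.4 | 16.8 | 4 |
| 2 | 55 | −1 | −5.4 | 5.4 | 1 |
| 3 | 60 | 0 | −0.4 | 0 | 0 |
| 4 | 65 | 1 | 4.6 | 4.6 | 1 |
| 5 | 70 | 2 | 9.6 | 19.2 | 4 |
| Sum | 302 | | | 46.0 | 10 |
$\bar{x} = 15/5 = 3$, $\bar{y} = 302/5 = 60.4$.
$\beta_1 = 46.0 / 10 = 4.6$. $\beta_0 = 60.4 - 4.6 \times 3 = 46.6$.
Fitted line: $\hat{y} = 46.6 + 4.6x$. Each extra hour of study adds ~4.6 marks.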
Predictions and residuals.
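| $x$ | $y$ | $\hat{y}$ | residual $y - \hat{y}$ |
|---|---|---|---|
| 1 | 52 | 51.2 | 0.8 |
| 2 | 55 | 55.8 | −0.8 |
| 3 | 60 | 60.4 | −0.4 |
| 4 | 65 | 65.0 | 0.0 |
| 5 | 70 | 69.6 | 0.4 |
SSE = $0.8^2 + (-0.8)^2 + (-0.4)^2 + 0^2 + 0.4^2 = 1.6$.
SST = $\sum_i (y_i - \bar{y})^2 = 70.56 + 29.16 + 0.16 + 21.16 + 92.16 = 213.2$.
$R^2 = 1 - 1.6 / 213.2 \approx 0.9925$.
Interpretation. $R^2 \approx 0.99$: the model explains nearly all of the variance in marks. Excellent linear fit. (Cross-validate before trusting on new students.)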
Assumptions (LINE).
- Linearity — relationship truly linear.
- Independence — observations independent (students don't copy answers).
- Normality of residuals — needed for inference (CIs, p-values), not for point estimates.
- Equal variance (homoscedasticity) — constant spread of residuals.
Diagnostic plots.
- Residual vs fitted plot should look like random noise. A U-shape → missed non-linearity (try adding an $x^2$ term). A funnel → heteroscedasticity (try log-transforming $y$).
- Q-Q plot of residuals — checks normality.
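The points behind a Q-Q plot can be computed without a plotting library. The sketch below pairs the sorted, standardised residuals with standard-normal quantiles; the plotting positions $(i - 0.5)/n$ are one common convention and an assumption here, as is the helper name `qq_points`. The residuals are those of the worked example.

```python
from statistics import NormalDist, mean, stdev

def qq_points(residuals):
    """Return (theoretical quantile, standardised residual) pairs."""
    n = len(residuals)
    ordered = sorted(residuals)
    m = mean(ordered)
    s = stdev(ordered)               # sample standard deviation
    std_normal = NormalDist()        # mean 0, standard deviation 1
    points = []
    for i, e in enumerate(ordered, start=1):
        p = (i - 0.5) / n            # assumed plotting position
        points.append((std_normal.inv_cdf(p), (e - m) / s))
    return points

# Residuals from the fitted line y_hat = 46.6 + 4.6 x
residuals = [0.8, -0.8, -0.4, 0.0, 0.4]

points = qq_points(residuals)
for theoretical, observed in points:
    print(round(theoretical, 3), round(observed, 3))
```

If the residuals are roughly Gaussian the pairs fall near the line $y = x$; with only five points the check is illustrative and says little about the true distribution.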