Expectation: Mean, Variance, Covariance, Conditional Expectation
Core Titles
Key headlines and terms for quick recall
- Expectation $E[X]$ or $\mu$
- Linearity
- Variance
- Covariance
- Conditional Expectation
- Law of Total Expectation
- Law of Total Variance
Basic Idea
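What it is, why it matters, how it works
Expectation (mean)
Centre of mass of a distribution: $E[X] = \sum_x x\,p(x)$ (discrete) or $E[X] = \int x\,f(x)\,dx$ (continuous).
Law of the Unconscious Statistician (LOTUS). For $Y = g(X)$: $E[g(X)] = \sum_x g(x)\,p(x)$ or $E[g(X)] = \int g(x)\,f(x)\,dx$.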
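A minimal sketch of the discrete case, assuming a fair six-sided die as the example distribution (the die and the helper name `expectation` are illustrative, not from the notes). It computes $E[X]$ as a probability-weighted sum, then $E[X^2]$ twice: once via LOTUS and once by first building the distribution of $X^2$.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die, p(x) = 1/6 for x = 1..6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum of g(x) * p(x) over x (LOTUS); g defaults to the identity."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expectation(pmf)                          # E[X]
mean_square = expectation(pmf, lambda x: x * x)  # E[X^2] via LOTUS

# Cross-check: build the pmf of Y = X^2 explicitly, then take its mean.
pmf_y = {}
for x, p in pmf.items():
    pmf_y[x * x] = pmf_y.get(x * x, 0) + p
mean_square_direct = expectation(pmf_y)

print(mean, mean_square, mean_square_direct)  # 7/2 91/6 91/6
```

Both routes agree; LOTUS just skips the step of deriving the distribution of $g(X)$.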
Linearity (always holds)
$E[aX + bY] = aE[X] + bE[Y]$
No independence needed.
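To see that independence really is not needed, here is a small exact check on a deliberately dependent pair. The setup (a fair die $X$ with $Y = X^2$, and the scalars 2 and 3) is an assumption chosen for illustration.

```python
from fractions import Fraction

# Hypothetical setup: X is a fair die and Y = X**2, so Y is fully determined by X.
outcomes = [(x, x * x, Fraction(1, 6)) for x in range(1, 7)]  # (x, y, probability)

def expect(h):
    """E[h(X, Y)] over the joint distribution listed in `outcomes`."""
    return sum(h(x, y) * p for x, y, p in outcomes)

a, b = 2, 3
lhs = expect(lambda x, y: a * x + b * y)                       # E[aX + bY]
rhs = a * expect(lambda x, y: x) + b * expect(lambda x, y: y)  # aE[X] + bE[Y]
print(lhs == rhs)  # True

# Contrast: the product rule E[XY] = E[X]E[Y] fails for this dependent pair.
e_xy = expect(lambda x, y: x * y)
e_x_e_y = expect(lambda x, y: x) * expect(lambda x, y: y)
print(e_xy == e_x_e_y)  # False
```

Linearity survives the dependence; the product rule for $E[XY]$ does not.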
Variance
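$\mathrm{Var}(X) = E[(X - \mu)^2] = E[X^2] - (E[X])^2$
Spread around the mean. Standard deviation $\sigma = \sqrt{\mathrm{Var}(X)}$.
Properties.
- $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$
- $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$
- If $X$ and $Y$ are independent: $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$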
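A short sketch checking two variance facts exactly: the shortcut $E[X^2] - (E[X])^2$ agrees with the definition, and an affine map $aX + b$ scales the variance by $a^2$ while the shift $b$ drops out. The fair die and the values $a = 3$, $b = 5$ are assumptions for illustration.

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # hypothetical fair die

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E[(X - mu)^2], computed from the definition."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def affine(pmf, a, b):
    """pmf of aX + b (a != 0, so distinct x map to distinct values)."""
    return {a * x + b: p for x, p in pmf.items()}

var_x = variance(pmf)
shortcut = sum(x * x * p for x, p in pmf.items()) - mean(pmf) ** 2  # E[X^2] - (E[X])^2
var_affine = variance(affine(pmf, 3, 5))

print(var_x, shortcut, var_affine)  # 35/12 35/12 105/4
```

Here 105/4 equals $3^2 \cdot 35/12$, as the scaling rule predicts.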
Covariance
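$\mathrm{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]E[Y]$
Symmetric, bilinear. Correlation $\rho_{X,Y} = \mathrm{Cov}(X, Y) / (\sigma_X \sigma_Y)$.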
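As a sketch, covariance and correlation can be computed straight from a joint pmf. The table below is a made-up example of two binary variables that tend to agree; it also checks that the variance of the sum picks up the extra $2\,\mathrm{Cov}(X, Y)$ term.

```python
from fractions import Fraction
import math

# Hypothetical joint pmf of two binary variables that tend to agree.
joint = {
    (0, 0): Fraction(2, 5), (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(2, 5),
}

def expect(h):
    """E[h(X, Y)] under the joint pmf."""
    return sum(h(x, y) * p for (x, y), p in joint.items())

ex, ey = expect(lambda x, y: x), expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - ex * ey          # E[XY] - E[X]E[Y]
var_x = expect(lambda x, y: x * x) - ex ** 2
var_y = expect(lambda x, y: y * y) - ey ** 2
rho = float(cov) / math.sqrt(float(var_x * var_y))  # correlation, roughly 0.6 here

# Var(X + Y) computed directly, to compare with Var(X) + Var(Y) + 2 Cov(X, Y).
var_sum = expect(lambda x, y: (x + y) ** 2) - expect(lambda x, y: x + y) ** 2

print(cov, var_sum, var_x + var_y + 2 * cov)  # 3/20 4/5 4/5
```

The positive covariance pushes $\mathrm{Var}(X + Y)$ above $\mathrm{Var}(X) + \mathrm{Var}(Y) = 1/2$.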
Conditional expectation
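$E[X \mid Y = y] = \sum_x x\,p(x \mid y)$ (or $\int x\,f(x \mid y)\,dx$ in the continuous case)
A function of $y$. Viewed as a random variable: $E[X \mid Y]$, the same function evaluated at $Y$.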
Law of Total Expectation (Adam's law)
$E[X] = E[E[X \mid Y]]$
Average the conditional means over $Y$.
Law of Total Variance (Eve's law)
$\mathrm{Var}(X) = E[\mathrm{Var}(X \mid Y)] + \mathrm{Var}(E[X \mid Y])$
Within-group variance + between-group variance.
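Both laws can be checked exactly on a small joint pmf. The two-group table below is hypothetical (group $Y = 0$ with probability 1/4, group $Y = 1$ with probability 3/4); the sketch computes the conditional means and variances per group, then compares the averaged quantities with the direct moments of $X$.

```python
from fractions import Fraction
from collections import defaultdict

# Hypothetical joint pmf p(x, y): keys are (x, y) pairs.
joint = {
    (0, 0): Fraction(1, 8), (2, 0): Fraction(1, 8),
    (4, 1): Fraction(1, 4), (6, 1): Fraction(1, 4), (8, 1): Fraction(1, 4),
}

# Marginal p(y), then E[X | Y = y] and Var(X | Y = y) for each group.
p_y = defaultdict(Fraction)
for (x, y), p in joint.items():
    p_y[y] += p

cond_mean, cond_var = {}, {}
for y, py in p_y.items():
    m = sum(x * p / py for (x, yy), p in joint.items() if yy == y)
    cond_mean[y] = m
    cond_var[y] = sum((x - m) ** 2 * p / py for (x, yy), p in joint.items() if yy == y)

# Direct moments of X from the joint pmf.
mean_x = sum(x * p for (x, _), p in joint.items())
var_x = sum((x - mean_x) ** 2 * p for (x, _), p in joint.items())

# Adam: average the conditional means. Eve: within-group + between-group variance.
adam = sum(cond_mean[y] * py for y, py in p_y.items())
within = sum(cond_var[y] * py for y, py in p_y.items())
between = sum((cond_mean[y] - adam) ** 2 * py for y, py in p_y.items())

print(mean_x, adam)            # 19/4 19/4
print(var_x, within, between)  # 111/16 9/4 75/16
```

Here 9/4 + 75/16 = 111/16, so the within-group and between-group pieces add up to the total variance.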
Why this matters in Data Science
Mean and variance summarise distributions. The bias-variance trade-off is stated in terms of these two moments. The conditional expectation $E[Y \mid X]$ is the optimal predictor of $Y$ from $X$ under squared loss, which is the foundation of regression.
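The optimal-predictor claim can be illustrated (not proved) on a toy joint pmf of a feature and a target. The numbers below are made up; the sketch compares the expected squared error of the conditional mean with two competitors: a constant prediction and a shifted conditional mean.

```python
from fractions import Fraction

# Hypothetical joint pmf of (feature x, target y); the numbers are made up.
joint = {
    (0, 1): Fraction(1, 4), (0, 3): Fraction(1, 4),
    (1, 2): Fraction(1, 6), (1, 5): Fraction(1, 3),
}

def mse(predict):
    """Expected squared error E[(Y - predict(X))^2]."""
    return sum((y - predict(x)) ** 2 * p for (x, y), p in joint.items())

def conditional_mean(x0):
    """E[Y | X = x0] from the joint pmf."""
    px = sum(p for (x, _), p in joint.items() if x == x0)
    return sum(y * p for (x, y), p in joint.items() if x == x0) / px

mean_y = sum(y * p for (_, y), p in joint.items())

mse_cond = mse(conditional_mean)   # predict E[Y | X = x]
mse_const = mse(lambda x: mean_y)  # ignore x, always predict E[Y]
mse_shifted = mse(lambda x: conditional_mean(x) + Fraction(1, 2))

print(mse_cond, mse_const, mse_shifted)  # 3/2 5/2 7/4
```

The conditional mean attains the smallest error of the three; its error equals the within-group variance $E[\mathrm{Var}(Y \mid X)]$.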
Mind Map
Visual structure of the concept
EXPECTATION & MOMENTS
├── E[X] center of mass
├── LOTUS: E[g(X)] without finding dist of g(X)
├── Linearity (always): E[aX + bY] = aE[X] + bE[Y]
├── Var(X) = E[X²] − (E[X])²
│ ├── Var(aX + b) = a² Var(X)
│ └── Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)
├── Cov(X, Y) = E[XY] − E[X]E[Y]
├── Conditional Expectation E[X|Y]
├── Total Expectation: E[X] = E[E[X|Y]]
└── Total Variance: Var(X) = E[Var(X|Y)] + Var(E[X|Y])
Exam Q&A
Part A (2 marks) and Part B (20 marks) style questions
Part A (2 marks each)
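Q1. State linearity of expectation. $E[aX + bY] = aE[X] + bE[Y]$, regardless of independence.
Q2. Define variance. $\mathrm{Var}(X) = E[(X - \mu)^2] = E[X^2] - (E[X])^2$, where $\mu = E[X]$.
Q3. Define covariance. $\mathrm{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]E[Y]$.
Q4. State the law of total expectation. $E[X] = E[E[X \mid Y]]$.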
Part B (20 marks)
Q. Derive the formula $\mathrm{Var}(X) = E[X^2] - (E[X])^2$. State and prove the linearity of expectation. State the laws of total expectation and total variance. Compute mean and variance of $X \sim \mathrm{Binomial}(n, p)$ using linearity.
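Variance identity. Since $\mu = E[X]$ is a constant: $\mathrm{Var}(X) = E[(X - \mu)^2] = E[X^2 - 2\mu X + \mu^2] = E[X^2] - 2\mu E[X] + \mu^2 = E[X^2] - \mu^2 = E[X^2] - (E[X])^2$.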
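Linearity of expectation. Theorem. For RVs $X, Y$ and scalars $a, b$: $E[aX + bY] = aE[X] + bE[Y]$.
Proof (continuous). With joint density $f(x, y)$: $E[aX + bY] = \iint (ax + by)\,f(x, y)\,dx\,dy = a \iint x\,f(x, y)\,dx\,dy + b \iint y\,f(x, y)\,dx\,dy = aE[X] + bE[Y]$.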
(Discrete case identical with sums.) No independence required. ∎
Total expectation. $E[X] = E[E[X \mid Y]]$.
Total variance. $\mathrm{Var}(X) = E[\mathrm{Var}(X \mid Y)] + \mathrm{Var}(E[X \mid Y])$. (Within-group variance + between-group variance.)
Binomial mean and variance via linearity.
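Write $X = X_1 + X_2 + \dots + X_n$ where $X_i \sim \mathrm{Bernoulli}(p)$ are independent.
Mean. $E[X_i] = p$. By linearity: $E[X] = \sum_{i=1}^{n} E[X_i] = np$.
Variance. $\mathrm{Var}(X_i) = E[X_i^2] - (E[X_i])^2 = p - p^2 = p(1 - p)$. By independence: $\mathrm{Var}(X) = \sum_{i=1}^{n} \mathrm{Var}(X_i) = np(1 - p)$.
So $X \sim \mathrm{Binomial}(n, p)$ has mean $np$ and variance $np(1 - p)$. ✓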
Sanity check. If $p = 0$ or $p = 1$, variance is 0 (deterministic). For fixed $n$, variance is maximised at $p = 1/2$. Intuitively: the most "random" Bernoulli is the fair coin.
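The binomial mean and variance can also be checked by brute force from the pmf, without using the Bernoulli decomposition. The parameters $n = 10$, $p = 3/10$ and the grid of $p$ values are assumptions for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """P(X = k) = C(n, k) p^k (1 - p)^(n - k) for k = 0..n."""
    return {k: comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)}

def mean_var(pmf):
    """Mean and variance computed directly from a pmf."""
    mu = sum(k * q for k, q in pmf.items())
    return mu, sum((k - mu) ** 2 * q for k, q in pmf.items())

n, p = 10, Fraction(3, 10)  # hypothetical parameters
mu, var = mean_var(binomial_pmf(n, p))
print(mu, var)              # 3 21/10

# Bernoulli variance p(1 - p) on a grid: zero at the endpoints, largest at p = 1/2.
grid = [Fraction(i, 10) for i in range(11)]
best_p = max(grid, key=lambda q: q * (1 - q))
print(best_p)               # 1/2
```

The enumerated values match $np = 3$ and $np(1 - p) = 21/10$.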