PGD01C01
Module 5 · Probability Theory

Multivariate Random Variables and Joint Distributions

Core Titles
Key headlines and terms for quick recall
  • Joint PMF/PDF $p(x, y)$, $f(x, y)$
  • Marginal distributions — sum / integrate out
  • Conditional distributions $f(x \mid y) = f(x, y) / f_Y(y)$
  • Independence $f(x, y) = f_X(x) f_Y(y)$
  • Covariance, Correlation
  • Multivariate Normal $\mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$
  • Box–Muller / Cholesky for generation
Basic Idea
What it is, why it matters, how it works

Joint distributions

For two RVs $X, Y$:

  • Discrete: joint PMF $p(x, y) = P(X = x, Y = y)$, with $\sum_{x, y} p(x, y) = 1$.
  • Continuous: joint PDF $f(x, y) \ge 0$, with $\iint f(x, y) \, dx \, dy = 1$.

Marginals

$f_X(x) = \int f(x, y) \, dy, \quad f_Y(y) = \int f(x, y) \, dx.$ Integrate out the other variable; for discrete RVs, sum it out instead.

Conditionals

$$f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)} \quad (f_Y(y) > 0).$$

Independence

$X$ and $Y$ are independent iff $f(x, y) = f_X(x) f_Y(y)$ for all $x, y$. Equivalently, $f_{X|Y}(x \mid y) = f_X(x)$ wherever $f_Y(y) > 0$.
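
The same three operations are easiest to see in the discrete case. The sketch below uses a made-up joint PMF table (the values and the helper name `conditional_x_given_y` are assumptions chosen for illustration, not part of the notes). It sums out a variable to get each marginal, divides by a marginal to get a conditional, and tests the product condition for independence.

```python
from fractions import Fraction as F

# Hypothetical joint PMF p(x, y) on x in {0, 1}, y in {0, 1, 2}; illustrative values only.
joint = {
    (0, 0): F(1, 10), (0, 1): F(2, 10), (0, 2): F(1, 10),
    (1, 0): F(2, 10), (1, 1): F(1, 10), (1, 2): F(3, 10),
}
assert sum(joint.values()) == 1  # a valid PMF sums to 1

xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})

# Marginals: sum out the other variable.
p_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}
p_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}


def conditional_x_given_y(y):
    """Return p(x | y) = p(x, y) / p_Y(y) as a dict over x; needs p_Y(y) > 0."""
    if p_y[y] == 0:
        raise ValueError("conditional is undefined when p_Y(y) = 0")
    return {x: joint[(x, y)] / p_y[y] for x in xs}


# Independence holds iff p(x, y) = p_X(x) * p_Y(y) for every pair (x, y).
independent = all(joint[(x, y)] == p_x[x] * p_y[y] for x in xs for y in ys)

print(p_x)                       # marginal of X
print(p_y)                       # marginal of Y
print(conditional_x_given_y(2))  # distribution of X given Y = 2
print(independent)               # False for this table
```

Exact fractions are used so that the equality test in the independence check is not disturbed by floating-point rounding.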

Covariance and correlation

$$\text{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - E[X] E[Y].$$
$$\text{Corr}(X, Y) = \rho = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}, \quad -1 \le \rho \le 1.$$

Independence ⇒ $\text{Cov} = 0$. The converse is false in general (true for jointly normal).
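
A standard counterexample for the converse can be checked by hand or in a few lines. The sketch below assumes $X$ is uniform on $\{-1, 0, 1\}$ and $Y = X^2$ (an illustrative choice, not taken from the notes). The covariance is exactly zero, yet $Y$ is a function of $X$, so the two are clearly dependent.

```python
from fractions import Fraction as F

# Illustrative assumption: X is uniform on {-1, 0, 1} and Y = X**2.
pmf_x = {-1: F(1, 3), 0: F(1, 3), 1: F(1, 3)}


def expect(g):
    """E[g(X)] for the discrete X defined above."""
    return sum(p * g(x) for x, p in pmf_x.items())


e_x = expect(lambda x: x)          # E[X]
e_y = expect(lambda x: x ** 2)     # E[Y], since Y = X^2
e_xy = expect(lambda x: x ** 3)    # E[XY] = E[X * X^2]

cov_xy = e_xy - e_x * e_y          # Cov(X, Y) = E[XY] - E[X]E[Y]

# Dependence: Y = 0 exactly when X = 0, so P(X=0, Y=0) = P(X=0) = P(Y=0) = 1/3.
p_joint_00 = pmf_x[0]
product_00 = pmf_x[0] * pmf_x[0]   # P(X=0) * P(Y=0)

print(cov_xy)                      # 0
print(p_joint_00, product_00)      # 1/3 versus 1/9, so the joint does not factorise
```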

Multivariate normal

A random vector $\mathbf{X} = (X_1, \dots, X_n)$ is multivariate normal with mean $\boldsymbol{\mu}$ and positive-definite covariance matrix $\boldsymbol{\Sigma}$ if its density is
$$f(\mathbf{x}) = \frac{1}{(2\pi)^{n/2} |\boldsymbol{\Sigma}|^{1/2}} \exp\!\left( -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right).$$
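
To make the formula concrete, here is a minimal sketch that evaluates it for $n = 2$ using only the standard library. The function name `bivariate_normal_pdf` and the explicit 2×2 inverse are choices made for this illustration; a real project would more likely rely on a numerical library.

```python
import math


def bivariate_normal_pdf(x, mu, sigma):
    """Density of N(mu, Sigma) at x for n = 2, written out from the general formula.

    x, mu: length-2 sequences; sigma: 2x2 nested list, assumed symmetric.
    """
    n = 2
    (a, b), (c, d) = sigma
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("Sigma must be positive definite")
    # Inverse of a 2x2 matrix: (1 / det) * [[d, -b], [-c, a]].
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mu[0], x[1] - mu[1]]
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu).
    quad = sum(dx[i] * inv[i][j] * dx[j] for i in range(n) for j in range(n))
    norm = (2 * math.pi) ** (n / 2) * math.sqrt(det)  # (2*pi)^{n/2} |Sigma|^{1/2}
    return math.exp(-0.5 * quad) / norm


# Standard bivariate normal evaluated at its mean: the density is 1 / (2*pi).
p0 = bivariate_normal_pdf([0.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(p0, 1 / (2 * math.pi))
```

With a diagonal $\boldsymbol{\Sigma}$ the quadratic form splits into two separate squares, so the density factorises into a product of two univariate normal densities. This is the jointly normal case in which zero covariance does imply independence.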

Generating multivariate samples

  • Box–Muller: two independent uniforms → two independent standard normals.
  • Cholesky for multivariate normal: if $\boldsymbol{\Sigma} = L L^T$ and $Z \sim \mathcal{N}(0, I)$, then $X = \boldsymbol{\mu} + L Z \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$.
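
The two steps can be chained: Box–Muller supplies the independent standard normals $Z$, and the Cholesky factor $L$ turns them into correlated draws, since $\text{Cov}(\boldsymbol{\mu} + LZ) = L I L^T = \boldsymbol{\Sigma}$. The sketch below does this for the 2×2 case with the standard library only. The function names, the seed, and the values of `mu` and `sigma` are assumptions made for illustration.

```python
import math
import random


def box_muller(rng):
    """Map two independent Uniform(0, 1) draws to two independent N(0, 1) draws."""
    u1 = 1.0 - rng.random()          # lies in (0, 1], so log(u1) is finite
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)


def cholesky_2x2(sigma):
    """Lower-triangular L with L L^T = Sigma, for a 2x2 positive-definite Sigma."""
    (a, b), (_, d) = sigma
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(d - l21 * l21)
    return [[l11, 0.0], [l21, l22]]


def sample_bivariate_normal(mu, sigma, n, seed=0):
    """Draw n samples X = mu + L Z with Z ~ N(0, I) generated by Box-Muller."""
    rng = random.Random(seed)
    L = cholesky_2x2(sigma)
    out = []
    for _ in range(n):
        z1, z2 = box_muller(rng)
        out.append((mu[0] + L[0][0] * z1,
                    mu[1] + L[1][0] * z1 + L[1][1] * z2))
    return out


# Illustrative parameters (assumptions, not from the notes).
mu = [1.0, -2.0]
sigma = [[4.0, 1.2], [1.2, 1.0]]
samples = sample_bivariate_normal(mu, sigma, n=50_000)

# Sample mean and sample covariance should land close to mu and Sigma[0][1].
mean_x = sum(x for x, _ in samples) / len(samples)
mean_y = sum(y for _, y in samples) / len(samples)
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in samples) / (len(samples) - 1)
print(round(mean_x, 2), round(mean_y, 2), round(cov_xy, 2))
```

For these parameters the factor is $L = \begin{pmatrix} 2 & 0 \\ 0.6 & 0.8 \end{pmatrix}$, which can be checked by multiplying out $L L^T$.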

Why this matters in Data Science

Most real datasets are multivariate. Covariance and correlation underlie PCA, linear regression, Gaussian mixture models, and many classifiers.

Mind Map
Visual structure of the concept
MULTIVARIATE RVs
├── Joint f(x, y) or p(x, y)
├── Marginal: integrate / sum out
├── Conditional: f(x|y) = f(x,y)/f_Y(y)
├── Independence: f = f_X · f_Y
├── Covariance
│   ├── Cov(X,Y) = E[XY] − E[X]E[Y]
│   └── Indep ⇒ Cov=0  (converse not always)
├── Correlation ρ ∈ [−1, 1]
├── Multivariate Normal 𝒩(μ, Σ)
└── Generation
    ├── Box-Muller → standard normals
    └── Cholesky Σ = LLᵀ ⇒ X = μ + LZ
Exam Q&A
Part A (2 marks) and Part B (20 marks) style questions

Part A (2 marks each)

Q1. Define joint PDF. A function $f(x, y) \ge 0$ with $\iint f(x, y) \, dx \, dy = 1$, giving probabilities of regions via integration.

Q2. How is the marginal density $f_X$ obtained? By integrating out the other variable: $f_X(x) = \int f(x, y) \, dy$.

Q3. State the independence condition for two continuous RVs. $f(x, y) = f_X(x) f_Y(y)$ for all $x, y$.

Q4. Define covariance. $\text{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]E[Y]$.


Part B (20 marks)

Q. Define joint, marginal and conditional distributions for two continuous random variables. State and discuss independence and covariance. For the joint PDF $f(x, y) = 6xy$ on $0 \le x \le 1$, $0 \le y \le 1$, $x + y \le 1$, find the marginals and check whether $X, Y$ are independent.

Definitions.

Joint PDF: $f(x, y) \ge 0$ with $\iint f = 1$. For a region $A$: $P((X, Y) \in A) = \iint_A f(x, y) \, dx \, dy$.

Marginals: $f_X(x) = \int f(x, y) \, dy$, $f_Y(y) = \int f(x, y) \, dx$.

Conditional: $f_{X|Y}(x \mid y) = \dfrac{f(x, y)}{f_Y(y)}$ when $f_Y(y) > 0$.

Independence. $X \perp Y$ iff $f(x, y) = f_X(x) f_Y(y)$ for all $x, y$. Equivalently $f_{X|Y}(x \mid y) = f_X(x)$: conditioning provides no extra information.

Covariance. Measures linear co-movement: $\text{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]E[Y].$

  • $\text{Cov} > 0$: tend to increase together.
  • $\text{Cov} < 0$: one tends to increase when the other decreases.
  • Independence ⇒ Cov = 0 (converse false in general).

Standardised version is correlation: $\rho = \dfrac{\text{Cov}}{\sigma_X \sigma_Y} \in [-1, 1]$.

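Worked problem. The integration domain is the triangle $D = \{(x, y) : 0 \le x, 0 \le y, x + y \le 1\}$.

Marginals. For fixed $x$, $y$ ranges over $[0, 1 - x]$: $f_X(x) = \int_0^{1-x} 6xy \, dy = 6x \cdot \frac{(1-x)^2}{2} = 3x(1-x)^2, \quad 0 \le x \le 1.$ By symmetry: $f_Y(y) = 3y(1-y)^2$, $0 \le y \le 1$.

Normalisation check. $\int_0^1 3x(1-x)^2 \, dx = \tfrac{1}{4}$, so $6xy$ as stated integrates to $\tfrac{1}{4}$ over $D$ rather than 1. The constant that makes it a valid density on $D$ is 24, giving $f(x, y) = 24xy$ with marginals $f_X(x) = 12x(1-x)^2$ and $f_Y(y) = 12y(1-y)^2$. The independence conclusion below is the same for either constant.

Independence check. $f_X(x) \cdot f_Y(y) = 9xy(1-x)^2(1-y)^2 \ne 6xy$ in general.

So $f(x, y) \ne f_X(x) f_Y(y)$ ⇒ $X$ and $Y$ are not independent.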

(This is also evident from the support: the constraint $x + y \le 1$ couples $X$ and $Y$, and independence would require a rectangular support.)
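
The hand calculation can be cross-checked numerically. The sketch below integrates $6xy$ over the triangle with a plain midpoint rule (the helper names and the test point $(0.25, 0.5)$ are assumptions made for illustration). It compares the numerical marginal with $3x(1-x)^2$, compares $f(x, y)$ with the product of the marginals, and reports the total mass of $6xy$ over the triangle.

```python
def f(x, y):
    """6xy on the triangle x >= 0, y >= 0, x + y <= 1, and 0 elsewhere (as in the question)."""
    inside = x >= 0 and y >= 0 and x + y <= 1
    return 6.0 * x * y if inside else 0.0


def midpoint_integral(g, a, b, n=400):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def marginal_x(x):
    """f_X(x): integrate f(x, y) over 0 <= y <= 1 - x."""
    return midpoint_integral(lambda y: f(x, y), 0.0, 1.0 - x)


def marginal_y(y):
    """f_Y(y): integrate f(x, y) over 0 <= x <= 1 - y."""
    return midpoint_integral(lambda x: f(x, y), 0.0, 1.0 - y)


x0, y0 = 0.25, 0.5                       # an arbitrary interior point of the triangle
fx, fy = marginal_x(x0), marginal_y(y0)
total_mass = midpoint_integral(marginal_x, 0.0, 1.0)

print(fx, 3 * x0 * (1 - x0) ** 2)        # numerical marginal vs closed form
print(f(x0, y0), fx * fy)                # joint vs product of marginals
print(total_mass)                        # integral of 6xy over the triangle
```

The joint value and the product of marginals differ clearly at the test point, in line with the conclusion above. The total mass comes out close to 0.25 rather than 1, so a properly normalised density of this shape would need the constant 24. The independence conclusion does not depend on that constant.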