PGDDSA Study · Semester 1

Core Titles

Key headlines and terms for quick recall

Target density $f(x)$ , Proposal density $g(x)$
Envelope constant $M$ with $M g(x) \ge f(x)$ for all $x$
Accept with probability $f(x)/(M g(x))$
Acceptance rate $= 1/M$
Pros / Cons vs inverse transform, MCMC

Basic Idea

What it is, why it matters, how it works

Motivation

Inverse-transform sampling needs $F^{-1}$ in closed form, which is often unavailable. Rejection sampling lets us sample from $f$ as long as we can evaluate it pointwise (up to a normalising constant), sample from a proposal $g$, and bound $f$ by $M g$.

Algorithm

Given:

target density $f(x)$ we want to sample from,
a proposal density $g(x)$ we can sample from,
a constant $M$ with $M g(x) \ge f(x)$ for all $x$ .

Steps.

1. Sample $X \sim g$.
2. Sample $U \sim \text{Uniform}(0, 1)$.
3. If $U \le \dfrac{f(X)}{M g(X)}$, accept $X$. Else reject and go to step 1.
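The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `rejection_sample`, its arguments, and the example target $f(x) = 6x(1-x)$ on $[0, 1]$ (with uniform proposal and $M = 1.5$) are assumptions chosen for the demo and are not part of the notes.

```python
import random

def rejection_sample(f, g_pdf, g_draw, M, rng):
    """Return one draw from density f, using proposal g and envelope constant M."""
    while True:
        x = g_draw(rng)                     # step 1: X ~ g
        u = rng.random()                    # step 2: U ~ Uniform(0, 1)
        if u <= f(x) / (M * g_pdf(x)):      # step 3: accept, else loop again
            return x

# Illustrative (assumed) target: f(x) = 6x(1-x) on [0, 1], whose maximum is 1.5.
def f(x):
    return 6 * x * (1 - x)

def uniform_pdf(x):
    return 1.0

def uniform_draw(rng):
    return rng.random()

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rejection_sample(f, uniform_pdf, uniform_draw, 1.5, rng)
           for _ in range(10000)]
print(sum(samples) / len(samples))  # sample mean; the target's true mean is 0.5
```

Passing the proposal as a pair (density, sampler) keeps the routine generic: any $g$ works as long as $M g(x) \ge f(x)$ holds on the support.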

Why it works

The accepted samples have the target distribution $f$ . Proof: $P(\text{accept and } X \le x) = \int_{-\infty}^x g(y) \cdot \dfrac{f(y)}{M g(y)} \, dy = \dfrac{1}{M} \int_{-\infty}^x f(y) \, dy = \dfrac{F(x)}{M}$ .

Marginal: $P(\text{accept}) = 1/M$ .

Conditional CDF of accepted samples: $P(X \le x | \text{accept}) = \dfrac{F(x)/M}{1/M} = F(x)$ .

So accepted samples follow $f$ .
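The proof can also be checked empirically. The sketch below compares the empirical CDF of accepted samples with the exact $F$ at a few points. The target $f(x) = 6x(1-x)$ on $[0, 1]$, with $F(x) = 3x^2 - 2x^3$, uniform proposal and $M = 1.5$, is an assumed example, not one taken from the notes.

```python
import random

rng = random.Random(1)  # fixed seed for repeatability
M = 1.5                 # sup of f over [0, 1], so M * g(x) >= f(x) with g = 1

def f(x):
    return 6 * x * (1 - x)          # assumed target density on [0, 1]

def F(x):
    return 3 * x**2 - 2 * x**3      # its exact CDF

accepted = []
while len(accepted) < 20000:
    x = rng.random()                # X ~ Uniform(0, 1)
    u = rng.random()                # U ~ Uniform(0, 1)
    if u <= f(x) / M:
        accepted.append(x)

gaps = []
for q in (0.25, 0.5, 0.75):
    empirical = sum(a <= q for a in accepted) / len(accepted)
    gaps.append(abs(empirical - F(q)))
    print(q, round(empirical, 3), round(F(q), 3))

max_gap = max(gaps)  # should be small if accepted samples follow F
```

A small `max_gap` is consistent with the result $P(X \le x \mid \text{accept}) = F(x)$; it is a sanity check, not a substitute for the proof.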

Acceptance rate

$P(\text{accept}) = \frac{1}{M}$. The number of proposals needed per accepted sample is geometric with mean $M$, so we want $M$ as small as possible (i.e., $g$ should "hug" $f$ tightly). Bad choice of $g$ ⇒ large $M$ ⇒ many rejections ⇒ slow.
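To see the $1/M$ rate in practice, the sketch below runs the accept test with the same target but increasingly loose envelope constants. The helper name `acceptance_rate` and the target $f(x) = 6x(1-x)$ on $[0, 1]$ with a uniform proposal are assumptions made for illustration.

```python
import random

def acceptance_rate(M, n_proposals=50000, seed=2):
    """Fraction of proposals accepted for the assumed target with constant M."""
    rng = random.Random(seed)
    accepted = 0
    for _ in range(n_proposals):
        x = rng.random()                     # X ~ Uniform(0, 1), so g(x) = 1
        fx = 6 * x * (1 - x)                 # assumed target; max is 1.5
        if rng.random() <= fx / M:           # valid for any M >= 1.5
            accepted += 1
    return accepted / n_proposals

for M in (1.5, 3.0, 6.0):
    # Compare the simulated rate with the theoretical 1/M.
    print(M, acceptance_rate(M), 1 / M)
```

Doubling $M$ roughly halves the fraction of proposals kept, which is why a loose envelope wastes work even though it still gives correct samples.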

Practical tip

Choose $g$ with a shape similar to $f$ and tails at least as heavy as those of $f$ (so that $f/g$ stays bounded), but easy to sample (e.g., normal, exponential, uniform).
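Once a proposal is chosen, $M$ is the supremum of $f(x)/g(x)$. The sketch below works through one assumed example that is not in the notes: a half-normal target $f(x) = \sqrt{2/\pi}\, e^{-x^2/2}$ on $x \ge 0$ with an Exponential(1) proposal $g(x) = e^{-x}$. Here $f/g$ peaks at $x = 1$, giving $M = \sqrt{2e/\pi}$. The grid search is only a rough numerical check; a grid maximum can underestimate the true supremum, so in practice $M$ should be derived analytically or padded upward.

```python
import math

def f(x):
    """Assumed target: half-normal density on x >= 0."""
    return math.sqrt(2 / math.pi) * math.exp(-x * x / 2)

def g(x):
    """Assumed proposal: Exponential(1) density on x >= 0."""
    return math.exp(-x)

# Crude numerical estimate of sup f/g on a grid over [0, 10].
grid = [i / 1000 for i in range(0, 10001)]
M_grid = max(f(x) / g(x) for x in grid)

# Closed form: f/g = sqrt(2/pi) * exp(x - x^2/2), maximised at x = 1.
M_exact = math.sqrt(2 * math.e / math.pi)

print(M_grid, M_exact, 1 / M_exact)  # constant (two ways) and acceptance rate
```

The exponential works here because its tail is heavier than the normal's, so the ratio $f/g$ falls to zero instead of blowing up.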

Why this matters in Data Science

A basic building block of Monte Carlo methods. Used when the target density is awkward to sample directly but a proposal is easy: Bayesian inference (when MCMC is overkill), simulation, generative modelling.

Mind Map

Visual structure of the concept

REJECTION SAMPLING
├── Target f, Proposal g, M  (Mg ≥ f)
├── Algorithm
│   ├── Draw X ~ g
│   ├── Draw U ~ Uniform(0,1)
│   ├── Accept if U ≤ f(X)/(M g(X))
│   └── Else reject, repeat
├── Accepted ~ f  (proof via CDF)
├── Acceptance rate = 1/M
└── Goal: small M ⇒ g hugs f tightly

Exam Q&A

Part A (2 marks) and Part B (20 marks) style questions

Part A (2 marks each)

Q1. Why rejection sampling? When the target CDF cannot be inverted in closed form but the density $f$ can be evaluated.

Q2. State the rejection condition. Accept $X \sim g$ if $U \le \dfrac{f(X)}{M g(X)}$ , where $U \sim \text{Uniform}(0, 1)$ .

Q3. What is the acceptance probability of rejection sampling? $1/M$ .

Q4. What governs efficiency of rejection sampling? The tightness of the envelope $M g$ over $f$ ; smaller $M$ is more efficient.

Part B (20 marks)

Q. Describe the rejection sampling algorithm. Prove that the accepted samples follow the target distribution. Use rejection sampling to sample from the standard normal truncated to $[0, 1]$ using a uniform proposal.

Algorithm.

Target density $f(x)$ , support $S$ .
Proposal density $g(x)$ on $S$ from which we can sample.
Constant $M \ge \sup_x f(x)/g(x)$ so that $M g(x) \ge f(x)$ for all $x$ .

loop:
    X ~ g
    U ~ Uniform(0, 1)
    if U ≤ f(X) / (M g(X)):
        return X

Proof. Let $A$ denote the event "accept on this iteration".

$P(A, X \le x) = \int_{-\infty}^x g(y) \cdot P(U \le f(y)/(M g(y))) \, dy = \int_{-\infty}^x g(y) \cdot \frac{f(y)}{M g(y)} \, dy = \frac{1}{M} \int_{-\infty}^x f(y) \, dy = \frac{F(x)}{M}$ .

Marginal acceptance: $P(A) = F(\infty)/M = 1/M$ .

Conditional on acceptance: $P(X \le x \mid A) = \frac{F(x)/M}{1/M} = F(x).$

So the accepted $X$ has CDF $F$, i.e., density $f$. ∎

Truncated normal on $[0, 1]$ .

The truncated standard normal density on $[0,1]$ is $f(x) = \frac{\phi(x)}{\Phi(1) - \Phi(0)} \approx \frac{\phi(x)}{0.3413}, \quad 0 \le x \le 1,$ where $\phi(x) = (1/\sqrt{2\pi}) e^{-x^2/2}$ and $\Phi(1) - \Phi(0) \approx 0.8413 - 0.5 = 0.3413$.

Proposal: $g(x) = 1$ on $[0, 1]$ (uniform).

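Envelope. Since $g = 1$, the ratio $f/g$ equals $f$, and $\phi$ is decreasing on $[0, 1]$, so $f(x)$ is maximised at $x = 0$: $f(0) = \dfrac{1/\sqrt{2\pi}}{0.3413} \approx 1.169$.

So take $M = 1.17$, rounded up so that $M g(x) \ge f(x)$ still holds everywhere.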

Algorithm.

1. $X \sim \text{Uniform}(0, 1)$.
2. $U \sim \text{Uniform}(0, 1)$.
3. Accept $X$ if $U \le f(X) / 1.17$; otherwise go back to step 1.

Acceptance rate $\approx 1/1.17 \approx 85\%$, which is efficient. A poorly chosen $g$ (e.g., a wide normal that puts most of its mass outside $[0, 1]$) would give a much larger $M$ and many rejections.
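The worked answer can be checked with a short simulation. This is a sketch under stated assumptions: the function name `sample_truncated_normal` is hypothetical, and the normalising constant is computed from `math.erf` (using $\Phi(x) = \tfrac12(1 + \operatorname{erf}(x/\sqrt{2}))$) instead of the rounded value 0.3413.

```python
import math
import random

Z = 0.5 * math.erf(1 / math.sqrt(2))   # Phi(1) - Phi(0), about 0.3413
M = 1.17                               # envelope constant from the answer above

def f(x):
    """Standard normal density truncated to [0, 1]."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi) / Z

def sample_truncated_normal(rng):
    """Return (accepted sample, number of proposals used)."""
    proposals = 0
    while True:
        proposals += 1
        x = rng.random()               # X ~ Uniform(0, 1), g(x) = 1
        if rng.random() <= f(x) / M:   # U <= f(X) / (M g(X))
            return x, proposals

rng = random.Random(3)  # fixed seed for repeatability
draws = [sample_truncated_normal(rng) for _ in range(20000)]
xs = [x for x, _ in draws]

rate = len(draws) / sum(n for _, n in draws)   # accepted / total proposals
mean = sum(xs) / len(xs)
exact_mean = (1 - math.exp(-0.5)) / math.sqrt(2 * math.pi) / Z  # (phi(0)-phi(1))/Z

print(rate, 1 / M)        # simulated vs theoretical acceptance rate
print(mean, exact_mean)   # sample mean vs exact truncated-normal mean
```

The simulated acceptance rate should sit close to $1/1.17$, and the sample mean close to $(\phi(0) - \phi(1)) / (\Phi(1) - \Phi(0))$.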

Rejection Sampling