PGD01C01
Module 5 · Probability Theory

Rejection Sampling

Core Titles
Key headlines and terms for quick recall
  • Target density f(x)f(x), Proposal density g(x)g(x)
  • Envelope constant MM with Mg(x)f(x)M g(x) \ge f(x) for all xx
  • Accept with probability f(x)/(Mg(x))f(x)/(M g(x))
  • Acceptance rate =1/M= 1/M
  • Pros / Cons vs inverse transform, MCMC
Basic Idea
What it is, why it matters, how it works

Motivation

Inverse-transform sampling needs F1F^{-1} in closed form — often impossible. Rejection sampling lets us sample from ff as long as we can evaluate it (up to a constant) and find a good proposal gg that we can sample from.

Algorithm

Given:

  • target density f(x)f(x) we want to sample from,
  • a proposal density g(x)g(x) we can sample from,
  • a constant MM with Mg(x)f(x)M g(x) \ge f(x) for all xx.

Steps.

  1. Sample XgX \sim g.
  2. Sample UUniform(0,1)U \sim \text{Uniform}(0, 1).
  3. If Uf(X)Mg(X)U \le \dfrac{f(X)}{M g(X)}, accept XX. Else reject and go to step 1.

Why it works

The accepted samples have the target distribution ff. Proof: P(accept and Xx)=xg(y)f(y)Mg(y)dy=1Mxf(y)dy=F(x)MP(\text{accept and } X \le x) = \int_{-\infty}^x g(y) \cdot \dfrac{f(y)}{M g(y)} \, dy = \dfrac{1}{M} \int_{-\infty}^x f(y) \, dy = \dfrac{F(x)}{M}.

Marginal: P(accept)=1/MP(\text{accept}) = 1/M.

Conditional CDF of accepted samples: P(Xxaccept)=F(x)/M1/M=F(x)P(X \le x | \text{accept}) = \dfrac{F(x)/M}{1/M} = F(x).

So accepted samples follow ff.

Acceptance rate

P(accept)=1MP(\text{accept}) = \frac{1}{M} We want MM as small as possible (i.e., gg should "hug" ff tightly). Bad choice of gg ⇒ many rejections ⇒ slow.

Practical tip

Choose gg in the same family as ff but easy to sample (e.g., normal, exponential, uniform).

Why this matters in Data Science

Backbone of Monte Carlo methods. Used when target density is awkward but proposals are easy: Bayesian inference (when MCMC is overkill), simulation, generative modelling.

Mind Map
Visual structure of the concept
REJECTION SAMPLING
├── Target f, Proposal g, M  (Mg ≥ f)
├── Algorithm
│   ├── Draw X ~ g
│   ├── Draw U ~ Uniform(0,1)
│   ├── Accept if U ≤ f(X)/(M g(X))
│   └── Else reject, repeat
├── Accepted ~ f  (proof via CDF)
├── Acceptance rate = 1/M
└── Goal: small M ⇒ g hugs f tightly
Exam Q&A
Part A (2 marks) and Part B (20 marks) style questions

Part A (2 marks each)

Q1. Why rejection sampling? When the target CDF cannot be inverted in closed form but the density ff can be evaluated.

Q2. State the rejection condition. Accept XgX \sim g if Uf(X)Mg(X)U \le \dfrac{f(X)}{M g(X)}, where UUniform(0,1)U \sim \text{Uniform}(0, 1).

Q3. What is the acceptance probability of rejection sampling? 1/M1/M.

Q4. What governs efficiency of rejection sampling? The tightness of the envelope MgM g over ff; smaller MM is more efficient.


Part B (20 marks)

Q. Describe the rejection sampling algorithm. Prove that the accepted samples follow the target distribution. Use rejection sampling to sample from the truncated normal on [0,1][0, 1] using a uniform proposal.

Algorithm.

  • Target density f(x)f(x), support SS.
  • Proposal density g(x)g(x) on SS from which we can sample.
  • Constant Msupxf(x)/g(x)M \ge \sup_x f(x)/g(x) so that Mg(x)f(x)M g(x) \ge f(x) for all xx.
loop:
    X ~ g
    U ~ Uniform(0, 1)
    if U ≤ f(X) / (M g(X)):
        return X

Proof. Let AA denote the event "accept on this iteration".

P(A,Xx)=xg(y)P(Uf(y)/(Mg(y)))dy=xg(y)f(y)Mg(y)dy=1Mxf(y)dy=F(x)MP(A, X \le x) = \int_{-\infty}^x g(y) \cdot P(U \le f(y)/(M g(y))) \, dy = \int_{-\infty}^x g(y) \cdot \frac{f(y)}{M g(y)} \, dy = \frac{1}{M} \int_{-\infty}^x f(y) \, dy = \frac{F(x)}{M}.

Marginal acceptance: P(A)=F()/M=1/MP(A) = F(\infty)/M = 1/M.

Conditional on acceptance: P(XxA)=F(x)/M1/M=F(x).P(X \le x \mid A) = \frac{F(x)/M}{1/M} = F(x).

So the accepted XX has CDF FF — i.e., distribution ff. ∎

Truncated normal on [0,1][0, 1].

The truncated standard normal density on [0,1][0,1] is f(x)=ϕ(x)Φ(1)Φ(0)=ϕ(x)0.3413,0x1,f(x) = \frac{\phi(x)}{\Phi(1) - \Phi(0)} = \frac{\phi(x)}{0.3413}, \quad 0 \le x \le 1, where ϕ(x)=(1/2π)ex2/2\phi(x) = (1/\sqrt{2\pi}) e^{-x^2/2}.

Proposal: g(x)=1g(x) = 1 on [0,1][0, 1] (uniform).

Envelope. f(x)f(x) is maximised at x=0x = 0: f(0)=1/2π0.34131.169f(0) = \dfrac{1/\sqrt{2\pi}}{0.3413} \approx 1.169.

So take M=1.17M = 1.17.

Algorithm.

  1. XUniform(0,1)X \sim \text{Uniform}(0, 1).
  2. UUniform(0,1)U \sim \text{Uniform}(0, 1).
  3. Accept XX if Uf(X)/1.17U \le f(X) / 1.17.

Acceptance rate 1/1.1785%\approx 1/1.17 \approx 85\% — efficient. A poorly-chosen gg (e.g., wide normal) would give a much higher MM and many rejections.