Matrix Factorizations
Core Titles
Key headlines and terms for quick recall- LU decomposition —
- QR decomposition — , orthogonal, upper triangular
- Cholesky — for symmetric positive-definite
- Eigendecomposition —
- Singular Value Decomposition (SVD) —
Basic Idea
What it is, why it matters, how it worksWhy factorize?
A factorization writes a matrix as a product of simpler matrices. This makes solving systems, inverting, and computing eigenvalues much faster and more numerically stable.
LU decomposition
lower triangular with unit diagonal, upper triangular. Used to solve in two triangular sweeps: , then . Equivalent to Gaussian elimination. Works for any non-singular square matrix (with row pivoting if needed: ).
QR decomposition
has orthonormal columns (), is upper triangular. Computed by Gram–Schmidt or Householder reflections. Used in:
- Least squares:
- QR algorithm for eigenvalues
Cholesky decomposition
For symmetric positive-definite : lower triangular with positive diagonal. Twice as fast as LU. Used in covariance matrices, Gaussian processes, simulating multivariate normals.
Eigendecomposition
has eigenvectors, diagonal of eigenvalues. Requires to have a full set of linearly independent eigenvectors. For symmetric : , orthogonal.
Singular Value Decomposition (SVD)
where for :
- is orthogonal (left singular vectors)
- is diagonal of non-negative singular values
- is orthogonal (right singular vectors)
SVD always exists. Singular values are .
Why this matters in Data Science
- PCA = eigendecomposition of covariance, or equivalently SVD of centred data
- Latent Semantic Analysis uses truncated SVD of term-document matrices
- Recommender systems factor the rating matrix
- Image compression by keeping only the top singular values
- Solving least squares via QR
Mind Map
Visual structure of the conceptMATRIX FACTORIZATIONS
├── LU: A = LU (Gauss elim, solve Ax = b)
├── QR: A = QR (least squares, eigenvalues)
├── Cholesky: A = LLᵀ (symmetric pos-def)
├── Eigen: A = PDP⁻¹ (eigenvalues / vectors)
│ └── If A=Aᵀ ⇒ A = QΛQᵀ
└── SVD: A = UΣVᵀ
├── Always exists
├── σᵢ = √(eig(AᵀA))
└── Truncated SVD = low-rank approximation
Exam Q&A
Part A (2 marks) and Part B (20 marks) style questionsPart A (2 marks each)
Q1. Write the LU decomposition idea. where is lower triangular and upper triangular; used to solve via two triangular systems.
Q2. What is a QR decomposition? where has orthonormal columns and is upper triangular.
Q3. When is Cholesky factorization possible? When is symmetric and positive-definite. Then .
Q4. State the SVD. Any matrix can be written as with orthogonal and diagonal of non-negative singular values.
Part B (20 marks)
Q. Explain Singular Value Decomposition (SVD). State its existence theorem and discuss how it generalises eigendecomposition. Mention its applications in data science.
Definition. For any matrix , the SVD is where:
- is orthogonal (). Its columns are left singular vectors.
- is orthogonal. Its columns are right singular vectors.
- is with on the diagonal (the singular values) and zeros elsewhere. = rank of .
Existence. SVD exists for every real matrix (no assumptions of squareness or invertibility). Singular values are square roots of eigenvalues of (or ); the right singular vectors are the eigenvectors of and left singular vectors are eigenvectors of .
Generalisation of eigendecomposition.
- Eigendecomposition requires to be square and have a full set of eigenvectors. Many matrices don't.
- SVD applies to any matrix.
- For a symmetric positive-definite , SVD and eigendecomposition coincide: , .
Low-rank approximation (Eckart–Young). Keep the top singular triplets: is the best rank- approximation to in Frobenius and spectral norms.
Applications in Data Science.
- Principal Component Analysis (PCA) — SVD of centred data; right singular vectors are PCs, singular values give variances.
- Latent Semantic Analysis (LSA) — truncated SVD of term-document matrices to find latent topics.
- Recommender systems — matrix-factorise the user–item rating matrix.
- Image compression — keep top singular triplets; reconstruct an approximate image with far fewer numbers.
- Pseudo-inverse — , used to solve overdetermined least-squares problems.
- Noise reduction — small singular values often correspond to noise; dropping them denoises.
SVD is sometimes called the most important factorization in numerical linear algebra.