PGD01C02
Module 5 · Data Structures and Algorithm Design

Algorithm Analysis: Time and Space Complexity

Core Titles
Key headlines and terms for quick recall
  • Algorithm analysis — predict resource use without running
  • Time complexity — # of steps as function of input nn
  • Space complexity — memory used
  • Asymptotic notation: OO (upper), Ω\Omega (lower), Θ\Theta (tight)
  • Common classes: O(1)<O(logn)<O(n)<O(nlogn)<O(n2)<O(2n)<O(n!)O(1) < O(\log n) < O(n) < O(n \log n) < O(n^2) < O(2^n) < O(n!)
  • Best, Average, Worst cases
  • Amortised analysis
  • Master theorem for divide-and-conquer recurrences
Basic Idea
What it is, why it matters, how it works

Why analyse?

Two algorithms solving the same problem can differ by a factor of millions on real data. Algorithm analysis predicts resource use mathematically, without needing to run on every input.

We care about:

  • Time — how many basic operations.
  • Space — how much memory.

Asymptotic notation

We describe growth as nn \to \infty, ignoring constants and lower-order terms.

  • Big-O O(f(n))O(f(n)) — upper bound. T(n)cf(n)T(n) \le c \cdot f(n) for large nn.
  • Big-Omega Ω(f(n))\Omega(f(n)) — lower bound.
  • Big-Theta Θ(f(n))\Theta(f(n)) — tight bound (upper + lower).

We usually quote worst-case OO in interviews and texts.

Common growth classes

NotationNameExample
O(1)O(1)Constantarray index
O(logn)O(\log n)Logarithmicbinary search
O(n)O(n)Linearone pass
O(nlogn)O(n \log n)Linearithmicmerge sort, FFT
O(n2)O(n^2)Quadraticbubble sort
O(n3)O(n^3)Cubicnaive matrix multiplication
O(2n)O(2^n)Exponentialbrute-force subset enumeration
O(n!)O(n!)Factorialbrute-force permutations / TSP

For n=106n = 10^6: O(n2)O(n^2) is 101210^{12} ops — too slow for one second. O(nlogn)O(n \log n) is 2×107\sim 2 \times 10^7 — fine.

Three case analyses

  • Best case — luckiest input (rarely useful).
  • Average case — expected behaviour on random input.
  • Worst case — guaranteed upper bound (most useful for safety).

For quicksort: best O(nlogn)O(n \log n), average O(nlogn)O(n \log n), worst O(n2)O(n^2) (already-sorted).

Amortised analysis

Some operations are sometimes expensive but on average cheap. Amortised cost averages over a sequence:

  • Dynamic array (Python list) — append is O(1)O(1) amortised because doubling-and-copying happens rarely.

How to analyse

  1. Identify the "basic operation" (comparison, arithmetic).
  2. Count how many times it runs as a function of input size.
  3. Drop constants and lower-order terms.
def linear_search(arr, x):
    for i in range(len(arr)):       # n iterations
        if arr[i] == x:             # 1 op
            return i
    return -1
# Worst case T(n) = n → O(n)

Recurrences and the Master Theorem

For divide-and-conquer T(n)=aT(n/b)+f(n)T(n) = a T(n/b) + f(n):

  • If f(n)=O(nlogbaε)f(n) = O(n^{\log_b a - \varepsilon})T(n)=Θ(nlogba)T(n) = \Theta(n^{\log_b a}).
  • If f(n)=Θ(nlogba)f(n) = \Theta(n^{\log_b a})T(n)=Θ(nlogbalogn)T(n) = \Theta(n^{\log_b a} \log n).
  • If f(n)=Ω(nlogba+ε)f(n) = \Omega(n^{\log_b a + \varepsilon}) (and regularity) → T(n)=Θ(f(n))T(n) = \Theta(f(n)).

Merge sort: T(n)=2T(n/2)+O(n)T(n)=O(nlogn)T(n) = 2 T(n/2) + O(n) \Rightarrow T(n) = O(n \log n).

Space complexity

Sometimes the bottleneck. Recursive algorithms use stack space; sorting in-place vs out-of-place differs.

Why it matters in Data Science

  • Knowing O(n2)O(n^2) matrix ops scale poorly tells you to switch to mini-batches or specialised libraries.
  • ML training and inference latency depends on this analysis.
  • Choosing the right data structure (hash table vs list) flips O(n)O(n) to O(1)O(1).
Mind Map
Visual structure of the concept
ALGORITHM ANALYSIS
├── Time vs Space complexity
├── Asymptotic notation
│   ├── O — upper bound
│   ├── Ω — lower bound
│   └── Θ — tight
├── Growth classes
│   └── O(1) < log n < n < n log n < n² < 2ⁿ < n!
├── Three cases
│   ├── Best
│   ├── Average
│   └── Worst
├── Amortised analysis (e.g., dynamic array append)
├── Master theorem for T(n) = aT(n/b) + f(n)
└── Practical impact (n = 10⁶, op limits)
Exam Q&A
Part A (2 marks) and Part B (20 marks) style questions

Part A (2 marks each)

Q1. Define time and space complexity.

  • Time complexity — number of basic operations an algorithm performs as a function of input size nn.
  • Space complexity — amount of memory required as a function of nn, including input, auxiliary and output.

Q2. What is Big-O notation? A mathematical notation that describes the asymptotic upper bound on an algorithm's growth: T(n)=O(f(n))T(n) = O(f(n)) iff T(n)cf(n)T(n) \le c \cdot f(n) for some constant cc and all nn0n \ge n_0.

Q3. Order the following from slowest to fastest: O(n)O(n), O(logn)O(\log n), O(n2)O(n^2), O(nlogn)O(n \log n), O(1)O(1). O(1)<O(logn)<O(n)<O(nlogn)<O(n2)O(1) < O(\log n) < O(n) < O(n \log n) < O(n^2).


Part B (20 marks)

Q. Explain algorithm analysis. Discuss time and space complexity, asymptotic notations and common complexity classes. Why is the analysis of algorithms important?

Why analyse?

Two algorithms solving the same problem can differ by a factor of millions on real input. Empirical timing is brittle — performance changes with machine, compiler, data. Asymptotic analysis predicts growth mathematically and lets us reason about scalability.

Time complexity — number of basic operations as a function of input size nn. Space complexity — memory required as a function of nn.

Asymptotic notation.

NotationMeaningUse
O(f)O(f)T(n)cf(n)T(n) \le c f(n) — upper boundWorst-case guarantees
Ω(f)\Omega(f)T(n)cf(n)T(n) \ge c f(n) — lower boundBest-case / lower-bound proofs
Θ(f)\Theta(f)Both — tight boundExact growth class

We drop constants and lower-order terms — 5n2+3n+75 n^2 + 3 n + 7 is O(n2)O(n^2).

Common complexity classes.

ClassNameExample
O(1)O(1)ConstantArray index, hash lookup
O(logn)O(\log n)LogarithmicBinary search
O(n)O(n)LinearSingle pass through array
O(nlogn)O(n \log n)LinearithmicMerge sort, heap sort, FFT
O(n2)O(n^2)QuadraticBubble / selection sort
O(n3)O(n^3)CubicNaive matrix multiply
O(2n)O(2^n)ExponentialBrute-force subset enumeration
O(n!)O(n!)FactorialBrute-force TSP

Concrete impact at n=106n = 10^6 (assume 10910^9 ops/sec):

ClassOperationsTime
O(n)O(n)10610^61 ms
O(nlogn)O(n \log n)2×1072 \times 10^720 ms
O(n2)O(n^2)101210^{12}~17 min
O(2n)O(2^n)astronomicaluniverse-age

Three cases.

  • Best case — luckiest input.
  • Average case — expected over random input.
  • Worst case — guaranteed upper bound (used in safety-critical systems).

Quicksort: best O(nlogn)O(n \log n), average O(nlogn)O(n \log n), worst O(n2)O(n^2) (already-sorted, bad pivot).

Amortised analysis.

Sometimes individual operations are expensive but on average cheap. Dynamic array append (Python list, C++ vector) is O(1)O(1) amortised — though resizing copies the whole array, it happens rarely.

Recurrences (Master Theorem). For divide-and-conquer T(n)=aT(n/b)+f(n)T(n) = a T(n/b) + f(n):

CaseConditionSolution
1f(n)=O(nlogbaε)f(n) = O(n^{\log_b a - \varepsilon})T(n)=Θ(nlogba)T(n) = \Theta(n^{\log_b a})
2f(n)=Θ(nlogba)f(n) = \Theta(n^{\log_b a})T(n)=Θ(nlogbalogn)T(n) = \Theta(n^{\log_b a} \log n)
3f(n)=Ω(nlogba+ε)f(n) = \Omega(n^{\log_b a + \varepsilon}) + regularityT(n)=Θ(f(n))T(n) = \Theta(f(n))

Example (merge sort): T(n)=2T(n/2)+O(n)T(n)=O(nlogn)T(n) = 2 T(n/2) + O(n) \Rightarrow T(n) = O(n \log n) (Case 2).

Why this matters.

  • Predict whether an algorithm will scale to production data.
  • Compare candidate algorithms without coding both.
  • Justify infrastructure decisions — "your O(n2)O(n^2) algorithm will take 15 hours on 10⁶ rows; switch to O(nlogn)O(n \log n)."
  • In ML, the same idea drives matrix-multiplication cost (O(n3)O(n^3) generally) and explains why batching, GPUs and approximate algorithms (e.g., O(n)O(n) random projections) are crucial.

Worked example.

Sorting 10 million numbers:

  • Bubble sort O(n2)O(n^2) = 101410^{14} ops — infeasible.
  • Merge sort O(nlogn)2.3×108O(n \log n) \approx 2.3 \times 10^8 ops — 0.5 s.

The choice of algorithm is the difference between completing and timing out.