Algorithm Analysis and Big O

Introduction

Algorithm analysis lets us compare algorithms objectively, independent of hardware or implementation details. Big O notation provides a language for discussing efficiency that every programmer should understand.

Learning Objectives

By the end of this reading, you will be able to:

  • Analyze time and space complexity
  • Use Big O, Big Omega, and Big Theta notation
  • Identify common complexity classes
  • Analyze code to determine its complexity
  • Make informed decisions about algorithm tradeoffs

1. Why Analyze Algorithms?

The Problem with Timing

import time

start = time.time()
result = my_algorithm(data)
end = time.time()
print(f"Took {end - start} seconds")

Problems:

  • Results vary by machine, language, load
  • Can't compare algorithms on different inputs
  • Doesn't predict behavior on larger inputs

What We Really Want

  • Machine-independent comparison
  • Growth rate as input size increases
  • Worst-case guarantees (usually)

2. Counting Operations

Instead of time, count fundamental operations.

def sum_array(arr):
    total = 0                  # 1 assignment
    for x in arr:              # n iterations
        total += x             # n additions + n assignments
    return total               # 1 return

# Total: 1 + 2n + 1 = 2n + 2 operations

But we don't care about exact counts. We care about growth rate.


3. Big O Notation

Big O describes an upper bound on growth rate.

Definition: f(n) = O(g(n)) means there exist constants c > 0 and n₀ such that: f(n) ≤ c · g(n) for all n ≥ n₀

In plain English: f grows no faster than g (asymptotically).

Examples

2n + 2 = O(n)     (ignore constants and lower-order terms)
3n² + 5n + 10 = O(n²)
n + log n = O(n)
5 = O(1)
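The definition can be checked numerically. As a sketch, here we verify that 2n + 2 = O(n) using c = 3 and n₀ = 2 (one valid choice of witnesses; any larger c with a matching n₀ works too):

```python
# Verify f(n) <= c * g(n) for all n >= n0, with f(n) = 2n + 2 and g(n) = n.
# Algebraically: 2n + 2 <= 3n  <=>  n >= 2, so c = 3, n0 = 2 works.
def f(n):
    return 2 * n + 2

c, n0 = 3, 2
assert all(f(n) <= c * n for n in range(n0, 10_000))
print("2n + 2 = O(n), witnessed by c = 3, n0 = 2")
```

Big O only requires that *some* pair (c, n₀) exists; the check above exhibits one.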

Simplification Rules

  1. Drop constants: O(2n) → O(n)
  2. Drop lower-order terms: O(n² + n) → O(n²)
  3. Focus on dominant term: O(n + log n) → O(n)

4. Common Complexity Classes

Listed from fastest to slowest growth:

Notation     Name           Example
O(1)         Constant       Array access
O(log n)     Logarithmic    Binary search
O(n)         Linear         Linear search
O(n log n)   Linearithmic   Merge sort
O(n²)        Quadratic      Bubble sort
O(n³)        Cubic          Naive matrix multiply
O(2ⁿ)        Exponential    Subset generation
O(n!)        Factorial      Permutation generation

Growth Comparison

For n = 1,000,000:

Complexity    Operations
O(1)          1
O(log n)      ~20
O(n)          1,000,000
O(n log n)    ~20,000,000
O(n²)         1,000,000,000,000
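The figures above can be reproduced directly (a quick sketch; the logarithmic values are rounded):

```python
import math

n = 1_000_000
print("O(1):       ", 1)
print("O(log n):   ", round(math.log2(n)))       # ~20
print("O(n):       ", n)
print("O(n log n): ", round(n * math.log2(n)))   # ~20 million
print("O(n^2):     ", n ** 2)                    # 10^12
```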

5. Analyzing Code

Constant Time: O(1)

def get_first(arr):
    return arr[0]  # O(1)

def swap(arr, i, j):
    arr[i], arr[j] = arr[j], arr[i]  # O(1)

Linear Time: O(n)

def find_max(arr):
    max_val = arr[0]
    for x in arr:          # O(n)
        if x > max_val:
            max_val = x
    return max_val

Quadratic Time: O(n²)

def has_duplicates(arr):
    n = len(arr)
    for i in range(n):           # O(n)
        for j in range(i + 1, n):  # O(n)
            if arr[i] == arr[j]:
                return True
    return False
# Total: O(n²)

Logarithmic Time: O(log n)

def binary_search(arr, target):
    left, right = 0, len(arr) - 1

    while left <= right:         # O(log n) iterations
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1

    return -1

Why log n? Each iteration halves the search space: n → n/2 → n/4 → ... → 1

Number of halvings: log₂(n)
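This halving count is easy to confirm empirically. A minimal sketch, comparing the count against ⌊log₂ n⌋ + 1:

```python
import math

def halvings(n):
    """How many times can n be halved (integer division) before hitting 0?"""
    count = 0
    while n > 0:
        n //= 2
        count += 1
    return count

for n in [16, 1024, 1_000_000]:
    assert halvings(n) == math.floor(math.log2(n)) + 1
print(halvings(1_000_000))  # 20
```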

Linearithmic Time: O(n log n)

def merge_sort(arr):
    if len(arr) <= 1:
        return arr

    mid = len(arr) // 2
    left = merge_sort(arr[:mid])    # T(n/2)
    right = merge_sort(arr[mid:])   # T(n/2)
    return merge(left, right)       # O(n)

# T(n) = 2T(n/2) + O(n) = O(n log n)

Exponential Time: O(2ⁿ)

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)  # Two recursive calls

# Each call branches into two: O(2ⁿ) (an upper bound; the tight count grows as ~1.618ⁿ)
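The blow-up is easy to observe by counting calls. This sketch wraps the same recursion with a call counter (`fib_calls` is a name introduced here for illustration):

```python
def fib_calls(n):
    """Return (fibonacci(n), total number of calls made)."""
    if n <= 1:
        return n, 1
    a, calls_a = fib_calls(n - 1)
    b, calls_b = fib_calls(n - 2)
    return a + b, calls_a + calls_b + 1

for n in [10, 20, 25]:
    value, calls = fib_calls(n)
    print(n, value, calls)
# The call count grows by a factor of ~1.6 per step in n: exponential growth.
```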

6. Analyzing Loops

Sequential Loops: Add Complexities

for i in range(n):      # O(n)
    process(i)

for i in range(n):      # O(n)
    process(i)

# Total: O(n) + O(n) = O(n)

Nested Loops: Multiply Complexities

for i in range(n):          # O(n)
    for j in range(n):      # O(n)
        process(i, j)

# Total: O(n) × O(n) = O(n²)

Dependent Nested Loops

for i in range(n):
    for j in range(i):      # j goes from 0 to i-1
        process(i, j)

# Iterations: 0 + 1 + 2 + ... + (n-1) = n(n-1)/2 = O(n²)
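Counting the iterations confirms the triangular-number sum (a minimal sketch):

```python
def count_iterations(n):
    """Count how many times the dependent inner loop body runs."""
    count = 0
    for i in range(n):
        for j in range(i):
            count += 1
    return count

for n in [10, 100, 1000]:
    assert count_iterations(n) == n * (n - 1) // 2  # 0 + 1 + ... + (n-1)
print(count_iterations(1000))  # 499500
```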

Loops with Different Bounds

for i in range(n):          # O(n)
    for j in range(m):      # O(m)
        process(i, j)

# Total: O(n × m)

Logarithmic Loops

i = n
while i > 0:
    process(i)
    i = i // 2  # Halving

# Iterations: log₂(n) = O(log n)

7. Space Complexity

Memory usage as a function of input size.

Examples

# O(1) space - constant extra memory
def sum_array(arr):
    total = 0
    for x in arr:
        total += x
    return total

# O(n) space - linear extra memory
def double_array(arr):
    result = []
    for x in arr:
        result.append(x * 2)
    return result

# O(n) space - recursion uses stack
def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)  # n stack frames

In-Place Algorithms

Use O(1) extra space (modifying input is allowed):

def reverse_in_place(arr):
    left, right = 0, len(arr) - 1
    while left < right:
        arr[left], arr[right] = arr[right], arr[left]
        left += 1
        right -= 1
    # O(1) extra space

8. Other Asymptotic Notations

Big Omega (Ω) - Lower Bound

f(n) = Ω(g(n)) means f grows at least as fast as g.

Example: Any comparison-based sort requires Ω(n log n) comparisons in the worst case

Big Theta (Θ) - Tight Bound

f(n) = Θ(g(n)) means f grows at the same rate as g.

Example: Merge sort is Θ(n log n) - both upper and lower bound

Relationships

  • f(n) = Θ(g(n)) ⟺ f(n) = O(g(n)) AND f(n) = Ω(g(n))
  • Big O is most commonly used (worst-case focus)

9. Best, Average, Worst Case

Quick Sort Example

def quicksort(arr):
    # Best case: Pivot always splits evenly
    #   T(n) = 2T(n/2) + O(n) = O(n log n)

    # Worst case: Pivot always smallest/largest
    #   T(n) = T(n-1) + O(n) = O(n²)

    # Average case: Random pivot
    #   O(n log n)
    pass

Linear Search Example

Best case:  O(1)   - Target is first element
Worst case: O(n)   - Target is last or not present
Average case: O(n) - About n/2 comparisons if the target is equally likely anywhere

We typically report worst case unless specified otherwise.
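Instrumenting linear search makes the three cases visible (a sketch; this version returns the comparison count alongside the index):

```python
def linear_search(arr, target):
    """Return (index, comparisons); index is -1 if not found."""
    comparisons = 0
    for i, x in enumerate(arr):
        comparisons += 1
        if x == target:
            return i, comparisons
    return -1, comparisons

arr = list(range(100))
print(linear_search(arr, 0))    # best case:   (0, 1)
print(linear_search(arr, 99))   # worst case:  (99, 100)
print(linear_search(arr, -5))   # not present: (-1, 100)
```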


10. Amortized Analysis

Average time per operation over a sequence of operations.

Dynamic Array Example

# Individual append: O(n) worst case (resize)
# But most appends are O(1)

# Amortized analysis:
# n appends cost: n × O(1) + sum of resize costs
# Resize costs: 1 + 2 + 4 + ... + n/2 ≈ n
# Total: O(n) + O(n) = O(n)
# Amortized per append: O(n) / n = O(1)
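A minimal simulation of the doubling scheme, counting element writes and copies explicitly (CPython's `list` actually grows by a smaller factor, but the amortized argument is the same):

```python
def append_costs(n):
    """Total cost (writes + copies) of n appends into a doubling array."""
    capacity, size, cost = 1, 0, 0
    for _ in range(n):
        if size == capacity:   # full: double capacity, copy every element over
            cost += size
            capacity *= 2
        cost += 1              # write the new element
        size += 1
    return cost

for n in [10, 1_000, 100_000]:
    print(n, append_costs(n), append_costs(n) / n)  # cost-per-append stays under 3
```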

11. Common Patterns

Pattern: Look at Growth

# O(1): Fixed number of operations
x = arr[5]

# O(log n): Halving/doubling each step
while n > 0:
    n //= 2

# O(n): Single pass through data
for x in arr:
    process(x)

# O(n log n): Divide and conquer with linear merge
merge_sort(arr)

# O(n²): Nested iteration over data
for i in range(n):
    for j in range(n):
        process(i, j)

# O(2ⁿ): Generate all subsets
for subset in all_subsets(arr):
    process(subset)

Pattern: Recursion

# T(n) = T(n-1) + O(1) → O(n)
def countdown(n):
    if n == 0: return
    countdown(n - 1)

# T(n) = T(n-1) + O(n) → O(n²)
def remove_each(arr):
    if len(arr) == 0: return
    process_all(arr)  # O(n)
    remove_each(arr[1:])

# T(n) = 2T(n/2) + O(n) → O(n log n)
def merge_sort(arr):
    if len(arr) <= 1: return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    return merge(left, right)

# T(n) = 2T(n/2) + O(1) → O(n)
def count_nodes(tree):
    if tree is None: return 0
    return 1 + count_nodes(tree.left) + count_nodes(tree.right)

12. Practical Considerations

Real-World Factors

Big O ignores:

  • Constants: 100n vs 2n both O(n), but 100n is slower
  • Cache effects: Array access is faster than linked list (locality)
  • Small inputs: O(n²) might beat O(n log n) for small n
  • Hidden costs: Memory allocation, garbage collection

Rules of Thumb

Complexity    Max n for ~1 second
O(n!)         ~10
O(2ⁿ)         ~20-25
O(n³)         ~500
O(n²)         ~10,000
O(n log n)    ~10,000,000
O(n)          ~100,000,000
O(log n)      Any

Exercises

Basic

  1. What is the time complexity?
for i in range(n):
    for j in range(10):
        print(i, j)
  2. What is the time complexity?
i = n
while i > 0:
    print(i)
    i = i // 3
  3. What is the space complexity?
def mystery(n):
    if n <= 0:
        return 0
    return mystery(n - 1) + n

Intermediate

  1. Analyze time and space:
def process(arr):
    n = len(arr)
    result = []
    for i in range(n):
        for j in range(i, n):
            result.append(arr[i] + arr[j])
    return result
  2. Prove: 3n² + 5n + 10 = O(n²)

  3. What's the complexity of this recursive function?

def f(n):
    if n <= 1:
        return 1
    return f(n // 2) + f(n // 2)

Advanced

  1. Analyze the amortized complexity of n operations on a stack that doubles its capacity when full.

  2. Given a function with complexity O(n² log n), how much longer will it take if we double the input size?

  3. Design an algorithm to find if an array has two elements that sum to zero. Analyze solutions with different time/space tradeoffs.


Summary

  • Big O describes growth rate, ignoring constants and lower terms
  • Common classes: O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(2ⁿ)
  • Nested loops multiply; sequential loops add
  • Space complexity counts extra memory used
  • Amortized analysis averages over sequences of operations
  • Real-world performance depends on constants, cache, and actual input

Next Reading

Searching Algorithms →