Linear Systems and Gaussian Elimination

Introduction

Solving systems of linear equations is one of the most important applications of linear algebra. From circuit analysis to computer graphics to machine learning, linear systems appear everywhere. This reading covers methods for solving Ax = b.

Learning Objectives

By the end of this reading, you will be able to:

  • Represent linear systems in matrix form
  • Apply Gaussian elimination with partial pivoting
  • Understand LU decomposition
  • Analyze numerical stability
  • Recognize special cases (singular, overdetermined, underdetermined systems)

1. Linear Systems

Matrix Form

A system of linear equations:

a₁₁x₁ + a₁₂x₂ + ... + a₁ₙxₙ = b₁
a₂₁x₁ + a₂₂x₂ + ... + a₂ₙxₙ = b₂
...
aₘ₁x₁ + aₘ₂x₂ + ... + aₘₙxₙ = bₘ

can be written as Ax = b, where:

  • A is the m×n coefficient matrix
  • x is the n×1 vector of unknowns
  • b is the m×1 right-hand-side vector

def print_system(A, b):
    """Display system of equations"""
    n = A.cols
    variables = [f"x{i+1}" for i in range(n)]

    for i in range(A.rows):
        terms = []
        for j in range(A.cols):
            coef = A[i,j]
            if coef != 0:
                if coef == 1:
                    terms.append(f"{variables[j]}")
                elif coef == -1:
                    terms.append(f"-{variables[j]}")
                else:
                    terms.append(f"{coef}{variables[j]}")
        equation = " + ".join(terms).replace("+ -", "- ")
        print(f"{equation} = {b[i]}")

# Example
A = Matrix([
    [2, 1, -1],
    [-3, -1, 2],
    [-2, 1, 2]
])
b = [8, -11, -3]
print_system(A, b)
# 2x1 + x2 - x3 = 8
# -3x1 - x2 + 2x3 = -11
# -2x1 + x2 + 2x3 = -3

2. Gaussian Elimination

Basic Algorithm

Transform to row echelon form using elementary row operations:

  1. Swap rows
  2. Multiply row by non-zero scalar
  3. Add a multiple of one row to another

def gaussian_elimination(A, b):
    """
    Solve Ax = b using Gaussian elimination with back substitution

    Returns solution vector x or None if no unique solution
    """
    n = A.rows

    # Create augmented matrix [A|b]
    augmented = [[A[i,j] for j in range(A.cols)] + [b[i]] for i in range(n)]

    # Forward elimination
    for col in range(n):
        # Find pivot
        max_row = col
        for row in range(col + 1, n):
            if abs(augmented[row][col]) > abs(augmented[max_row][col]):
                max_row = row

        # Swap rows
        augmented[col], augmented[max_row] = augmented[max_row], augmented[col]

        # Check for zero pivot
        if abs(augmented[col][col]) < 1e-10:
            return None  # Singular or no unique solution

        # Eliminate below pivot
        for row in range(col + 1, n):
            factor = augmented[row][col] / augmented[col][col]
            for j in range(col, n + 1):
                augmented[row][j] -= factor * augmented[col][j]

    # Back substitution
    x = [0] * n
    for i in range(n - 1, -1, -1):
        x[i] = augmented[i][n]
        for j in range(i + 1, n):
            x[i] -= augmented[i][j] * x[j]
        x[i] /= augmented[i][i]

    return Vector(x)

# Solve the example system
A = Matrix([
    [2, 1, -1],
    [-3, -1, 2],
    [-2, 1, 2]
])
b = [8, -11, -3]
x = gaussian_elimination(A, b)
print(f"Solution: {x}")  # Should be approximately [2, 3, -1]

# Verify
result = A.matvec(x)
print(f"Verification Ax = {result}")  # Should equal b

Partial Pivoting

At each elimination step, swap rows so that the entry with the largest absolute value in the current column becomes the pivot:

def gaussian_elimination_partial_pivot(A, b):
    """
    Gaussian elimination with partial pivoting for numerical stability
    """
    n = A.rows
    augmented = [[A[i,j] for j in range(A.cols)] + [b[i]] for i in range(n)]

    for col in range(n):
        # Partial pivoting: find row with largest absolute value in column
        max_row = col
        max_val = abs(augmented[col][col])

        for row in range(col + 1, n):
            if abs(augmented[row][col]) > max_val:
                max_val = abs(augmented[row][col])
                max_row = row

        if max_val < 1e-10:
            raise ValueError("Matrix is singular")

        # Swap rows
        augmented[col], augmented[max_row] = augmented[max_row], augmented[col]

        # Eliminate
        pivot = augmented[col][col]
        for row in range(col + 1, n):
            factor = augmented[row][col] / pivot
            augmented[row][col] = 0  # Exactly zero
            for j in range(col + 1, n + 1):
                augmented[row][j] -= factor * augmented[col][j]

    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = augmented[i][n]
        for j in range(i + 1, n):
            x[i] -= augmented[i][j] * x[j]
        x[i] /= augmented[i][i]

    return Vector(x)
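The effect of pivoting is easy to demonstrate on a system with a near-zero pivot. Below is a self-contained sketch using plain nested lists rather than the Matrix class above; without a row swap, the tiny pivot 1e-17 destroys the solution in double precision, even though the exact solution is close to [1, 1]:

```python
def solve2(A, b, pivot):
    """Solve a 2x2 system by elimination, optionally swapping rows first."""
    A = [row[:] for row in A]  # work on copies
    b = b[:]
    if pivot and abs(A[1][0]) > abs(A[0][0]):
        A[0], A[1] = A[1], A[0]
        b[0], b[1] = b[1], b[0]
    m = A[1][0] / A[0][0]           # elimination multiplier
    A[1][1] -= m * A[0][1]
    b[1] -= m * b[0]
    x1 = b[1] / A[1][1]             # back substitution
    x0 = (b[0] - A[0][1] * x1) / A[0][0]
    return [x0, x1]

A = [[1e-17, 1.0], [1.0, 1.0]]
b = [1.0, 2.0]
print(solve2(A, b, pivot=False))  # x0 is lost to rounding
print(solve2(A, b, pivot=True))   # close to [1.0, 1.0]
```

Without the swap, the multiplier 1e17 makes the update `1 - 1e17` round away the original entry entirely; pivoting keeps all multipliers at most 1 in magnitude, which is the whole point of the strategy.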

3. LU Decomposition

Concept

Factor A = LU where:

  • L is lower triangular (1s on diagonal)
  • U is upper triangular

Then solve Ax = b as:

  1. Ly = b (forward substitution)
  2. Ux = y (back substitution)

def lu_decomposition(A):
    """
    LU decomposition without pivoting

    Returns L, U such that A = LU
    """
    n = A.rows

    L = Matrix.identity(n)
    U = Matrix([[A[i,j] for j in range(n)] for i in range(n)])

    for col in range(n):
        if abs(U[col, col]) < 1e-10:
            raise ValueError("Zero pivot encountered")

        for row in range(col + 1, n):
            factor = U[row, col] / U[col, col]
            L[row, col] = factor

            for j in range(col, n):
                U[row, j] -= factor * U[col, j]

    return L, U

def solve_lu(L, U, b):
    """Solve Ax = b given A = LU"""
    n = L.rows

    # Forward substitution: Ly = b
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i]
        for j in range(i):
            y[i] -= L[i, j] * y[j]
        # L has 1s on diagonal

    # Back substitution: Ux = y
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = y[i]
        for j in range(i + 1, n):
            x[i] -= U[i, j] * x[j]
        x[i] /= U[i, i]

    return Vector(x)

# Example
A = Matrix([
    [2, 1, 1],
    [4, 3, 3],
    [8, 7, 9]
])
L, U = lu_decomposition(A)
print(f"L =\n{L}")
print(f"U =\n{U}")

# Verify A = LU
LU = L @ U
print(f"LU =\n{LU}")

# Solve system
b = [4, 10, 24]
x = solve_lu(L, U, b)
print(f"Solution: {x}")  # Should be [1, 1, 1]

LU with Partial Pivoting (PLU)

For numerical stability, compute PA = LU where P is a permutation matrix.

def plu_decomposition(A):
    """
    LU decomposition with partial pivoting

    Returns P, L, U such that PA = LU
    """
    n = A.rows

    # Work with copies
    U = Matrix([[A[i,j] for j in range(n)] for i in range(n)])
    L = Matrix.zeros(n, n)
    P = Matrix.identity(n)

    for col in range(n):
        # Find pivot
        max_row = col
        max_val = abs(U[col, col])
        for row in range(col + 1, n):
            if abs(U[row, col]) > max_val:
                max_val = abs(U[row, col])
                max_row = row

        if max_val < 1e-10:
            raise ValueError("Matrix is singular")

        # Swap rows in U, P, and L
        if max_row != col:
            # Swap U rows
            for j in range(n):
                U[col, j], U[max_row, j] = U[max_row, j], U[col, j]
            # Swap P rows
            for j in range(n):
                P[col, j], P[max_row, j] = P[max_row, j], P[col, j]
            # Swap L rows (only previously computed entries)
            for j in range(col):
                L[col, j], L[max_row, j] = L[max_row, j], L[col, j]

        # Eliminate
        L[col, col] = 1
        for row in range(col + 1, n):
            factor = U[row, col] / U[col, col]
            L[row, col] = factor
            for j in range(col, n):
                U[row, j] -= factor * U[col, j]

    return P, L, U

def solve_plu(P, L, U, b):
    """Solve Ax = b given PA = LU"""
    # Apply permutation to b
    n = P.rows
    Pb = [sum(P[i, j] * b[j] for j in range(n)) for i in range(n)]
    return solve_lu(L, U, Pb)
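The row-ordering bookkeeping above is easiest to check on a small matrix whose first column forces a swap. Here is a self-contained sketch on plain nested lists (independent of the Matrix class), recording the permutation as a row-order list instead of a full matrix and verifying PA = LU:

```python
def plu(A):
    """PLU on nested lists; returns (perm, L, U) with perm[i] = source row of row i."""
    n = len(A)
    U = [row[:] for row in A]
    L = [[0.0] * n for _ in range(n)]
    perm = list(range(n))
    for col in range(n):
        # Partial pivoting: pick the largest entry in the current column
        p = max(range(col, n), key=lambda r: abs(U[r][col]))
        U[col], U[p] = U[p], U[col]
        perm[col], perm[p] = perm[p], perm[col]
        L[col], L[p] = L[p], L[col]   # columns >= col are still zero, so safe
        L[col][col] = 1.0
        for row in range(col + 1, n):
            f = U[row][col] / U[col][col]
            L[row][col] = f
            for j in range(col, n):
                U[row][j] -= f * U[col][j]
    return perm, L, U

A = [[0.0, 2.0, 1.0], [1.0, 1.0, 0.0], [2.0, 1.0, 1.0]]
perm, L, U = plu(A)
# Verify PA = LU row by row: row i of PA is the original row perm[i]
for i in range(3):
    lu_row = [sum(L[i][k] * U[k][j] for k in range(3)) for j in range(3)]
    assert all(abs(A[perm[i]][j] - lu_row[j]) < 1e-12 for j in range(3))
print(perm)  # [2, 0, 1]
```

Storing the permutation as a list is what practical implementations do; applying it to b is then a reindexing rather than a matrix-vector product.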

4. Matrix Inverse

Computing Inverse via Gaussian Elimination

def matrix_inverse(A):
    """
    Compute A⁻¹ using Gauss-Jordan elimination

    [A | I] → [I | A⁻¹]
    """
    n = A.rows
    if A.rows != A.cols:
        raise ValueError("Matrix must be square")

    # Augmented matrix [A | I]
    augmented = [[A[i,j] for j in range(n)] + [1 if i == j else 0 for j in range(n)]
                 for i in range(n)]

    # Forward elimination with partial pivoting
    for col in range(n):
        # Find pivot
        max_row = col
        for row in range(col + 1, n):
            if abs(augmented[row][col]) > abs(augmented[max_row][col]):
                max_row = row

        if abs(augmented[max_row][col]) < 1e-10:
            raise ValueError("Matrix is singular")

        augmented[col], augmented[max_row] = augmented[max_row], augmented[col]

        # Scale pivot row
        pivot = augmented[col][col]
        for j in range(2 * n):
            augmented[col][j] /= pivot

        # Eliminate column
        for row in range(n):
            if row != col:
                factor = augmented[row][col]
                for j in range(2 * n):
                    augmented[row][j] -= factor * augmented[col][j]

    # Extract inverse
    inverse_data = [[augmented[i][j] for j in range(n, 2*n)] for i in range(n)]
    return Matrix(inverse_data)

# Example
A = Matrix([
    [1, 2],
    [3, 4]
])
A_inv = matrix_inverse(A)
print(f"A⁻¹ =\n{A_inv}")

# Verify: A × A⁻¹ = I
product = A @ A_inv
print(f"A × A⁻¹ =\n{product}")

5. Determinants

Recursive Definition

def determinant(A):
    """
    Compute determinant using cofactor expansion

    O(n!) complexity - not efficient for large matrices
    """
    n = A.rows
    if n != A.cols:
        raise ValueError("Matrix must be square")

    if n == 1:
        return A[0, 0]

    if n == 2:
        return A[0,0] * A[1,1] - A[0,1] * A[1,0]

    det = 0
    for j in range(n):
        # Create minor matrix
        minor_data = []
        for i in range(1, n):
            row = [A[i, k] for k in range(n) if k != j]
            minor_data.append(row)
        minor = Matrix(minor_data)

        # Cofactor expansion along first row
        sign = (-1) ** j
        det += sign * A[0, j] * determinant(minor)

    return det

Efficient Determinant via LU

def determinant_lu(A):
    """
    Compute determinant using LU decomposition

    O(n³) complexity
    det(A) = det(L) × det(U) = product of the diagonal of U

    Note: lu_decomposition does no pivoting, so a zero pivot is simply
    treated as det = 0. That is not always correct (e.g. [[0, 1], [1, 0]]
    has determinant -1); a robust version would use PLU and flip the
    sign of the result once per row swap.
    """
    try:
        L, U = lu_decomposition(A)
    except ValueError:
        return 0.0  # Zero pivot encountered (see note above)

    det = 1.0
    for i in range(A.rows):
        det *= U[i, i]
    return det

# Compare
A = Matrix([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 10]
])
print(f"Determinant (recursive): {determinant(A)}")
print(f"Determinant (LU): {determinant_lu(A)}")

6. Numerical Considerations

Condition Number

The condition number κ(A) = ||A|| × ||A⁻¹|| measures how sensitive the solution of Ax = b is to perturbations in A or b: relative errors in the input can be amplified by a factor of roughly κ(A) in the solution.

def condition_number_estimate(A):
    """
    Estimate condition number using power iteration

    High condition number = ill-conditioned (sensitive to errors)
    """
    # This is a rough estimate: power iteration approximates the dominant
    # eigenvalue of A, which equals ||A|| only for symmetric matrices.
    # The exact condition number requires the singular values of A.

    n = A.rows

    # Estimate ||A|| using power iteration
    x = Vector([1.0] * n)
    for _ in range(20):
        y = A.matvec(x)
        norm_y = y.magnitude()
        if norm_y < 1e-10:
            break
        x = y.normalize()
    norm_A = A.matvec(x).magnitude()

    # Estimate ||A⁻¹|| similarly
    try:
        A_inv = matrix_inverse(A)
        x = Vector([1.0] * n)
        for _ in range(20):
            y = A_inv.matvec(x)
            norm_y = y.magnitude()
            if norm_y < 1e-10:
                break
            x = y.normalize()
        norm_A_inv = A_inv.matvec(x).magnitude()
        return norm_A * norm_A_inv
    except ValueError:
        return float('inf')  # Singular matrix
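For a 2×2 matrix the condition number can be computed exactly from the closed-form inverse. A minimal self-contained sketch using the ∞-norm (maximum absolute row sum) on plain lists:

```python
def cond_inf_2x2(A):
    """Exact infinity-norm condition number of a 2x2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        return float('inf')  # singular: infinitely sensitive
    # Closed-form inverse of a 2x2 matrix
    A_inv = [[d / det, -b / det], [-c / det, a / det]]
    # Infinity norm: largest absolute row sum
    norm = lambda M: max(abs(M[0][0]) + abs(M[0][1]),
                         abs(M[1][0]) + abs(M[1][1]))
    return norm(A) * norm(A_inv)

print(cond_inf_2x2([[1.0, 0.0], [0.0, 1.0]]))    # identity: 1.0
print(cond_inf_2x2([[1.0, 1.0], [1.0, 1.0001]]))  # nearly singular: very large
```

The identity matrix is perfectly conditioned (κ = 1); the nearly singular matrix has κ on the order of 10⁴, meaning four digits of accuracy can be lost when solving with it.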

Example of Ill-Conditioning

# Hilbert matrix - notoriously ill-conditioned
def hilbert_matrix(n):
    """Create n×n Hilbert matrix"""
    data = [[1.0 / (i + j + 1) for j in range(n)] for i in range(n)]
    return Matrix(data)

# Show increasing condition number
for n in [2, 3, 4, 5, 6]:
    H = hilbert_matrix(n)
    cond = condition_number_estimate(H)
    print(f"Hilbert {n}×{n}: condition ≈ {cond:.2e}")

7. Special Systems

Tridiagonal Systems

def solve_tridiagonal(a, b, c, d):
    """
    Solve tridiagonal system using Thomas algorithm

    Matrix has form:
    [b₀ c₀  0  0  ...]
    [a₁ b₁ c₁  0  ...]
    [ 0 a₂ b₂ c₂  ...]
    ...

    O(n) time complexity
    """
    n = len(b)

    # Forward sweep
    c_prime = [0.0] * n
    d_prime = [0.0] * n

    c_prime[0] = c[0] / b[0]
    d_prime[0] = d[0] / b[0]

    for i in range(1, n):
        denom = b[i] - a[i] * c_prime[i-1]
        c_prime[i] = c[i] / denom if i < n-1 else 0
        d_prime[i] = (d[i] - a[i] * d_prime[i-1]) / denom

    # Back substitution
    x = [0.0] * n
    x[n-1] = d_prime[n-1]
    for i in range(n-2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i+1]

    return Vector(x)

# Example: Solve tridiagonal system
# [2 -1  0  0] [x1]   [1]
# [-1 2 -1  0] [x2] = [0]
# [0 -1  2 -1] [x3]   [0]
# [0  0 -1  2] [x4]   [1]

a = [0, -1, -1, -1]  # Lower diagonal (a[0] unused)
b = [2, 2, 2, 2]      # Main diagonal
c = [-1, -1, -1, 0]   # Upper diagonal (c[n-1] unused)
d = [1, 0, 0, 1]      # Right-hand side

x = solve_tridiagonal(a, b, c, d)
print(f"Solution: {x}")  # Should be approximately [1, 1, 1, 1]

Symmetric Positive Definite: Cholesky Decomposition

import math

def cholesky(A):
    """
    Cholesky decomposition A = LL^T

    Only works for symmetric positive definite matrices
    Half the work of LU
    """
    n = A.rows
    L = Matrix.zeros(n, n)

    for i in range(n):
        for j in range(i + 1):
            sum_val = sum(L[i, k] * L[j, k] for k in range(j))

            if i == j:
                val = A[i, i] - sum_val
                if val <= 0:
                    raise ValueError("Matrix is not positive definite")
                L[i, j] = math.sqrt(val)
            else:
                L[i, j] = (A[i, j] - sum_val) / L[j, j]

    return L

# Example with symmetric positive definite matrix
A = Matrix([
    [4, 2, 2],
    [2, 5, 3],
    [2, 3, 6]
])
L = cholesky(A)
print(f"L =\n{L}")

# Verify: A = LL^T
LLT = L @ L.transpose()
print(f"LL^T =\n{LLT}")
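Once the factor L is available, Ax = b reduces to two triangular solves, just as with LU: Ly = b forward, then Lᵀx = y backward. Here is a self-contained sketch on plain nested lists (re-deriving the factor so the example runs on its own; same matrix as above, with b chosen so the exact solution is [1, 1, 1]):

```python
import math

def cholesky_solve(A, b):
    """Solve Ax = b for symmetric positive definite A via A = L L^T."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):                        # Cholesky factorization
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    y = [0.0] * n                             # forward solve: L y = b
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n                             # backward solve: L^T x = y
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

A = [[4.0, 2.0, 2.0], [2.0, 5.0, 3.0], [2.0, 3.0, 6.0]]
print(cholesky_solve(A, [8.0, 10.0, 11.0]))  # [1.0, 1.0, 1.0]
```

Note that the backward solve indexes L[k][i] rather than transposing explicitly; since the same factor serves both triangular solves, repeated right-hand sides reuse the factorization at O(n²) cost each.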

Exercises

Basic

  1. Solve the system using Gaussian elimination:

    x + y + z = 6
    2x + y - z = 1
    x - y + 2z = 5
    
  2. Compute the LU decomposition of a 3×3 matrix by hand, then verify with code.

  3. Write a function to compute the determinant of a 3×3 matrix using Sarrus' rule.

Intermediate

  1. Implement Gauss-Jordan elimination to compute the reduced row echelon form (RREF).

  2. Implement a function to compute the rank of a matrix.

  3. Compare the numerical accuracy of solving a system with and without partial pivoting for an ill-conditioned matrix.

Advanced

  1. Implement iterative refinement to improve the accuracy of a solution.

  2. Implement the Gauss-Seidel iterative method for solving sparse systems.

  3. Compare the performance of LU vs Cholesky for solving multiple systems with the same coefficient matrix.


Summary

  • Gaussian elimination transforms systems to row echelon form
  • Partial pivoting improves numerical stability
  • LU decomposition factors A = LU for efficient solving of multiple systems
  • Cholesky decomposition is faster for symmetric positive definite matrices
  • Condition number indicates sensitivity to numerical errors
  • Special structures (tridiagonal, sparse) enable faster algorithms

Next Reading

Eigenvalues and Eigenvectors →