Linear Systems and Gaussian Elimination
Introduction
Solving systems of linear equations is one of the most important applications of linear algebra. From circuit analysis to computer graphics to machine learning, linear systems appear everywhere. This reading covers methods for solving Ax = b.
Learning Objectives
By the end of this reading, you will be able to:
- Represent linear systems in matrix form
- Apply Gaussian elimination with partial pivoting
- Understand LU decomposition
- Analyze numerical stability
- Recognize special cases (singular, overdetermined, underdetermined systems)
1. Linear Systems
Matrix Form
A system of linear equations:
a₁₁x₁ + a₁₂x₂ + ... + a₁ₙxₙ = b₁
a₂₁x₁ + a₂₂x₂ + ... + a₂ₙxₙ = b₂
...
aₘ₁x₁ + aₘ₂x₂ + ... + aₘₙxₙ = bₘ
can be written as Ax = b, where:
- A is the m×n coefficient matrix
- x is the n×1 vector of unknowns
- b is the m×1 vector of constants
def print_system(A, b):
    """Display a system of equations."""
    n = A.cols
    variables = [f"x{i+1}" for i in range(n)]
    for i in range(A.rows):
        terms = []
        for j in range(A.cols):
            coef = A[i, j]
            if coef != 0:
                if coef == 1:
                    terms.append(f"{variables[j]}")
                elif coef == -1:
                    terms.append(f"-{variables[j]}")
                else:
                    terms.append(f"{coef}{variables[j]}")
        equation = " + ".join(terms).replace("+ -", "- ")
        print(f"{equation} = {b[i]}")
# Example
A = Matrix([
    [2, 1, -1],
    [-3, -1, 2],
    [-2, 1, 2]
])
b = [8, -11, -3]
print_system(A, b)
# 2x1 + x2 - x3 = 8
# -3x1 - x2 + 2x3 = -11
# -2x1 + x2 + 2x3 = -3
2. Gaussian Elimination
Basic Algorithm
Transform to row echelon form using elementary row operations:
- Swap rows
- Multiply row by non-zero scalar
- Add multiple of one row to another
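To make these operations concrete, here is a minimal sketch on an augmented matrix stored as plain nested lists (standalone Python rather than the Matrix class used elsewhere in this reading; the system is chosen just for illustration):

```python
# Augmented matrix [A | b] for:  2x + y = 5 ,  4x - 6y = -2
M = [[2.0, 1.0, 5.0],
     [4.0, -6.0, -2.0]]

# 1. Swap rows (shown, then undone, so the example stays easy to follow)
M[0], M[1] = M[1], M[0]
M[0], M[1] = M[1], M[0]   # swap back

# 2. Multiply a row by a non-zero scalar: R0 <- (1/2) * R0
M[0] = [v * 0.5 for v in M[0]]                       # -> [1.0, 0.5, 2.5]

# 3. Add a multiple of one row to another: R1 <- R1 - 4 * R0
M[1] = [M[1][j] - 4.0 * M[0][j] for j in range(3)]   # -> [0.0, -8.0, -12.0]

print(M)   # row echelon form of the augmented matrix
```

None of the three operations changes the solution set, which is why elimination is allowed to apply them freely.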
def gaussian_elimination(A, b):
    """
    Solve Ax = b using Gaussian elimination with back substitution.
    Returns the solution vector x, or None if there is no unique solution.
    """
    n = A.rows
    # Create augmented matrix [A|b]
    augmented = [[A[i, j] for j in range(A.cols)] + [b[i]] for i in range(n)]
    # Forward elimination
    for col in range(n):
        # Find pivot
        max_row = col
        for row in range(col + 1, n):
            if abs(augmented[row][col]) > abs(augmented[max_row][col]):
                max_row = row
        # Swap rows
        augmented[col], augmented[max_row] = augmented[max_row], augmented[col]
        # Check for zero pivot
        if abs(augmented[col][col]) < 1e-10:
            return None  # Singular or no unique solution
        # Eliminate below pivot
        for row in range(col + 1, n):
            factor = augmented[row][col] / augmented[col][col]
            for j in range(col, n + 1):
                augmented[row][j] -= factor * augmented[col][j]
    # Back substitution
    x = [0] * n
    for i in range(n - 1, -1, -1):
        x[i] = augmented[i][n]
        for j in range(i + 1, n):
            x[i] -= augmented[i][j] * x[j]
        x[i] /= augmented[i][i]
    return Vector(x)
# Solve the example system
A = Matrix([
    [2, 1, -1],
    [-3, -1, 2],
    [-2, 1, 2]
])
b = [8, -11, -3]
x = gaussian_elimination(A, b)
print(f"Solution: {x}")  # Should be approximately [2, 3, -1]
# Verify
result = A.matvec(x)
print(f"Verification Ax = {result}")  # Should equal b
Partial Pivoting
At each elimination step, swap rows so that the entry with the largest absolute value in the current column becomes the pivot. This keeps every multiplier at most 1 in magnitude and limits the growth of rounding errors:
def gaussian_elimination_partial_pivot(A, b):
    """
    Gaussian elimination with partial pivoting for numerical stability.
    """
    n = A.rows
    augmented = [[A[i, j] for j in range(A.cols)] + [b[i]] for i in range(n)]
    for col in range(n):
        # Partial pivoting: find the row with the largest absolute value in this column
        max_row = col
        max_val = abs(augmented[col][col])
        for row in range(col + 1, n):
            if abs(augmented[row][col]) > max_val:
                max_val = abs(augmented[row][col])
                max_row = row
        if max_val < 1e-10:
            raise ValueError("Matrix is singular")
        # Swap rows
        augmented[col], augmented[max_row] = augmented[max_row], augmented[col]
        # Eliminate
        pivot = augmented[col][col]
        for row in range(col + 1, n):
            factor = augmented[row][col] / pivot
            augmented[row][col] = 0  # Exactly zero
            for j in range(col + 1, n + 1):
                augmented[row][j] -= factor * augmented[col][j]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = augmented[i][n]
        for j in range(i + 1, n):
            x[i] -= augmented[i][j] * x[j]
        x[i] /= augmented[i][i]
    return Vector(x)
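Why pivoting matters can be seen in a classic 2×2 example: with a tiny pivot, the elimination multiplier becomes enormous and wipes out the other entries of the second row. A standalone sketch in plain floats (`solve2` is a throwaway helper for this illustration, not part of the code above):

```python
# System (true solution is very close to [1, 1]):
#   1e-20*x1 + x2 = 1
#        x1 + x2 = 2
eps = 1e-20

def solve2(a11, a12, a21, a22, b1, b2):
    # Eliminate using row 1 as the pivot row (no row swap here):
    # R2 <- R2 - (a21/a11)*R1, then back-substitute.
    f = a21 / a11
    a22p = a22 - f * a12
    b2p = b2 - f * b1
    x2 = b2p / a22p
    x1 = (b1 - a12 * x2) / a11
    return x1, x2

# Without pivoting: the multiplier f = 1e20 swamps row 2 entirely
x_bad = solve2(eps, 1.0, 1.0, 1.0, 1.0, 2.0)
# "With pivoting": pass the rows swapped so the large entry is the pivot
x_good = solve2(1.0, 1.0, eps, 1.0, 2.0, 1.0)

print(x_bad)   # x1 is completely wrong (0.0 in double precision)
print(x_good)  # (1.0, 1.0)
```

The unpivoted solve loses x1 entirely, while a single row swap recovers the solution to full precision.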
3. LU Decomposition
Concept
Factor A = LU where:
- L is lower triangular (1s on diagonal)
- U is upper triangular
Then solve Ax = b as:
- Ly = b (forward substitution)
- Ux = y (back substitution)
def lu_decomposition(A):
    """
    LU decomposition without pivoting.
    Returns L, U such that A = LU.
    """
    n = A.rows
    L = Matrix.identity(n)
    U = Matrix([[A[i, j] for j in range(n)] for i in range(n)])
    for col in range(n):
        if abs(U[col, col]) < 1e-10:
            raise ValueError("Zero pivot encountered")
        for row in range(col + 1, n):
            factor = U[row, col] / U[col, col]
            L[row, col] = factor
            for j in range(col, n):
                U[row, j] -= factor * U[col, j]
    return L, U
def solve_lu(L, U, b):
    """Solve Ax = b given A = LU."""
    n = L.rows
    # Forward substitution: Ly = b (L has 1s on the diagonal, so no division)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i]
        for j in range(i):
            y[i] -= L[i, j] * y[j]
    # Back substitution: Ux = y
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = y[i]
        for j in range(i + 1, n):
            x[i] -= U[i, j] * x[j]
        x[i] /= U[i, i]
    return Vector(x)
# Example
A = Matrix([
    [2, 1, 1],
    [4, 3, 3],
    [8, 7, 9]
])
L, U = lu_decomposition(A)
print(f"L =\n{L}")
print(f"U =\n{U}")
# Verify A = LU
LU = L @ U
print(f"LU =\n{LU}")
# Solve system
b = [4, 10, 24]
x = solve_lu(L, U, b)
print(f"Solution: {x}")  # Should be [1, 1, 1]
LU with Partial Pivoting (PLU)
For numerical stability, compute PA = LU where P is a permutation matrix.
def plu_decomposition(A):
    """
    LU decomposition with partial pivoting.
    Returns P, L, U such that PA = LU.
    """
    n = A.rows
    # Work with copies
    U = Matrix([[A[i, j] for j in range(n)] for i in range(n)])
    L = Matrix.zeros(n, n)
    P = Matrix.identity(n)
    for col in range(n):
        # Find pivot
        max_row = col
        max_val = abs(U[col, col])
        for row in range(col + 1, n):
            if abs(U[row, col]) > max_val:
                max_val = abs(U[row, col])
                max_row = row
        if max_val < 1e-10:
            raise ValueError("Matrix is singular")
        # Swap rows in U, P, and L
        if max_row != col:
            # Swap U rows
            for j in range(n):
                U[col, j], U[max_row, j] = U[max_row, j], U[col, j]
            # Swap P rows
            for j in range(n):
                P[col, j], P[max_row, j] = P[max_row, j], P[col, j]
            # Swap L rows (only previously computed entries)
            for j in range(col):
                L[col, j], L[max_row, j] = L[max_row, j], L[col, j]
        # Eliminate
        L[col, col] = 1
        for row in range(col + 1, n):
            factor = U[row, col] / U[col, col]
            L[row, col] = factor
            for j in range(col, n):
                U[row, j] -= factor * U[col, j]
    return P, L, U
def solve_plu(P, L, U, b):
    """Solve Ax = b given PA = LU."""
    # Apply the permutation to b, then reuse the LU solver
    n = P.rows
    Pb = [sum(P[i, j] * b[j] for j in range(n)) for i in range(n)]
    return solve_lu(L, U, Pb)
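As a standalone illustration of the same idea, here is a compact sketch on plain nested lists that records the row order as a permutation vector `perm` rather than a full P matrix (the function name and representation are choices for this example, not part of the code above). The input has a zero in position (0,0), which unpivoted LU cannot handle:

```python
def plu(A):
    """PLU on nested lists: returns perm, L, U with A[perm] == L @ U."""
    n = len(A)
    U = [row[:] for row in A]
    L = [[0.0] * n for _ in range(n)]
    perm = list(range(n))   # permutation vector instead of a P matrix
    for col in range(n):
        pivot_row = max(range(col, n), key=lambda r: abs(U[r][col]))
        # Swap the rows of U, the finished part of L, and the permutation
        U[col], U[pivot_row] = U[pivot_row], U[col]
        L[col][:col], L[pivot_row][:col] = L[pivot_row][:col], L[col][:col]
        perm[col], perm[pivot_row] = perm[pivot_row], perm[col]
        L[col][col] = 1.0
        for r in range(col + 1, n):
            f = U[r][col] / U[col][col]
            L[r][col] = f
            for j in range(col, n):
                U[r][j] -= f * U[col][j]
    return perm, L, U

A = [[0.0, 2.0], [1.0, 3.0]]   # zero pivot at (0,0): plain LU would fail
perm, L, U = plu(A)
print(perm)  # [1, 0] -> the rows were swapped
print(L)     # [[1.0, 0.0], [0.0, 1.0]]
print(U)     # [[1.0, 3.0], [0.0, 2.0]]
```

Storing a permutation vector instead of an explicit P matrix is what production libraries do: applying it to b is an O(n) reindexing rather than an O(n²) matrix-vector product.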
4. Matrix Inverse
Computing Inverse via Gaussian Elimination
def matrix_inverse(A):
    """
    Compute A⁻¹ using Gauss-Jordan elimination:
    [A | I] → [I | A⁻¹]
    """
    n = A.rows
    if A.rows != A.cols:
        raise ValueError("Matrix must be square")
    # Augmented matrix [A | I]
    augmented = [[A[i, j] for j in range(n)] + [1 if i == j else 0 for j in range(n)]
                 for i in range(n)]
    # Forward elimination with partial pivoting
    for col in range(n):
        # Find pivot
        max_row = col
        for row in range(col + 1, n):
            if abs(augmented[row][col]) > abs(augmented[max_row][col]):
                max_row = row
        if abs(augmented[max_row][col]) < 1e-10:
            raise ValueError("Matrix is singular")
        augmented[col], augmented[max_row] = augmented[max_row], augmented[col]
        # Scale pivot row
        pivot = augmented[col][col]
        for j in range(2 * n):
            augmented[col][j] /= pivot
        # Eliminate column
        for row in range(n):
            if row != col:
                factor = augmented[row][col]
                for j in range(2 * n):
                    augmented[row][j] -= factor * augmented[col][j]
    # Extract the inverse from the right half
    inverse_data = [[augmented[i][j] for j in range(n, 2 * n)] for i in range(n)]
    return Matrix(inverse_data)
# Example
A = Matrix([
    [1, 2],
    [3, 4]
])
A_inv = matrix_inverse(A)
print(f"A⁻¹ =\n{A_inv}")
# Verify: A × A⁻¹ = I
product = A @ A_inv
print(f"A × A⁻¹ =\n{product}")
5. Determinants
Recursive Definition
def determinant(A):
    """
    Compute the determinant using cofactor expansion.
    O(n!) complexity - not efficient for large matrices.
    """
    n = A.rows
    if n != A.cols:
        raise ValueError("Matrix must be square")
    if n == 1:
        return A[0, 0]
    if n == 2:
        return A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
    det = 0
    # Cofactor expansion along the first row
    for j in range(n):
        # Build the minor by deleting row 0 and column j
        minor_data = []
        for i in range(1, n):
            row = [A[i, k] for k in range(n) if k != j]
            minor_data.append(row)
        minor = Matrix(minor_data)
        sign = (-1) ** j
        det += sign * A[0, j] * determinant(minor)
    return det
Efficient Determinant via LU
def determinant_lu(A):
    """
    Compute the determinant using LU decomposition: O(n³).
    det(A) = det(L) × det(U) = product of the diagonal of U.
    """
    try:
        L, U = lu_decomposition(A)
        det = 1.0
        for i in range(A.rows):
            det *= U[i, i]
        return det
    except ValueError:
        # Caution: a zero pivot in unpivoted LU may only mean that row
        # exchanges are needed, not that A is singular. A robust version
        # would use plu_decomposition and multiply by the sign of P.
        return 0.0
# Compare
A = Matrix([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 10]
])
print(f"Determinant (recursive): {determinant(A)}")
print(f"Determinant (LU): {determinant_lu(A)}")
# Both should be -3 (the LU result up to rounding error)
6. Numerical Considerations
Condition Number
The condition number κ(A) = ‖A‖ · ‖A⁻¹‖ measures how sensitive the solution x is to perturbations in A and b: relative errors in the data can be amplified by a factor of roughly κ(A) in the solution.
def condition_number_estimate(A):
    """
    Estimate the condition number using power iteration.
    A high condition number means the matrix is ill-conditioned
    (the solution is sensitive to small errors in the data).
    """
    # This is a simplified estimate; the exact condition number
    # requires the singular values of A
    n = A.rows
    # Estimate ||A|| using power iteration
    x = Vector([1.0] * n)
    for _ in range(20):
        y = A.matvec(x)
        norm_y = y.magnitude()
        if norm_y < 1e-10:
            break
        x = y.normalize()
    norm_A = A.matvec(x).magnitude()
    # Estimate ||A⁻¹|| the same way
    try:
        A_inv = matrix_inverse(A)
        x = Vector([1.0] * n)
        for _ in range(20):
            y = A_inv.matvec(x)
            norm_y = y.magnitude()
            if norm_y < 1e-10:
                break
            x = y.normalize()
        norm_A_inv = A_inv.matvec(x).magnitude()
        return norm_A * norm_A_inv
    except ValueError:
        return float('inf')  # Singular
Example of Ill-Conditioning
# The Hilbert matrix is notoriously ill-conditioned
def hilbert_matrix(n):
    """Create the n×n Hilbert matrix with entries 1 / (i + j + 1)."""
    data = [[1.0 / (i + j + 1) for j in range(n)] for i in range(n)]
    return Matrix(data)

# Show the condition number growing with n
for n in [2, 3, 4, 5, 6]:
    H = hilbert_matrix(n)
    cond = condition_number_estimate(H)
    print(f"Hilbert {n}×{n}: condition ≈ {cond:.2e}")
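To see the effect in practice, build the right-hand side from a known solution and watch the recovered solution drift as n grows. A standalone sketch using plain-list Gaussian elimination with partial pivoting (the helpers `hilbert` and `solve` are defined here so the snippet runs on its own):

```python
def hilbert(n):
    """n×n Hilbert matrix as nested lists."""
    return [[1.0 / (i + j + 1) for j in range(n)] for i in range(n)]

def solve(A, b):
    """Gaussian elimination with partial pivoting on nested lists."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

errs = {}
for n in [4, 8, 12]:
    H = hilbert(n)
    b = [sum(row) for row in H]   # right-hand side for the all-ones solution
    x = solve(H, b)
    errs[n] = max(abs(xi - 1.0) for xi in x)
    print(f"n={n:2d}  max error = {errs[n]:.2e}")
```

The exact solution is all ones by construction, yet the error grows by orders of magnitude with n: the algorithm is fine, the problem itself is ill-conditioned.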
7. Special Systems
Tridiagonal Systems
def solve_tridiagonal(a, b, c, d):
    """
    Solve a tridiagonal system using the Thomas algorithm.
    The matrix has the form:
        [b₀ c₀ 0  0  ...]
        [a₁ b₁ c₁ 0  ...]
        [0  a₂ b₂ c₂ ...]
        ...
    O(n) time complexity.
    """
    n = len(b)
    # Forward sweep
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = c[0] / b[0]
    d_prime[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * c_prime[i-1]
        c_prime[i] = c[i] / denom if i < n - 1 else 0
        d_prime[i] = (d[i] - a[i] * d_prime[i-1]) / denom
    # Back substitution
    x = [0.0] * n
    x[n-1] = d_prime[n-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i+1]
    return Vector(x)

# Example: solve the tridiagonal system
# [ 2 -1  0  0] [x1]   [1]
# [-1  2 -1  0] [x2] = [0]
# [ 0 -1  2 -1] [x3]   [0]
# [ 0  0 -1  2] [x4]   [1]
a = [0, -1, -1, -1]   # Lower diagonal (a[0] unused)
b = [2, 2, 2, 2]      # Main diagonal
c = [-1, -1, -1, 0]   # Upper diagonal (c[n-1] unused)
d = [1, 0, 0, 1]      # Right-hand side
x = solve_tridiagonal(a, b, c, d)
print(f"Solution: {x}")  # Should be approximately [1, 1, 1, 1]
Symmetric Positive Definite: Cholesky Decomposition
import math

def cholesky(A):
    """
    Cholesky decomposition A = LL^T.
    Works only for symmetric positive definite matrices;
    about half the work of LU.
    """
    n = A.rows
    L = Matrix.zeros(n, n)
    for i in range(n):
        for j in range(i + 1):
            sum_val = sum(L[i, k] * L[j, k] for k in range(j))
            if i == j:
                val = A[i, i] - sum_val
                if val <= 0:
                    raise ValueError("Matrix is not positive definite")
                L[i, j] = math.sqrt(val)
            else:
                L[i, j] = (A[i, j] - sum_val) / L[j, j]
    return L
# Example with a symmetric positive definite matrix
A = Matrix([
    [4, 2, 2],
    [2, 5, 3],
    [2, 3, 6]
])
L = cholesky(A)
print(f"L =\n{L}")
# Verify: A = LL^T
LLT = L @ L.transpose()
print(f"LL^T =\n{LLT}")
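Once L is known, Ax = b splits into a forward solve Ly = b followed by a back solve L^T x = y, and the back solve can read the entries of L^T directly from L. A standalone sketch on plain lists (`cholesky_solve` is a name chosen for this example; the factor below is the one computed for the matrix A above, and b is its row sums so the solution is all ones):

```python
def cholesky_solve(L, b):
    """Solve Ax = b given the Cholesky factor L of A (nested lists)."""
    n = len(L)
    # Forward substitution: L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T x = y  (L^T[i][k] is just L[k][i])
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

# Cholesky factor of A = [[4,2,2],[2,5,3],[2,3,6]]
L = [[2.0, 0.0, 0.0],
     [1.0, 2.0, 0.0],
     [1.0, 1.0, 2.0]]
b = [8.0, 10.0, 11.0]   # row sums of A, so the exact solution is [1, 1, 1]
x = cholesky_solve(L, b)
print(x)   # [1.0, 1.0, 1.0]
```

As with LU, the factorization is done once and each additional right-hand side costs only O(n²).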
Exercises
Basic
Solve the system using Gaussian elimination:
    x + y + z = 6
    2x + y - z = 1
    x - y + 2z = 5
Compute the LU decomposition of a 3×3 matrix by hand, then verify with code.
Write a function to compute the determinant of a 3×3 matrix using Sarrus' rule.
Intermediate
Implement Gauss-Jordan elimination to compute the reduced row echelon form (RREF).
Implement a function to compute the rank of a matrix.
Compare the numerical accuracy of solving a system with and without partial pivoting for an ill-conditioned matrix.
Advanced
Implement iterative refinement to improve the accuracy of a solution.
Implement the Gauss-Seidel iterative method for solving sparse systems.
Compare the performance of LU vs Cholesky for solving multiple systems with the same coefficient matrix.
Summary
- Gaussian elimination transforms systems to row echelon form
- Partial pivoting improves numerical stability
- LU decomposition factors A = LU for efficient solving of multiple systems
- Cholesky decomposition is faster for symmetric positive definite matrices
- Condition number indicates sensitivity to numerical errors
- Special structures (tridiagonal, sparse) enable faster algorithms