Eigenvalues and Eigenvectors
Introduction
Eigenvalues and eigenvectors reveal the fundamental behavior of linear transformations. They're essential for principal component analysis (PCA), Google's PageRank, quantum mechanics, vibration analysis, and many other applications. This reading covers the theory and computation of eigenvalues and eigenvectors.
Learning Objectives
By the end of this reading, you will be able to:
- Define eigenvalues and eigenvectors geometrically and algebraically
- Compute eigenvalues and eigenvectors for small matrices
- Implement the power iteration algorithm
- Apply eigendecomposition to solve problems
- Understand the singular value decomposition (SVD)
1. Definitions
Geometric Intuition
When we apply a matrix A to most vectors, both direction and magnitude change. But for special vectors called eigenvectors, A only scales them:
Av = λv
- v is an eigenvector (v ≠ 0)
- λ is the corresponding eigenvalue (scaling factor)
import math

# Example: scaling matrix
# (Matrix and Vector are the classes built earlier in this module)
A = Matrix([
    [2, 0],
    [0, 3]
])
v1 = Vector([1, 0])  # Eigenvector with λ=2
v2 = Vector([0, 1])  # Eigenvector with λ=3
v3 = Vector([1, 1])  # Not an eigenvector

print(f"A × v1 = {A.matvec(v1)}")  # [2, 0] = 2 × [1, 0] ✓
print(f"A × v2 = {A.matvec(v2)}")  # [0, 3] = 3 × [0, 1] ✓
print(f"A × v3 = {A.matvec(v3)}")  # [2, 3] ≠ λ × [1, 1] for any λ
Algebraic Definition
From Av = λv, we get:
Av - λv = 0
(A - λI)v = 0
For non-zero v, (A - λI) must be singular, so:
det(A - λI) = 0
This is the characteristic equation. The polynomial det(A - λI) is the characteristic polynomial.
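As a hand check, take A = [[4, 2], [1, 3]] (the matrix used in the examples below). Then det(A − λI) = (4 − λ)(3 − λ) − 2·1 = λ² − 7λ + 10 = (λ − 5)(λ − 2), so the eigenvalues are 5 and 2. A quick sketch in plain Python (plain lists rather than the Matrix class) confirms the roots:

```python
# Characteristic polynomial of A = [[4, 2], [1, 3]]:
# det(A - λI) = (4 - λ)(3 - λ) - 2·1 = λ² - 7λ + 10
A = [[4, 2], [1, 3]]
trace = A[0][0] + A[1][1]                 # 7
det = A[0][0]*A[1][1] - A[0][1]*A[1][0]   # 10

def char_poly(lam):
    return lam**2 - trace*lam + det

# λ = 5 and λ = 2 are the roots, hence the eigenvalues
print(char_poly(5.0), char_poly(2.0))  # 0.0 0.0
```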
2. Computing Eigenvalues (Small Matrices)
2×2 Matrix
def eigenvalues_2x2(A):
    """
    Compute eigenvalues of a 2×2 matrix analytically.
    Characteristic polynomial: λ² - trace(A)λ + det(A) = 0
    """
    if A.rows != 2 or A.cols != 2:
        raise ValueError("Matrix must be 2×2")
    trace = A[0,0] + A[1,1]
    det = A[0,0] * A[1,1] - A[0,1] * A[1,0]
    # Quadratic formula
    discriminant = trace**2 - 4*det
    if discriminant >= 0:
        sqrt_d = math.sqrt(discriminant)
        lambda1 = (trace + sqrt_d) / 2
        lambda2 = (trace - sqrt_d) / 2
        return lambda1, lambda2
    else:
        # Complex-conjugate pair: return as Python complex numbers
        real = trace / 2
        imag = math.sqrt(-discriminant) / 2
        return complex(real, imag), complex(real, -imag)
def eigenvector_2x2(A, eigenvalue):
    """
    Compute an eigenvector for a given eigenvalue of a 2×2 matrix.
    Solve (A - λI)v = 0
    """
    a = A[0,0] - eigenvalue
    b = A[0,1]
    c = A[1,0]
    d = A[1,1] - eigenvalue
    # Find the null space of [a b; c d] (singular, so one row determines it)
    # If a ≠ 0, v = [-b, a] satisfies row 1 (and row 2 by singularity)
    # If a = 0 but b ≠ 0, row 1 forces v₂ = 0, so v = [1, 0]
    # If row 1 is zero but c ≠ 0, v = [d, -c] satisfies row 2
    if abs(a) > 1e-10:
        v = Vector([-b, a])
    elif abs(b) > 1e-10:
        v = Vector([1, 0])
    elif abs(c) > 1e-10:
        v = Vector([d, -c])
    else:
        v = Vector([1, 0])  # a = b = c = 0: row 2 forces v₂ = 0 (or A - λI = 0)
    return v.normalize()
# Example
A = Matrix([
    [4, 2],
    [1, 3]
])
eigenvalues = eigenvalues_2x2(A)
print(f"Eigenvalues: {eigenvalues}")

for lam in eigenvalues:
    if isinstance(lam, (int, float)):
        v = eigenvector_2x2(A, lam)
        Av = A.matvec(v)
        lv = lam * v
        print(f"λ = {lam:.4f}, v = {v}")
        print(f"  Av = {Av}, λv = {lv}")
Properties of Eigenvalues
"""
Key Properties:
1. Sum of eigenvalues = trace(A) = Σ Aᵢᵢ
2. Product of eigenvalues = det(A)
3. If A is triangular, eigenvalues are diagonal entries
4. Real symmetric matrices have real eigenvalues
5. λ(A⁻¹) = 1/λ(A)
6. λ(Aᵏ) = λ(A)ᵏ
7. λ(A + cI) = λ(A) + c
"""
def verify_eigenvalue_properties(A, eigenvalues):
    """Verify the trace and determinant properties"""
    trace = sum(A[i,i] for i in range(A.rows))
    det = determinant(A)  # from the earlier reading on determinants
    eig_sum = sum(eigenvalues)
    eig_prod = 1
    for e in eigenvalues:
        eig_prod *= e
    print(f"Trace: {trace}, Sum of eigenvalues: {eig_sum}")
    print(f"Det: {det}, Product of eigenvalues: {eig_prod}")
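verify_eigenvalue_properties relies on the Matrix class and the determinant helper from the earlier readings; the trace and determinant properties can also be checked in a self-contained way with plain lists for the 2×2 case:

```python
import math

# Check: sum of eigenvalues = trace(A), product of eigenvalues = det(A)
A = [[4, 2], [1, 3]]
trace = A[0][0] + A[1][1]
det = A[0][0]*A[1][1] - A[0][1]*A[1][0]

sqrt_d = math.sqrt(trace**2 - 4*det)
lam1 = (trace + sqrt_d) / 2   # 5.0
lam2 = (trace - sqrt_d) / 2   # 2.0

print(lam1 + lam2, trace)   # 7.0 7
print(lam1 * lam2, det)     # 10.0 10
```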
3. Power Iteration
Algorithm
Power iteration finds the eigenvalue of largest magnitude (the dominant eigenvalue) and its eigenvector.
def power_iteration(A, num_iterations=100, tolerance=1e-10):
    """
    Find the dominant eigenvalue and eigenvector using power iteration.
    Converges when one eigenvalue is strictly largest in magnitude;
    convergence rate: |λ₂/λ₁|
    """
    n = A.rows
    # Start vector (must have a component along the dominant eigenvector)
    v = Vector([1.0] * n).normalize()
    eigenvalue = 0
    for i in range(num_iterations):
        # Multiply by A
        Av = A.matvec(v)
        # New eigenvalue estimate (Rayleigh quotient, since v is a unit vector)
        new_eigenvalue = v.dot(Av)
        # Normalize
        norm = Av.magnitude()
        if norm < 1e-10:
            break
        v = Av.normalize()
        # Check convergence
        if abs(new_eigenvalue - eigenvalue) < tolerance:
            break
        eigenvalue = new_eigenvalue
    return eigenvalue, v
# Example
A = Matrix([
    [4, 2],
    [1, 3]
])
lam, v = power_iteration(A)
print(f"Dominant eigenvalue: {lam:.6f}")
print(f"Eigenvector: {v}")
# Verify
Av = A.matvec(v)
print(f"Av = {Av}")
print(f"λv = {lam * v}")
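To see the convergence behavior without the Matrix class, here is a self-contained sketch with plain Python lists. For this A the eigenvalues are 5 and 2, so the eigenvector error contracts by roughly |λ₂/λ₁| = 0.4 per step, and 60 iterations are far more than enough:

```python
import math

# Power iteration with plain lists: A = [[4, 2], [1, 3]], dominant eigenvalue 5
A = [[4.0, 2.0], [1.0, 3.0]]

def matvec(M, x):
    return [M[0][0]*x[0] + M[0][1]*x[1],
            M[1][0]*x[0] + M[1][1]*x[1]]

v = [1.0, 0.0]
for _ in range(60):
    Av = matvec(A, v)
    norm = math.hypot(Av[0], Av[1])
    v = [Av[0]/norm, Av[1]/norm]   # normalize each step

Av = matvec(A, v)
estimate = v[0]*Av[0] + v[1]*Av[1]   # Rayleigh quotient of the converged vector
print(round(estimate, 6))  # 5.0
```

The converged vector is [2, 1]/√5, the eigenvector for λ = 5.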
Inverse Iteration
Find eigenvalue closest to a given value σ.
def inverse_iteration(A, sigma, num_iterations=100, tolerance=1e-10):
    """
    Find the eigenvalue closest to sigma using inverse iteration.
    Works by applying power iteration to (A - σI)⁻¹
    """
    n = A.rows
    # Compute (A - σI)
    shifted = Matrix([[A[i,j] - (sigma if i == j else 0)
                       for j in range(n)] for i in range(n)])
    # Start vector
    v = Vector([1.0] * n).normalize()
    eigenvalue = sigma
    for iteration in range(num_iterations):
        # Solve (A - σI)w = v instead of forming the inverse explicitly
        try:
            w = gaussian_elimination(shifted, v.components)
        except Exception:
            break  # (A - σI) is singular: sigma is (numerically) an eigenvalue
        # New eigenvalue estimate: v·w ≈ 1/(λ - σ)
        new_eigenvalue = sigma + 1.0 / v.dot(w)
        # Normalize
        v = w.normalize()
        if abs(new_eigenvalue - eigenvalue) < tolerance:
            break
        eigenvalue = new_eigenvalue
    return eigenvalue, v
# Find eigenvalue closest to 3
lam, v = inverse_iteration(A, 3.0)
print(f"Eigenvalue closest to 3: {lam:.6f}")
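The reason inverse iteration homes in on the eigenvalue nearest σ: the eigenvalues of (A − σI)⁻¹ are 1/(λᵢ − σ), so the λᵢ closest to σ becomes the dominant eigenvalue of the shifted inverse. For A above (eigenvalues 5 and 2) with σ = 3:

```python
# Shift-invert spectrum mapping: eigenvalues of (A - σI)⁻¹ are 1/(λ - σ)
sigma = 3.0
eigs = [5.0, 2.0]                              # eigenvalues of A
mapped = [1.0 / (lam - sigma) for lam in eigs]
print(mapped)  # [0.5, -1.0]: the eigenvalue nearest σ (namely 2) now dominates
```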
QR Algorithm (Conceptual)
The standard algorithm for finding all eigenvalues.
def qr_decomposition_simple(A):
    """
    Simple QR decomposition using Gram-Schmidt.
    A = QR where Q is orthogonal, R is upper triangular
    """
    n = A.rows
    Q = Matrix.zeros(n, n)
    R = Matrix.zeros(n, n)
    for j in range(n):
        # Get column j of A
        v = Vector([A[i, j] for i in range(n)])
        # Subtract projections onto previous q vectors
        for i in range(j):
            q_i = Vector([Q[k, i] for k in range(n)])
            R[i, j] = q_i.dot(v)
            v = v - R[i, j] * q_i
        R[j, j] = v.magnitude()
        if R[j, j] > 1e-10:
            for i in range(n):
                Q[i, j] = v[i] / R[j, j]
    return Q, R
def qr_algorithm(A, num_iterations=100):
    """
    QR algorithm for finding all eigenvalues.
    Iterate: factor Aₖ = QR, then set Aₖ₊₁ = RQ.
    For matrices with real eigenvalues, Aₖ converges to upper
    triangular form with the eigenvalues on the diagonal.
    """
    Ak = Matrix([[A[i,j] for j in range(A.cols)] for i in range(A.rows)])
    for _ in range(num_iterations):
        Q, R = qr_decomposition_simple(Ak)
        Ak = R @ Q
    # Eigenvalues are on the diagonal
    eigenvalues = [Ak[i, i] for i in range(Ak.rows)]
    return eigenvalues
# Example
A = Matrix([
    [4, 2],
    [1, 3]
])
eigenvalues = qr_algorithm(A)
print(f"Eigenvalues (QR): {eigenvalues}")
4. Eigendecomposition
Diagonalization
If A has n linearly independent eigenvectors, then:
A = VΛV⁻¹
where:
- V = [v₁ | v₂ | ... | vₙ] (eigenvectors as columns)
- Λ = diag(λ₁, λ₂, ..., λₙ) (eigenvalues on diagonal)
def eigendecomposition_2x2(A):
    """
    Eigendecomposition for a 2×2 matrix.
    Returns V, Lambda such that A = V @ Lambda @ V⁻¹
    """
    eigenvalues = eigenvalues_2x2(A)
    # Check for complex eigenvalues
    if not all(isinstance(e, (int, float)) for e in eigenvalues):
        raise ValueError("Complex eigenvalues not supported")
    v1 = eigenvector_2x2(A, eigenvalues[0])
    v2 = eigenvector_2x2(A, eigenvalues[1])
    V = Matrix([
        [v1[0], v2[0]],
        [v1[1], v2[1]]
    ])
    Lambda = Matrix.diagonal(list(eigenvalues))
    return V, Lambda
# Example
A = Matrix([
    [4, 2],
    [1, 3]
])
V, Lambda = eigendecomposition_2x2(A)
print(f"V =\n{V}")
print(f"Λ =\n{Lambda}")
# Verify A = VΛV⁻¹
V_inv = matrix_inverse(V)
reconstructed = V @ Lambda @ V_inv
print(f"VΛV⁻¹ =\n{reconstructed}")
Matrix Powers via Eigendecomposition
def matrix_power(A, k):
    """
    Compute A^k efficiently using eigendecomposition:
    A^k = V Λ^k V⁻¹
    (limited to 2×2 here because it relies on eigendecomposition_2x2)
    """
    V, Lambda = eigendecomposition_2x2(A)
    V_inv = matrix_inverse(V)
    # Raise the eigenvalues to the power k
    Lambda_k = Matrix.diagonal([Lambda[i,i]**k for i in range(Lambda.rows)])
    return V @ Lambda_k @ V_inv
# Example: A^10
A = Matrix([
    [1.1, 0.1],
    [0.1, 0.9]
])
A_10 = matrix_power(A, 10)
print(f"A^10 =\n{A_10}")

# Compare with direct computation
A_direct = A
for _ in range(9):
    A_direct = A_direct @ A
print(f"A^10 (direct) =\n{A_direct}")
5. Symmetric Matrices
Symmetric matrices have special properties that make them easier to work with.
Spectral Theorem
For real symmetric A:
- All eigenvalues are real
- Eigenvectors can be chosen to be orthonormal
- A = QΛQᵀ where Q is orthogonal (Q⁻¹ = Qᵀ)
def verify_symmetric_properties(A):
    """Verify that a symmetric matrix has real eigenvalues"""
    if not A.is_symmetric():
        print("Matrix is not symmetric")
        return
    eigenvalues = qr_algorithm(A, num_iterations=200)
    print(f"Eigenvalues: {eigenvalues}")
    print(f"All real: {all(isinstance(e, (int, float)) for e in eigenvalues)}")
# Example
A = Matrix([
    [4, 2, 1],
    [2, 5, 3],
    [1, 3, 6]
])
verify_symmetric_properties(A)
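The orthogonality claim in the spectral theorem can also be checked by hand. A self-contained sketch (plain lists rather than the Matrix class) for a symmetric 2×2 example:

```python
import math

# Symmetric matrix: eigenvectors for distinct eigenvalues are orthogonal.
# A = [[4, 2], [2, 3]] has eigenvalues (7 ± √17)/2.
A = [[4.0, 2.0], [2.0, 3.0]]
trace = A[0][0] + A[1][1]
det = A[0][0]*A[1][1] - A[0][1]*A[1][0]
sqrt_d = math.sqrt(trace**2 - 4*det)
lam1 = (trace + sqrt_d) / 2
lam2 = (trace - sqrt_d) / 2

# Row 1 of (A - λI) is [4 - λ, 2], so v = [-2, 4 - λ] is in its null space
v1 = [-A[0][1], A[0][0] - lam1]
v2 = [-A[0][1], A[0][0] - lam2]

dot = v1[0]*v2[0] + v1[1]*v2[1]
print(abs(dot) < 1e-9)  # True: the two eigenvectors are orthogonal
```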
Rayleigh Quotient
For symmetric A:
R(x) = (xᵀAx) / (xᵀx)
The Rayleigh quotient equals the eigenvalue when x is an eigenvector; for any other nonzero x it lies between the smallest and largest eigenvalues.
def rayleigh_quotient(A, x):
    """Compute the Rayleigh quotient for symmetric A"""
    Ax = A.matvec(x)
    return x.dot(Ax) / x.dot(x)

# For any nonzero x: λ_min ≤ R(x) ≤ λ_max
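A self-contained sketch (plain lists, not the Matrix class) that samples directions around the unit circle and checks the bound for a symmetric 2×2 matrix:

```python
import math

# Rayleigh quotient bounds for the symmetric matrix A = [[4, 2], [2, 3]]
A = [[4.0, 2.0], [2.0, 3.0]]
trace = A[0][0] + A[1][1]
det = A[0][0]*A[1][1] - A[0][1]*A[1][0]
sqrt_d = math.sqrt(trace**2 - 4*det)
lam_max = (trace + sqrt_d) / 2
lam_min = (trace - sqrt_d) / 2

def rayleigh(x):
    Ax = [A[0][0]*x[0] + A[0][1]*x[1],
          A[1][0]*x[0] + A[1][1]*x[1]]
    return (x[0]*Ax[0] + x[1]*Ax[1]) / (x[0]*x[0] + x[1]*x[1])

# Sample unit vectors around the circle: every value stays inside the bounds,
# and the extremes are approached near the eigenvector directions
values = [rayleigh([math.cos(0.01*t), math.sin(0.01*t)]) for t in range(629)]
print(lam_min <= min(values) and max(values) <= lam_max)  # True
```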
6. Singular Value Decomposition (SVD)
Definition
Any m×n matrix A can be decomposed as:
A = UΣVᵀ
where:
- U is m×m orthogonal (left singular vectors)
- Σ is m×n diagonal (singular values σ₁ ≥ σ₂ ≥ ... ≥ 0)
- V is n×n orthogonal (right singular vectors)
def svd_2x2(A):
    """
    SVD for a 2×2 matrix.
    Uses the relationship: AᵀA has eigenvectors V and eigenvalues σ²
    """
    # Compute AᵀA
    At = A.transpose()
    AtA = At @ A
    # Eigenvalues of AᵀA are the squared singular values
    eigenvalues = eigenvalues_2x2(AtA)
    # Singular values (square roots, largest first)
    singular_values = [math.sqrt(max(0, e)) for e in sorted(eigenvalues, reverse=True)]
    # V columns are eigenvectors of AᵀA
    v1 = eigenvector_2x2(AtA, eigenvalues[0])
    v2 = eigenvector_2x2(AtA, eigenvalues[1])
    V = Matrix([
        [v1[0], v2[0]],
        [v1[1], v2[1]]
    ])
    # U columns: u_i = Av_i / σ_i
    Sigma = Matrix.diagonal(singular_values)
    u_cols = []
    for i, sigma in enumerate(singular_values):
        v_i = Vector([V[j, i] for j in range(V.rows)])
        if sigma > 1e-10:
            u_i = (1/sigma) * A.matvec(v_i)
        else:
            # Zero singular value: any unit vector orthogonal to the
            # existing u columns works (simplified choice here)
            u_i = Vector([1, 0]) if i == 0 else Vector([0, 1])
        u_cols.append(u_i)
    U = Matrix([
        [u_cols[0][0], u_cols[1][0]],
        [u_cols[0][1], u_cols[1][1]]
    ])
    return U, Sigma, V.transpose()
# Example
A = Matrix([
    [3, 2],
    [2, 3]
])
U, Sigma, Vt = svd_2x2(A)
print(f"U =\n{U}")
print(f"Σ =\n{Sigma}")
print(f"Vᵀ =\n{Vt}")
# Verify A = UΣVᵀ
reconstructed = U @ Sigma @ Vt
print(f"UΣVᵀ =\n{reconstructed}")
Applications of SVD
# 1. Low-rank approximation
def truncated_svd(U, Sigma, Vt, k):
    """Keep only the top k singular values"""
    U_k = Matrix([[U[i,j] for j in range(k)] for i in range(U.rows)])
    Sigma_k = Matrix([[Sigma[i,j] for j in range(k)] for i in range(k)])
    Vt_k = Matrix([[Vt[i,j] for j in range(Vt.cols)] for i in range(k)])
    return U_k @ Sigma_k @ Vt_k

# 2. Pseudoinverse
def pseudoinverse_via_svd(U, Sigma, Vt):
    """Compute A⁺ = VΣ⁺Uᵀ"""
    # Σ⁺: transposed shape, non-zero diagonal entries inverted
    Sigma_plus = Matrix.zeros(Sigma.cols, Sigma.rows)
    for i in range(min(Sigma.rows, Sigma.cols)):
        if Sigma[i,i] > 1e-10:
            Sigma_plus[i,i] = 1.0 / Sigma[i,i]
    V = Vt.transpose()
    Ut = U.transpose()
    return V @ Sigma_plus @ Ut

# 3. Condition number
def condition_number_svd(Sigma):
    """σ_max / σ_min over the non-zero singular values"""
    sigmas = [Sigma[i,i] for i in range(min(Sigma.rows, Sigma.cols))]
    sigmas = [s for s in sigmas if s > 1e-10]
    if len(sigmas) < 2:
        return float('inf')
    return max(sigmas) / min(sigmas)
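For a symmetric positive definite matrix the singular values coincide with the eigenvalues, so the condition number of a 2×2 example can be checked straight from the quadratic formula. A plain-Python sketch using the A = [[3, 2], [2, 3]] example from above, whose singular values are 5 and 1:

```python
import math

# For symmetric positive definite A, singular values equal eigenvalues,
# so σ_max/σ_min follows from the 2×2 quadratic formula directly.
A = [[3.0, 2.0], [2.0, 3.0]]
trace = A[0][0] + A[1][1]
det = A[0][0]*A[1][1] - A[0][1]*A[1][0]
sqrt_d = math.sqrt(trace**2 - 4*det)
sigma_max = (trace + sqrt_d) / 2   # 5.0
sigma_min = (trace - sqrt_d) / 2   # 1.0
print(sigma_max / sigma_min)       # 5.0
```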
7. Applications
Principal Component Analysis (PCA)
def pca(data, n_components=2):
    """
    Principal Component Analysis.
    data: list of data points (as Vectors)
    Returns: principal components, projected data, and eigenvalues
    """
    n = len(data)
    d = len(data[0])
    # Center the data
    mean = Vector([sum(data[i][j] for i in range(n)) / n for j in range(d)])
    centered = [Vector([data[i][j] - mean[j] for j in range(d)]) for i in range(n)]
    # Compute the covariance matrix
    cov = Matrix.zeros(d, d)
    for point in centered:
        for i in range(d):
            for j in range(d):
                cov[i,j] += point[i] * point[j] / (n - 1)
    # Get eigenvalues/eigenvectors of the covariance matrix.
    # In practice, use SVD on the centered data matrix;
    # for 2D data we can compute them analytically.
    if d != 2:
        raise NotImplementedError("This sketch only handles 2D data")
    eigenvalues = eigenvalues_2x2(cov)
    pc1 = eigenvector_2x2(cov, max(eigenvalues))
    pc2 = eigenvector_2x2(cov, min(eigenvalues))
    # Project the data onto the principal components
    projected = []
    for point in centered:
        proj = Vector([point.dot(pc1), point.dot(pc2)])
        projected.append(proj)
    return [pc1, pc2], projected, eigenvalues
# Example: correlated 2D data (the shared component t makes
# the [1, 1] direction dominant)
import random

data = []
for _ in range(100):
    t = random.gauss(0, 1)
    noise = random.gauss(0, 0.5)
    data.append(Vector([t + noise, t - noise]))
pcs, projected, variances = pca(data)
print(f"Principal components: {pcs}")
print(f"Variance explained: {variances}")
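Because the example above uses random data, its output varies from run to run. A deterministic plain-Python sketch on points clustered along the line y = x shows the expected behavior: the first principal component should align with [1, 1]/√2:

```python
import math

# Deterministic PCA sketch: points near the line y = x
points = [(-3.0, -2.9), (-1.0, -1.2), (0.0, 0.1), (1.0, 0.9), (3.0, 3.1)]
n = len(points)
mx = sum(p[0] for p in points) / n
my = sum(p[1] for p in points) / n

# Entries of the 2×2 covariance matrix
cxx = sum((p[0] - mx)**2 for p in points) / (n - 1)
cyy = sum((p[1] - my)**2 for p in points) / (n - 1)
cxy = sum((p[0] - mx)*(p[1] - my) for p in points) / (n - 1)

# Largest eigenvalue of [[cxx, cxy], [cxy, cyy]] by the quadratic formula
trace = cxx + cyy
det = cxx*cyy - cxy*cxy
lam1 = (trace + math.sqrt(trace**2 - 4*det)) / 2

# Its eigenvector: row 1 of (C - λ₁I) is [cxx - λ₁, cxy]
v = [-cxy, cxx - lam1]
norm = math.hypot(v[0], v[1])
pc1 = [v[0]/norm, v[1]/norm]

# Alignment with [1, 1]/√2 (the sign of an eigenvector is arbitrary)
align = abs((pc1[0] + pc1[1]) / math.sqrt(2))
print(align > 0.99)  # True
```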
Google PageRank
def pagerank(adjacency, damping=0.85, iterations=100):
    """
    PageRank algorithm using power iteration.
    adjacency[i][j] = 1 if page j links to page i
    """
    n = len(adjacency)
    # Build the column-stochastic transition matrix
    M = Matrix.zeros(n, n)
    for j in range(n):
        out_degree = sum(adjacency[i][j] for i in range(n))
        if out_degree > 0:
            for i in range(n):
                if adjacency[i][j]:
                    M[i,j] = 1.0 / out_degree
        else:
            # Dangling node: distribute evenly
            for i in range(n):
                M[i,j] = 1.0 / n
    # Apply damping: G = d·M + (1-d)/n · ones
    G = Matrix([[damping * M[i,j] + (1-damping)/n
                 for j in range(n)] for i in range(n)])
    # Power iteration (G is column-stochastic, so no renormalization needed)
    r = Vector([1.0/n] * n)
    for _ in range(iterations):
        r = G.matvec(r)
    return r
# Example: Simple web graph
# 0 → 1 → 2
# ↑   ↓   ↓
# └── 3 ←─┘
adj = [
    [0, 0, 0, 1],  # 3 links to 0
    [1, 0, 0, 0],  # 0 links to 1
    [0, 1, 0, 0],  # 1 links to 2
    [0, 1, 1, 0],  # 1 and 2 link to 3
]
ranks = pagerank(adj)
print(f"PageRank scores: {ranks}")
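The same computation can be sketched without the Matrix class, using an adjacency list. A useful sanity check: because the damped matrix is column-stochastic, the ranks remain a probability distribution throughout:

```python
# PageRank on the 4-page example graph, with plain lists.
# Links: 0->1, 1->2, 1->3, 2->3, 3->0 (the same graph as above)
links = {0: [1], 1: [2, 3], 2: [3], 3: [0]}
n, d = 4, 0.85

r = [1.0 / n] * n
for _ in range(100):
    new_r = [(1 - d) / n] * n          # teleportation term
    for src, outs in links.items():
        for dst in outs:
            new_r[dst] += d * r[src] / len(outs)
    r = new_r

print(sum(r))  # ≈ 1.0: the ranks stay a probability distribution
```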
Exercises
Basic
- Find the eigenvalues and eigenvectors of [[3, 1], [0, 2]] by hand.
- Verify that eigenvectors of a symmetric matrix corresponding to different eigenvalues are orthogonal.
- Show that similar matrices (B = P⁻¹AP) have the same eigenvalues.
Intermediate
- Implement power iteration with deflation to find all eigenvalues.
- Compute the SVD of a 3×2 matrix and verify A = UΣVᵀ.
- Implement PCA and apply it to a dataset. Visualize the principal components.
Advanced
- Implement the QR algorithm with shifts for faster convergence.
- Use SVD to implement image compression (represent the image as a matrix, keep the top k singular values).
- Implement a spectral clustering algorithm using eigenvectors of the graph Laplacian.
Summary
- Eigenvectors are directions preserved by a transformation (only scaled)
- Eigenvalues are the scaling factors
- Power iteration finds the dominant eigenvalue
- Eigendecomposition A = VΛV⁻¹ enables efficient matrix powers
- Symmetric matrices have real eigenvalues and orthogonal eigenvectors
- SVD A = UΣVᵀ works for any matrix, enables low-rank approximation
- Applications: PCA, PageRank, image compression, stability analysis
Module Complete
This completes the Linear Algebra module! You've learned:
- Vectors and matrices with their operations
- Solving linear systems with Gaussian elimination and LU decomposition
- Eigenvalues, eigenvectors, and the SVD
These tools are fundamental to machine learning, graphics, scientific computing, and data analysis.