Av = λv. Vectors that only scale under transformation.
Deep-dive lesson - accessible entry point but dense material. Use worked examples and spaced repetition.
A matrix can rotate, stretch, shear, and reflect space. Yet for many transformations there are special directions that behave unusually simply: vectors that don’t “turn” at all—they only scale. Those directions (eigenvectors) and their scale factors (eigenvalues) are the key to understanding long‑term dynamics, stability, and diagonalization.
Eigenvectors are nonzero vectors v such that Av = λv. They point in directions left unchanged by A (only scaled). Eigenvalues λ are found by solving det(A − λI) = 0. Once you have them, you can analyze repeated application of A, decouple coupled systems, and connect linear algebra to Markov chains, SVD, and graphs.
Linear transformations can be complicated. A general 2D transformation might rotate vectors, stretch them unequally, and shear them so that angles change. If you want to predict what happens after applying the transformation many times—A², A³, …—the complexity compounds.
Eigenvectors and eigenvalues give you simple “coordinate axes” (when they exist) in which the transformation behaves like independent scaling along special directions. This is the central idea behind diagonalization, stability analysis, and many algorithms.
Let A be an n×n matrix (a linear transformation). An eigenvector of A is a nonzero vector v such that
Av = λv
where λ is a scalar. The scalar λ is the eigenvalue corresponding to v.
Interpretation:
If λ > 1, vectors along that eigenvector direction grow.
If 0 < λ < 1, they shrink.
If λ = 0, that direction is sent to the zero vector.
If λ < 0, the direction is flipped and scaled.
If v = 0, then Av = 0 for any A, and you could write A0 = λ0 for any λ. That would make the definition meaningless. Requiring v ≠ 0 forces the relationship to describe a genuine property of the transformation.
Think of A acting on the unit circle in 2D:
If you draw the line through the origin along v, then A maps that entire line back onto itself (possibly reversed). The action on that line is simply “multiply by λ.”
Not every matrix has enough eigenvectors to form a basis. Some matrices have repeated eigenvalues with too few independent eigenvectors (defective matrices), and some have no real eigenvalues at all (like 2D rotations, whose eigenvalues are complex).
This lesson focuses on: (1) how to compute eigenvalues/eigenvectors, (2) what they mean, and (3) why they matter for applications.
The eigenvector equation is:
Av = λv
Bring everything to one side:
Av − λv = 0
Factor v using the identity matrix I (since λv = (λI)v):
(A − λI)v = 0
This is a homogeneous linear system. You already know a key fact about such systems: (A − λI)v = 0 has a nonzero solution v if and only if the matrix A − λI is singular (not invertible), which happens exactly when its determinant is zero.
So λ must satisfy:
det(A − λI) = 0
This equation is called the characteristic equation, and det(A − λI) is the characteristic polynomial.
Start:
Av = λv
Subtract λv:
Av − λv = 0
Rewrite λv as (λI)v:
Av − (λI)v = 0
Factor out v:
(A − λI)v = 0
For v ≠ 0 to exist, we need:
(A − λI) is singular
Equivalent to:
det(A − λI) = 0
1) Form A − λI
2) Compute det(A − λI)
3) Solve det(A − λI) = 0 for λ
For 2×2 matrices, this is quick. For 3×3 it’s manageable. For larger matrices, you typically use numerical algorithms.
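These three steps can be sketched numerically; NumPy's `np.linalg.eig` performs them in one call. A minimal check using the symmetric matrix from the worked example later in this lesson:

```python
import numpy as np

# The 2x2 matrix used in the worked example later in this lesson
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Steps 1-3 by hand give det(A - lam*I) = lam^2 - 4*lam + 3 = (lam - 1)(lam - 3).
# Numerically, np.linalg.eig carries out all three steps at once.
eigvals, eigvecs = np.linalg.eig(A)
print(np.sort(eigvals))  # close to [1. 3.]
```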
Let
A = [[a, b], [c, d]]
Then
A − λI = [[a−λ, b], [c, d−λ]]
Compute determinant:
det(A − λI) = (a−λ)(d−λ) − bc
= λ² − (a + d)λ + (ad − bc)
So eigenvalues satisfy a quadratic.
For 2×2: the coefficient of λ is minus the trace (a + d) and the constant term is the determinant (ad − bc), so λ₁ + λ₂ = tr(A) and λ₁λ₂ = det(A).
This is not just a coincidence—it generalizes (with more algebra) to higher dimensions via the characteristic polynomial.
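A quick numerical check of two facts implied by this quadratic: the sum of the eigenvalues equals the trace a + d, and their product equals the determinant ad − bc (sketch using the worked-example matrix):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # matrix from the worked example
lams = np.linalg.eigvals(A)

# The characteristic polynomial's coefficients encode two invariants:
# sum of eigenvalues = trace, product of eigenvalues = determinant.
assert np.isclose(lams.sum(), np.trace(A))          # 1 + 3 == 4
assert np.isclose(lams.prod(), np.linalg.det(A))    # 1 * 3 == 3
```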
det(A − λI) = 0 may have complex roots.
Example: a pure rotation in 2D has no real invariant directions, so it has complex eigenvalues/eigenvectors (over ℂ).
For many applications in ML and graphs, the matrices are symmetric (or related to symmetric), and then all eigenvalues are real and eigenvectors can be chosen orthonormal. But you should know that in general eigenvalues can be complex.
Once you have an eigenvalue λ, the eigenvector equation becomes:
(A − λI)v = 0
This is a homogeneous system. The set of all solutions is the null space (kernel) of (A − λI). Any nonzero vector in that null space is an eigenvector.
Because you already know Gaussian elimination, the workflow is familiar:
1) Plug in λ
2) Row-reduce (A − λI)
3) Express solutions with free variables
4) Pick any nonzero solution vector
If v is an eigenvector, then any nonzero scalar multiple αv is also an eigenvector with the same eigenvalue:
A(αv) = αAv = α(λv) = λ(αv)
So eigenvectors are not “unique”; they represent a direction. More generally, if an eigenvalue has multiple independent eigenvectors, they form an eigenspace, which is a subspace.
The characteristic equation det(A − λI) = 0 can have repeated roots, so an eigenvalue may appear more than once. The number of times it appears as a root is its algebraic multiplicity; the dimension of its eigenspace is its geometric multiplicity.
You always have:
1 ≤ geometric multiplicity ≤ algebraic multiplicity
If geometric multiplicity is smaller, the matrix may not have enough eigenvectors to diagonalize.
If A has n linearly independent eigenvectors v₁,…,vₙ with eigenvalues λ₁,…,λₙ, form the matrix:
V = [ v₁ v₂ … vₙ ]
Then:
AV = A[v₁ … vₙ]
= [Av₁ … Avₙ]
= [λ₁v₁ … λₙvₙ]
= [v₁ … vₙ] diag(λ₁,…,λₙ)
= VΛ
So:
AV = VΛ
If V is invertible (i.e., eigenvectors are independent), multiply by V⁻¹:
A = VΛV⁻¹
This is diagonalization: in the eigenvector basis, A acts like independent scaling.
If A = VΛV⁻¹, then
A² = (VΛV⁻¹)(VΛV⁻¹) = VΛ²V⁻¹
Aᵏ = VΛᵏV⁻¹
Since Λ is diagonal, Λᵏ just raises each eigenvalue:
Λᵏ = diag(λ₁ᵏ, …, λₙᵏ)
This is the core reason eigenvalues matter for long-term behavior.
Many systems update a state by multiplying by a matrix:
xₖ₊₁ = Axₖ
If you can express x₀ in an eigenvector basis, you can see which components grow, shrink, oscillate, or remain fixed.
A Markov chain with n states often uses a transition matrix P where (depending on convention) columns or rows sum to 1. A stationary distribution π satisfies:
Pπ = π
That is exactly an eigenvector equation with eigenvalue λ = 1:
Pπ = 1·π
So π is an eigenvector of P corresponding to eigenvalue 1.
Why λ = 1 matters: applying P leaves π unchanged, so π is invariant under the dynamics, and for well-behaved chains repeated application of P converges to π from many starting distributions.
SVD is A = UΣVᵀ. While eigenvalues are not the same as singular values, there is a deep link:
AᵀA v = σ² v
So learning eigenvalues now sets you up to understand SVD later.
Graphs can be encoded by matrices, such as the adjacency matrix and the graph Laplacian L.
Eigenvalues of L tell you about connectivity and cluster structure: the multiplicity of the eigenvalue 0 counts connected components, and small nonzero eigenvalues indicate near-disconnected clusters.
The reason: many graph processes are linear updates (random walks, diffusion), and eigen-decompositions reveal the “modes” of the network.
In continuous-time linear systems:
dx/dt = Ax
Solutions involve e^{tA}. Eigenvalues of A determine whether trajectories grow or decay. Roughly: eigenvalues with negative real part give decaying modes, positive real part gives growing modes, and nonzero imaginary parts give oscillation.
Even if you don’t compute e^{tA} directly, eigenvalues summarize stability.
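A sketch of this stability check with two hypothetical matrices: classify by the largest real part among the eigenvalues.

```python
import numpy as np

def max_real_part(A):
    """Largest real part among the eigenvalues of A."""
    return np.max(np.real(np.linalg.eigvals(A)))

A_decay = np.array([[-1.0, 0.0], [0.0, -2.0]])   # all Re(lam) < 0: trajectories decay
A_grow  = np.array([[ 0.5, 0.0], [0.0, -1.0]])   # some Re(lam) > 0: trajectories grow

assert max_real_part(A_decay) < 0
assert max_real_part(A_grow) > 0
```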
| Problem type | Matrix equation | Eigenvalue role | What you learn |
|---|---|---|---|
| Long-run discrete dynamics | xₖ₊₁ = Axₖ | magnitudes | growth/decay per mode |
| Stationary distributions | Pπ = π | λ = 1 | invariant distribution |
| PCA/SVD prep | AᵀA v = σ² v | σ² are eigenvalues | variance directions |
| Graph connectivity | Lv = λv | small λ | components/clusters |
The common thread: eigenvectors give “modes,” eigenvalues give their strength.
Let A = [[2, 1], [1, 2]]. Find eigenvalues λ and eigenvectors v.
Start from det(A − λI) = 0.
A − λI = [[2−λ, 1], [1, 2−λ]].
Compute determinant:
det(A − λI) = (2−λ)(2−λ) − 1·1
= (2−λ)² − 1
= (4 − 4λ + λ²) − 1
= λ² − 4λ + 3.
Solve λ² − 4λ + 3 = 0:
(λ − 1)(λ − 3) = 0
So eigenvalues are λ₁ = 1 and λ₂ = 3.
Find eigenvectors for λ = 3:
A − 3I = [[2−3, 1], [1, 2−3]] = [[−1, 1], [1, −1]].
Solve (A − 3I)v = 0.
Let v = (x, y)ᵀ.
Equations: −x + y = 0 ⇒ y = x.
So eigenvectors are all nonzero multiples of (1, 1)ᵀ.
Find eigenvectors for λ = 1:
A − I = [[1, 1], [1, 1]].
Solve (A − I)v = 0.
Equations: x + y = 0 ⇒ y = −x.
So eigenvectors are all nonzero multiples of (1, −1)ᵀ.
Insight: This symmetric matrix stretches space by 3 along the (1,1) direction and by 1 along the (1,−1) direction. In the eigenvector basis, the transformation becomes diagonal: independent scaling along two perpendicular directions.
Using the same A = [[2,1],[1,2]], compute Aᵏ acting on x₀ = (1,0)ᵀ, i.e., xₖ = Aᵏx₀.
From Example 1, eigenpairs are:
λ₁ = 3 with eigenvector v₁ = (1,1)ᵀ,
λ₂ = 1 with eigenvector v₂ = (1,−1)ᵀ.
Form V and Λ:
V = [v₁ v₂] = [[1, 1], [1, −1]].
Λ = diag(3, 1).
Express x₀ in the eigenvector basis:
We want coefficients a, b such that
x₀ = av₁ + bv₂.
So (1,0)ᵀ = a(1,1)ᵀ + b(1,−1)ᵀ
= (a+b, a−b)ᵀ.
Solve for a, b:
a + b = 1
a − b = 0
Add equations: 2a = 1 ⇒ a = 1/2.
Then b = 1 − a = 1/2.
Apply Aᵏ using eigenvalues:
Aᵏx₀ = aλ₁ᵏv₁ + bλ₂ᵏv₂
= (1/2)·3ᵏ(1,1)ᵀ + (1/2)·1ᵏ(1,−1)ᵀ.
Write components explicitly:
xₖ = ( (1/2)(3ᵏ + 1), (1/2)(3ᵏ − 1) )ᵀ.
Insight: Repeated application of A amplifies the component along the dominant eigenvector (1,1) by 3ᵏ, while the component along (1,−1) stays unchanged (since λ = 1). This is the basic ‘dominant eigenvalue’ phenomenon behind power iteration and Markov chain convergence behavior.
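The dominant-eigenvalue effect can be seen directly with power iteration; a minimal sketch using the same A and x₀:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
x = np.array([1.0, 0.0])          # x0 from the example

# Power iteration: repeatedly apply A and normalize. The direction
# converges to the dominant eigenvector (1,1)/sqrt(2), and the
# Rayleigh quotient converges to the dominant eigenvalue 3.
for _ in range(50):
    x = A @ x
    x = x / np.linalg.norm(x)

lam_est = x @ A @ x               # Rayleigh quotient (x is unit length)
assert np.isclose(lam_est, 3.0)
assert np.allclose(np.abs(x), np.ones(2) / np.sqrt(2))
```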
Let A = [[4, 0], [0, 4]]. Find eigenvalues/eigenvectors and interpret.
Compute A − λI = [[4−λ, 0], [0, 4−λ]].
det(A − λI) = (4−λ)(4−λ) = (4−λ)².
So the only eigenvalue is λ = 4 with algebraic multiplicity 2.
Solve (A − 4I)v = 0:
A − 4I = [[0, 0], [0, 0]].
So 0·x + 0·y = 0 gives no constraints.
Therefore every nonzero vector v ∈ ℝ² is an eigenvector:
Av = 4v for all v.
Insight: When A is 4I, the transformation is uniform scaling by 4 in every direction. The eigenspace for λ = 4 is the entire plane (geometric multiplicity 2). This contrasts with matrices that have a repeated eigenvalue but not enough independent eigenvectors.
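A quick check of this case in NumPy: any nonzero vector is an eigenvector, and the geometric multiplicity equals 2.

```python
import numpy as np

A = 4.0 * np.eye(2)                     # A = 4I from this example
v = np.array([3.0, -7.0])               # an arbitrary nonzero vector
assert np.allclose(A @ v, 4.0 * v)      # every direction is an eigenvector

# Geometric multiplicity of lambda = 4: dim null(A - 4I) = 2 - rank(A - 4I)
geo_mult = 2 - np.linalg.matrix_rank(A - 4.0 * np.eye(2))
assert geo_mult == 2                    # eigenspace is the whole plane
```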
Eigenvectors v satisfy Av = λv with v ≠ 0; they represent invariant directions of a linear transformation.
Eigenvalues λ are found by turning the eigenvector condition into a singularity condition: (A − λI)v = 0 has nontrivial solutions iff det(A − λI) = 0.
After finding λ, eigenvectors come from solving the homogeneous system (A − λI)v = 0 via Gaussian elimination.
Eigenvectors are not unique: any nonzero scalar multiple of an eigenvector is also an eigenvector; eigenvectors for a given λ form an eigenspace (a subspace).
If A has n independent eigenvectors, it diagonalizes as A = VΛV⁻¹, making Aᵏ = VΛᵏV⁻¹ easy to compute.
In Markov chains, stationary distributions are eigenvectors with eigenvalue 1 (Pπ = π).
Eigenvalues summarize long-term behavior: large |λ| modes dominate; |λ| < 1 modes decay; negative or complex eigenvalues introduce sign flips or oscillations.
Forgetting the identity matrix: writing A − λ instead of A − λI, which breaks dimensions and the algebra.
Including v = 0 as an eigenvector (it is never allowed).
Solving det(A − λI) = 0 correctly but then plugging λ back into Av = λv without rearranging to a solvable system (A − λI)v = 0.
Assuming every matrix is diagonalizable or has real eigenvalues; rotations and defective matrices are important counterexamples.
Find the eigenvalues of A = [[5, 2], [0, 1]].
Hint: Compute det(A − λI). For an upper triangular matrix, the determinant is the product of diagonal entries.
A − λI = [[5−λ, 2], [0, 1−λ]].
det(A − λI) = (5−λ)(1−λ).
Set equal to 0:
(5−λ)(1−λ) = 0 ⇒ λ ∈ {5, 1}.
For A = [[5, 2], [0, 1]], find an eigenvector for each eigenvalue.
Hint: Solve (A − λI)v = 0 separately for λ = 5 and λ = 1. Use a free variable.
For λ = 5:
A − 5I = [[0, 2], [0, −4]].
Solve [[0,2],[0,−4]] (x,y)ᵀ = (0,0)ᵀ.
Equations: 2y = 0 ⇒ y = 0. x is free.
Pick x = 1 ⇒ v = (1,0)ᵀ.
For λ = 1:
A − I = [[4, 2], [0, 0]].
Solve 4x + 2y = 0 ⇒ 2x + y = 0 ⇒ y = −2x.
Pick x = 1 ⇒ v = (1,−2)ᵀ.
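These claimed eigenvectors can be verified directly (quick check):

```python
import numpy as np

A = np.array([[5.0, 2.0],
              [0.0, 1.0]])
v5 = np.array([1.0, 0.0])    # claimed eigenvector for lambda = 5
v1 = np.array([1.0, -2.0])   # claimed eigenvector for lambda = 1

assert np.allclose(A @ v5, 5.0 * v5)
assert np.allclose(A @ v1, 1.0 * v1)
```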
Let A = [[2, 1], [0, 2]]. (a) Find its eigenvalues. (b) Find the eigenspace for λ = 2. (c) Does A have two independent eigenvectors?
Hint: You will get a repeated eigenvalue. The key is the null space of (A − 2I). Compare algebraic vs geometric multiplicity.
(a) A − λI = [[2−λ, 1], [0, 2−λ]].
Determinant: det(A − λI) = (2−λ)(2−λ) = (2−λ)².
So λ = 2 with algebraic multiplicity 2.
(b) For λ = 2:
A − 2I = [[0, 1], [0, 0]].
Solve [[0,1],[0,0]](x,y)ᵀ = (0,0)ᵀ ⇒ y = 0, x free.
So eigenspace = { (x,0)ᵀ : x ∈ ℝ } = span{(1,0)ᵀ}.
(c) No. The eigenspace is 1-dimensional, so there is only 1 independent eigenvector even though the eigenvalue repeats. Therefore A is not diagonalizable over ℝ.
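A numerical version of this multiplicity comparison (sketch):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 2.0]])              # repeated eigenvalue lambda = 2

lams = np.linalg.eigvals(A)
assert np.allclose(np.sort(lams), [2.0, 2.0])   # algebraic multiplicity 2

# Geometric multiplicity = dim null(A - 2I) = 2 - rank(A - 2I)
geo_mult = 2 - np.linalg.matrix_rank(A - 2.0 * np.eye(2))
assert geo_mult == 1     # only one independent eigenvector: A is defective
```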
Next nodes you can unlock and why they depend on eigenvalues:
Helpful supporting knowledge: