Perpendicular vectors. Orthogonal and orthonormal bases.
Self-serve tutorial - low prerequisites, straightforward concepts.
Orthogonality is the linear-algebra version of “independent directions.” When vectors are perpendicular, their interaction through the dot product vanishes—and that simple fact unlocks clean geometry, stable computations, and powerful decompositions like projections and the SVD.
Two vectors are orthogonal iff a·b = 0. An orthonormal set is orthogonal + unit length, giving qᵢ·qⱼ = δᵢⱼ. Orthonormal bases make coordinates, lengths, and projections dramatically simpler and more numerically stable.
In many problems, you want to separate “signal” from “noise,” or split a space into independent directions. Orthogonality gives you a precise way to do that.
If two directions are orthogonal, then moving along one does not change how far you’ve moved along the other. That idea shows up everywhere, from projections and least squares to the SVD.
For vectors a, b ∈ ℝⁿ, we say a and b are orthogonal (written a ⟂ b) when
a·b = 0.
This matches the geometric meaning from the dot product:
a·b = ‖a‖ ‖b‖ cos θ.
So if a·b = 0 and both vectors are nonzero, then cos θ = 0 ⇒ θ = 90°.
If one of the vectors is 0, then 0·b = 0 for all b. So 0 is orthogonal to every vector (but it is not a useful “direction”).
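The check is a one-liner in code. Here is a minimal sketch in plain Python (the helper names `dot` and `is_orthogonal` are illustrative, not standard library functions), with a small tolerance because floating-point dot products are rarely exactly zero:

```python
def dot(a, b):
    """Dot product of two vectors given as plain Python lists."""
    return sum(x * y for x, y in zip(a, b))

def is_orthogonal(a, b, tol=1e-12):
    """a ⟂ b exactly when a·b = 0 (up to floating-point tolerance)."""
    return abs(dot(a, b)) < tol

a = [2.0, -1.0, 0.0]
b = [1.0, 2.0, 0.0]
print(dot(a, b))            # 0.0
print(is_orthogonal(a, b))  # True

# The zero vector is orthogonal to everything:
print(is_orthogonal([0.0, 0.0, 0.0], b))  # True
```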
A set of vectors {v₁, …, vₖ} is orthogonal if vᵢ·vⱼ = 0 whenever i ≠ j, and orthonormal if it is orthogonal and every vector has unit length.
When a set is orthonormal, the dot products collapse into a single compact rule using the Kronecker delta δᵢⱼ:
qᵢ·qⱼ = δᵢⱼ,
where δᵢⱼ = 1 if i = j, else 0.
An orthonormal basis is an orthonormal set that is also a basis (spans the space and has the right number of vectors).
Why do we care? Because orthonormal bases make coordinates and computations “diagonal”—cross terms disappear.
In ℝ², the standard basis vectors e₁ = (1, 0) and e₂ = (0, 1) are orthonormal.
Any vector x = (x₁, x₂) can be written as
x = x₁ e₁ + x₂ e₂,
and those coefficients are simply dot products: x₁ = e₁·x and x₂ = e₂·x.
This “dot product gives coordinates” property generalizes to any orthonormal basis.
Angles are geometric, but dot products are algebraic. In ℝⁿ you can’t easily visualize 90°—but you can always compute a·b.
Orthogonality turns into a simple computation:
a ⟂ b ⇔ a·b = 0.
Think of a·b as “how much a points in the direction of b (scaled by ‖b‖).”
That last statement becomes exact once you learn projections: the projection of a onto b is proportional to a·b.
This is a key structural fact:
If {v₁, …, vₖ} is an orthogonal set of nonzero vectors, then it is linearly independent.
Assume
c₁v₁ + c₂v₂ + … + cₖvₖ = 0.
Dot both sides with vⱼ:
(vⱼ)·(c₁v₁ + … + cₖvₖ) = (vⱼ)·0.
Use linearity of dot product:
c₁(vⱼ·v₁) + … + cₖ(vⱼ·vₖ) = 0.
But orthogonality means vⱼ·vᵢ = 0 for i ≠ j, leaving only one term:
cⱼ(vⱼ·vⱼ) = 0.
And vⱼ·vⱼ = ‖vⱼ‖² > 0 since vⱼ ≠ 0.
So cⱼ = 0.
Since this holds for every j, all coefficients are zero ⇒ independence.
If you can produce k mutually orthogonal nonzero vectors in an n-dimensional space, you immediately know they are linearly independent, so k ≤ n; and if k = n, they form a basis.
Given a nonzero vector v, define its normalized version:
q = v / ‖v‖.
Then ‖q‖ = 1.
If you start with an orthogonal set {vᵢ}, normalization preserves orthogonality:
(vᵢ/‖vᵢ‖) · (vⱼ/‖vⱼ‖)
= (vᵢ·vⱼ) / (‖vᵢ‖‖vⱼ‖)
= 0 for i ≠ j.
So orthogonal + normalize each vector ⇒ orthonormal.
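This normalize-each-vector step can be sketched directly (plain Python, `normalize` is an illustrative helper name):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def normalize(v):
    """q = v / ||v||, so ||q|| = 1. Assumes v is nonzero."""
    n = norm(v)
    return [x / n for x in v]

# An orthogonal (but not yet orthonormal) pair:
v1 = [2.0, -1.0, 0.0]
v2 = [1.0, 2.0, 0.0]
q1, q2 = normalize(v1), normalize(v2)

print(abs(norm(q1) - 1.0) < 1e-12)  # True: unit length
print(abs(dot(q1, q2)) < 1e-12)     # True: still orthogonal
```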
| Concept | Condition | What you gain |
|---|---|---|
| Orthogonal vectors a, b | a·b = 0 | Perpendicular directions, no “overlap” |
| Orthogonal set {vᵢ} | vᵢ·vⱼ = 0 for i ≠ j | Guaranteed linear independence (if nonzero) |
| Orthonormal set {qᵢ} | qᵢ·qⱼ = δᵢⱼ | Simplest possible coordinate math |
| Orthonormal basis | Orthonormal set that spans | Lengths/angles preserved under coordinate transforms |
In an orthonormal set, dot products behave like an identity matrix:
qᵢ·qⱼ = δᵢⱼ.
This is the inner-product equivalent of saying: “basis directions don’t interact.”
Soon, when you represent a basis as a matrix Q with columns qᵢ, you’ll see:
QᵀQ = I,
and δᵢⱼ are exactly the entries of I.
In a general basis {b₁, …, bₙ}, finding coordinates means solving a linear system. The basis vectors might not be perpendicular, so contributions “mix.”
In an orthonormal basis {q₁, …, qₙ}, coordinates fall out by dot products.
Suppose {q₁, …, qₙ} is an orthonormal basis and x ∈ ℝⁿ.
Write
x = ∑ᵢ αᵢ qᵢ.
Dot both sides with qⱼ:
qⱼ·x = qⱼ·(∑ᵢ αᵢ qᵢ)
= ∑ᵢ αᵢ (qⱼ·qᵢ)
= ∑ᵢ αᵢ δⱼᵢ
= αⱼ.
So the coefficient is simply:
αⱼ = qⱼ·x.
This is one of the biggest “payoffs” in linear algebra: orthonormal bases turn coordinate-finding into dot products.
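As a sketch of that payoff in plain Python (the basis here is the one used in a worked example below; helper names are illustrative):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

s = 1 / math.sqrt(2)
Q = [[s, s], [s, -s]]   # orthonormal basis q1, q2 of R^2
x = [3.0, 1.0]

# Coordinates fall out by dot products: alpha_j = q_j · x.
alphas = [dot(q, x) for q in Q]

# Reconstruct x = sum_j alpha_j q_j and compare:
recon = [sum(a * q[i] for a, q in zip(alphas, Q)) for i in range(2)]
print(alphas)  # ≈ [2.8284, 1.4142], i.e. [2√2, √2]
print(recon)   # ≈ [3.0, 1.0]
```

No linear system was solved: each coefficient came from a single dot product.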
If a ⟂ b, then
‖a + b‖² = ‖a‖² + ‖b‖².
‖a + b‖² = (a + b)·(a + b)
= a·a + a·b + b·a + b·b
= ‖a‖² + 0 + 0 + ‖b‖².
This generalizes: if {vᵢ} is an orthogonal set, then
‖∑ᵢ vᵢ‖² = ∑ᵢ ‖vᵢ‖².
If x = ∑ᵢ αᵢ qᵢ in an orthonormal basis, then
‖x‖² = ∑ᵢ αᵢ².
‖x‖² = x·x
= (∑ᵢ αᵢ qᵢ) · (∑ⱼ αⱼ qⱼ)
= ∑ᵢ ∑ⱼ αᵢ αⱼ (qᵢ·qⱼ)
= ∑ᵢ ∑ⱼ αᵢ αⱼ δᵢⱼ
= ∑ᵢ αᵢ².
So an orthonormal basis makes length computation look like ordinary Euclidean length of the coordinate vector (α₁, …, αₙ).
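A quick numerical check of this identity, continuing with the same orthonormal basis of ℝ² (plain Python, illustrative helper names):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

s = 1 / math.sqrt(2)
Q = [[s, s], [s, -s]]           # orthonormal basis of R^2
x = [3.0, 1.0]
alphas = [dot(q, x) for q in Q]  # coordinates of x in this basis

lhs = dot(x, x)                   # ||x||^2 = 9 + 1 = 10
rhs = sum(a * a for a in alphas)  # sum of squared coordinates
print(abs(lhs - rhs) < 1e-12)     # True
```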
If {vᵢ} is orthogonal but not normalized, you still get simplification, but coefficients scale:
If x = ∑ᵢ cᵢ vᵢ with orthogonal {vᵢ}, then dot with vⱼ:
vⱼ·x = ∑ᵢ cᵢ (vⱼ·vᵢ) = cⱼ (vⱼ·vⱼ) = cⱼ ‖vⱼ‖².
So
cⱼ = (vⱼ·x) / ‖vⱼ‖².
With orthonormal vectors ‖qⱼ‖² = 1, giving cⱼ = qⱼ·x.
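The scaled-coefficient formula is just as easy to compute. A minimal sketch (plain Python; `coeff` is an illustrative helper, and the vectors are the ones used in a worked example below):

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def coeff(v, x):
    """c = (v·x) / ||v||^2 for an orthogonal (not normalized) basis vector v."""
    return dot(v, x) / dot(v, v)

u, v = [1.0, 0.0], [0.0, 2.0]   # orthogonal, but not unit length
x = [3.0, 2.0]
c1, c2 = coeff(u, x), coeff(v, x)
print(c1, c2)  # 3.0 1.0

# Check the reconstruction: c1*u + c2*v = x
print([c1 * u[i] + c2 * v[i] for i in range(2)])  # [3.0, 2.0]
```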
Put orthonormal vectors as columns of a matrix Q:
Q = [ q₁ q₂ … qₙ ].
Then orthonormality implies
QᵀQ = I.
If Q is square (n columns in ℝⁿ), then Q is an orthogonal matrix, and also
QQᵀ = I and Q⁻¹ = Qᵀ.
Interpretation: Q represents a rotation/reflection that preserves dot products and lengths:
(Qx)·(Qy) = x·y, and ‖Qx‖ = ‖x‖.
That preservation property is why orthonormal bases are the backbone of stable numerical algorithms.
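Both preservation facts can be verified numerically. A sketch in plain Python, applying an orthogonal 2×2 matrix Q (stored as a list of rows) to two vectors:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

s = 1 / math.sqrt(2)
Q = [[s, s], [s, -s]]   # orthogonal: columns (and rows) are orthonormal

x, y = [3.0, 1.0], [-1.0, 2.0]
Qx, Qy = matvec(Q, x), matvec(Q, y)

# Dot products and lengths are preserved:
print(abs(dot(Qx, Qy) - dot(x, y)) < 1e-12)  # True
print(abs(dot(Qx, Qx) - dot(x, x)) < 1e-12)  # True
```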
A projection is about decomposing a vector into a piece that lies in a chosen subspace plus a leftover (residual) piece.
Orthogonality is the condition that makes the “leftover” the smallest possible error.
Let u be a nonzero vector. The projection of x onto span{u} is
projᵤ(x) = ((x·u) / (u·u)) u.
If u is unit length (‖u‖ = 1), this simplifies to
projᵤ(x) = (x·u) u.
The residual r = x − projᵤ(x) is orthogonal to u:
u·r = 0.
That “residual is orthogonal” fact is the geometric certificate that you chose the best approximation along u.
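Here is the projection formula and its orthogonality certificate as a sketch in plain Python (`proj` is an illustrative helper name; the example vectors are my own):

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def proj(x, u):
    """Projection of x onto span{u}: ((x·u) / (u·u)) u. Assumes u is nonzero."""
    c = dot(x, u) / dot(u, u)
    return [c * ui for ui in u]

x = [3.0, 2.0]
u = [1.0, 1.0]
p = proj(x, u)
r = [xi - pi for xi, pi in zip(x, p)]  # residual

print(p)                       # [2.5, 2.5]
print(abs(dot(u, r)) < 1e-12)  # True: residual ⟂ u
```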
Suppose a subspace W has an orthonormal basis {q₁, …, qₖ}. Then the projection onto W is simply the sum of projections onto each basis vector:
proj_W(x) = ∑ᵢ (x·qᵢ) qᵢ.
No linear system. No matrix inversion. Just dot products.
In matrix form, with Q = [q₁ … qₖ] (n×k, columns orthonormal):
proj_W(x) = QQᵀ x.
Why does this work? Because QᵀQ = Iₖ, so the coordinates of x in that basis are Qᵀx.
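The subspace projection really is “just dot products.” A minimal sketch, projecting a vector in ℝ³ onto a 2-dimensional subspace W with an orthonormal basis (the basis and the vector are my own illustrative choices):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Orthonormal basis of a 2D subspace W of R^3:
q1 = [1.0, 0.0, 0.0]
q2 = [0.0, 1 / math.sqrt(2), 1 / math.sqrt(2)]
x = [3.0, 1.0, 5.0]

# proj_W(x) = sum_i (x·q_i) q_i — no linear system, no inversion.
p = [sum(dot(x, q) * q[k] for q in (q1, q2)) for k in range(3)]
print(p)  # ≈ [3.0, 3.0, 3.0]
```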
In least squares, you try to approximate b by Ax where A’s columns span some subspace (the column space of A). The best-fit residual
r = b − Ax̂
satisfies an orthogonality condition:
Aᵀ r = 0.
That means the error is orthogonal to every column of A. Conceptually: the best approximation is the projection of b onto col(A).
When A has orthonormal columns (A = Q), the solution becomes especially simple:
x̂ = Qᵀ b.
This is a major reason algorithms try to convert general matrices into orthonormal factors.
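A sketch of the orthonormal-columns case (plain Python; the columns and right-hand side are my own illustrative choices), checking that the residual is orthogonal to each column:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Orthonormal columns of Q, spanning a plane in R^3:
q1 = [1.0, 0.0, 0.0]
q2 = [0.0, 1.0, 0.0]
b = [2.0, 3.0, 7.0]

# With orthonormal columns, the least-squares solution is x_hat = Qᵀb:
x_hat = [dot(q1, b), dot(q2, b)]
print(x_hat)  # [2.0, 3.0]

# Residual r = b − Q x_hat is orthogonal to every column of Q:
fit = [x_hat[0] * q1[k] + x_hat[1] * q2[k] for k in range(3)]
r = [b[k] - fit[k] for k in range(3)]
print(dot(q1, r), dot(q2, r))  # 0.0 0.0
```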
The singular value decomposition
A = U Σ Vᵀ
uses U and V with orthonormal columns (often orthogonal matrices). That means the columns of V form an orthonormal basis for the input space, and the columns of U form an orthonormal basis for the output space.
Because these bases are orthonormal, Σ cleanly scales independent directions without mixing them.
In other words, orthogonality is what makes SVD a “diagonalization-like” factorization even when A is not square.
Orthogonality is not just geometry; it’s a computational strategy: coordinates, projections, and least-squares solutions all reduce to dot products, with no linear systems to solve.
Let a = (2, −1, 0) and b = (1, 2, 0) in ℝ³. (1) Are they orthogonal? (2) Create an orthonormal set from them.
Compute the dot product:
a·b = 2·1 + (−1)·2 + 0·0
= 2 − 2 + 0
= 0.
Since a·b = 0 and both vectors are nonzero, a ⟂ b.
Compute norms:
‖a‖ = √(2² + (−1)² + 0²) = √(4 + 1) = √5.
‖b‖ = √(1² + 2² + 0²) = √(1 + 4) = √5.
Normalize each:
q₁ = a/‖a‖ = (2, −1, 0)/√5.
q₂ = b/‖b‖ = (1, 2, 0)/√5.
Verify orthonormality quickly:
q₁·q₂ = (a·b)/(‖a‖‖b‖) = 0/(√5·√5) = 0.
‖q₁‖ = ‖q₂‖ = 1.
Insight: Orthogonality is scale-invariant: multiplying a vector by a scalar doesn’t change perpendicularity. Orthonormality is just orthogonality plus the convenience of unit length.
Let q₁ = (1/√2, 1/√2) and q₂ = (1/√2, −1/√2). These form an orthonormal basis of ℝ². Express x = (3, 1) as x = α₁q₁ + α₂q₂.
Use orthonormal coordinate extraction: αⱼ = qⱼ·x.
Compute α₁:
α₁ = q₁·x
= (1/√2, 1/√2)·(3, 1)
= (1/√2)·3 + (1/√2)·1
= 4/√2
= 2√2.
Compute α₂:
α₂ = q₂·x
= (1/√2, −1/√2)·(3, 1)
= (1/√2)·3 + (−1/√2)·1
= 2/√2
= √2.
Reconstruct to check:
α₁q₁ + α₂q₂
= (2√2)(1/√2, 1/√2) + (√2)(1/√2, −1/√2)
= (2, 2) + (1, −1)
= (3, 1) = x.
Insight: Because qᵢ·qⱼ = δᵢⱼ, dotting with qⱼ “selects” only the αⱼ coefficient—exactly like how multiplying by the identity matrix selects components.
In ℝ², let u = (1, 0) and v = (0, 2). Let x = (3, 2). (1) Write x as a sum of a component along u and a component along v. (2) Verify ‖x‖² equals the sum of squared component lengths.
Check orthogonality:
u·v = (1,0)·(0,2) = 0, so u ⟂ v.
Find coefficients in an orthogonal (not orthonormal) basis using
cⱼ = (vⱼ·x) / ‖vⱼ‖².
Component along u:
‖u‖² = 1.
u·x = (1,0)·(3,2) = 3.
So c₁ = 3/1 = 3 and the component is 3u = (3,0).
Component along v:
‖v‖² = 0² + 2² = 4.
v·x = (0,2)·(3,2) = 4.
So c₂ = 4/4 = 1 and the component is 1v = (0,2).
Thus x = (3,0) + (0,2).
Verify Pythagorean relation:
‖x‖² = 3² + 2² = 13.
‖(3,0)‖² + ‖(0,2)‖² = 9 + 4 = 13.
Insight: Orthogonality is what makes energy add: squared length of a sum becomes a sum of squared lengths. Without orthogonality, cross terms appear.
Orthogonality is defined algebraically: a ⟂ b ⇔ a·b = 0 (for nonzero vectors, this means a 90° angle).
An orthogonal set of nonzero vectors is automatically linearly independent.
Orthonormal means orthogonal + unit length, summarized by qᵢ·qⱼ = δᵢⱼ.
In an orthonormal basis, coordinates are αⱼ = qⱼ·x (dot products directly give coefficients).
Orthogonality yields the Pythagorean identity: if a ⟂ b, then ‖a + b‖² = ‖a‖² + ‖b‖².
With orthogonal (not normalized) basis vectors {vᵢ}, coefficients are cⱼ = (vⱼ·x) / ‖vⱼ‖².
Matrices with orthonormal columns satisfy QᵀQ = I; if square, Q⁻¹ = Qᵀ and lengths/dot products are preserved.
Orthogonality is the backbone of projections, least squares (orthogonal residual), and decompositions like SVD.
Assuming a·b = 0 implies one vector is zero; in fact it usually means they’re perpendicular (unless one is 0).
Confusing “orthogonal” with “orthonormal”: orthogonal vectors can have any length; orthonormal vectors must have length 1.
Using αⱼ = vⱼ·x for an orthogonal-but-not-normalized basis; the correct formula divides by ‖vⱼ‖².
Believing any set of n orthogonal vectors in ℝⁿ is automatically a basis even if one vector is 0 (nonzero is required).
Let a = (1, 2, 2) and b = (2, −1, 0). Are they orthogonal? If not, compute a·b and interpret the sign.
Hint: Compute the dot product and check whether it equals 0. If it’s positive, the angle is acute; if negative, obtuse.
a·b = 1·2 + 2·(−1) + 2·0 = 2 − 2 + 0 = 0. So they are orthogonal (perpendicular).
Given an orthogonal basis {v₁, v₂} in ℝ² with v₁ = (3, 0) and v₂ = (0, 4), write x = (6, 8) as x = c₁v₁ + c₂v₂.
Hint: Use cⱼ = (vⱼ·x) / ‖vⱼ‖² for orthogonal (not orthonormal) vectors.
Compute c₁:
v₁·x = (3,0)·(6,8) = 18.
‖v₁‖² = 3² = 9.
So c₁ = 18/9 = 2.
Compute c₂:
v₂·x = (0,4)·(6,8) = 32.
‖v₂‖² = 4² = 16.
So c₂ = 32/16 = 2.
Thus x = 2v₁ + 2v₂ = 2(3,0) + 2(0,4) = (6,8).
Let Q be a 3×3 matrix whose columns q₁, q₂, q₃ are orthonormal. Show that ‖Qx‖ = ‖x‖ for all x ∈ ℝ³.
Hint: Start from ‖Qx‖² = (Qx)·(Qx) and rewrite using transposes: (Qx)·(Qx) = xᵀ(QᵀQ)x.
Since columns are orthonormal, QᵀQ = I.
Compute squared norm:
‖Qx‖² = (Qx)·(Qx)
= (Qx)ᵀ(Qx)
= xᵀ Qᵀ Q x
= xᵀ I x
= xᵀx
= ‖x‖².
Because norms are nonnegative, taking √ of both sides gives ‖Qx‖ = ‖x‖.