Perpendicular vectors. Orthogonal and orthonormal bases.
Self-serve tutorial - low prerequisites, straightforward concepts.
Orthogonality is the linear-algebra version of “independent directions.” When vectors are perpendicular, their interaction through the dot product vanishes—and that simple fact unlocks clean geometry, stable computations, and powerful decompositions like projections and the SVD.
Two vectors are orthogonal iff a·b = 0. An orthonormal set is orthogonal + unit length, giving qᵢ·qⱼ = δᵢⱼ. Orthonormal bases make coordinates, lengths, and projections dramatically simpler and more numerically stable.
In many problems, you want to separate “signal” from “noise,” or split a space into independent directions. Orthogonality gives you a precise way to do that.
If two directions are orthogonal, then moving along one does not change how far you’ve moved along the other. That idea shows up everywhere, from projections and least squares to the SVD.
For vectors a, b ∈ ℝⁿ, we say a and b are orthogonal (written a ⟂ b) when
a·b = 0.
This matches the geometric meaning from the dot product:
a·b = ‖a‖ ‖b‖ cos θ.
So if a·b = 0 and both vectors are nonzero, then cos θ = 0 ⇒ θ = 90°.
If one of the vectors is 0, then 0·b = 0 for all b. So 0 is orthogonal to every vector (but it is not a useful “direction”).
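The check is a one-liner in code. Here is a minimal sketch in plain Python (the helper names `dot` and `is_orthogonal` are illustrative, not standard library functions), with a small tolerance because floating-point dot products are rarely exactly zero:

```python
def dot(a, b):
    """Dot product of two vectors given as plain Python lists."""
    return sum(x * y for x, y in zip(a, b))

def is_orthogonal(a, b, tol=1e-12):
    """a ⟂ b exactly when a·b = 0 (up to floating-point tolerance)."""
    return abs(dot(a, b)) < tol

a = [2.0, -1.0, 0.0]
b = [1.0, 2.0, 0.0]
print(dot(a, b))            # 0.0
print(is_orthogonal(a, b))  # True

# The zero vector is orthogonal to everything:
print(is_orthogonal([0.0, 0.0, 0.0], b))  # True
```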
A set of vectors {v₁, …, vₖ} is orthogonal if vᵢ·vⱼ = 0 whenever i ≠ j, and orthonormal if it is orthogonal and every vector has unit length.
When a set is orthonormal, the dot products collapse into a single compact rule using the Kronecker delta δᵢⱼ:
qᵢ·qⱼ = δᵢⱼ,
where δᵢⱼ = 1 if i = j, else 0.
An orthonormal basis is an orthonormal set that is also a basis (spans the space and has the right number of vectors).
Why do we care? Because orthonormal bases make coordinates and computations “diagonal”—cross terms disappear.
In ℝ², the standard basis vectors e₁ = (1, 0) and e₂ = (0, 1) are orthonormal.
Any vector x = (x₁, x₂) can be written as
x = x₁ e₁ + x₂ e₂,
and those coefficients are simply dot products: x₁ = e₁·x and x₂ = e₂·x.
This “dot product gives coordinates” property generalizes to any orthonormal basis.
Angles are geometric, but dot products are algebraic. In ℝⁿ you can’t easily visualize 90°—but you can always compute a·b.
Orthogonality turns into a simple computation:
a ⟂ b ⇔ a·b = 0.
Think of a·b as “how much a points in the direction of b (scaled by ‖b‖).”
That last statement becomes exact once you learn projections: the projection of a onto b is proportional to a·b.
This is a key structural fact:
If {v₁, …, vₖ} is an orthogonal set of nonzero vectors, then it is linearly independent.
Assume
c₁v₁ + c₂v₂ + … + cₖvₖ = 0.
Dot both sides with vⱼ:
(vⱼ)·(c₁v₁ + … + cₖvₖ) = (vⱼ)·0.
Use linearity of dot product:
c₁(vⱼ·v₁) + … + cₖ(vⱼ·vₖ) = 0.
But orthogonality means vⱼ·vᵢ = 0 for i ≠ j, leaving only one term:
cⱼ(vⱼ·vⱼ) = 0.
And vⱼ·vⱼ = ‖vⱼ‖² > 0 since vⱼ ≠ 0.
So cⱼ = 0.
Since this holds for every j, all coefficients are zero ⇒ independence.
If you can produce k mutually orthogonal nonzero vectors in an n-dimensional space, you immediately know they are linearly independent, so k ≤ n; and if k = n, they form a basis.
Given a nonzero vector v, define its normalized version:
q = v / ‖v‖.
Then ‖q‖ = 1.
If you start with an orthogonal set {vᵢ}, normalization preserves orthogonality:
(vᵢ/‖vᵢ‖) · (vⱼ/‖vⱼ‖)
= (vᵢ·vⱼ) / (‖vᵢ‖‖vⱼ‖)
= 0 for i ≠ j.
So orthogonal + normalize each vector ⇒ orthonormal.
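This normalize-each-vector step can be sketched directly (plain Python, `normalize` is an illustrative helper name):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def normalize(v):
    """q = v / ||v||, so ||q|| = 1. Assumes v is nonzero."""
    n = norm(v)
    return [x / n for x in v]

# An orthogonal (but not yet orthonormal) pair:
v1 = [2.0, -1.0, 0.0]
v2 = [1.0, 2.0, 0.0]
q1, q2 = normalize(v1), normalize(v2)

print(abs(norm(q1) - 1.0) < 1e-12)  # True: unit length
print(abs(dot(q1, q2)) < 1e-12)     # True: still orthogonal
```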
| Concept | Condition | What you gain |
|---|---|---|
| Orthogonal vectors a, b | a·b = 0 | Perpendicular directions, no “overlap” |
| Orthogonal set {vᵢ} | vᵢ·vⱼ = 0 for i ≠ j | Guaranteed linear independence (if nonzero) |
| Orthonormal set {qᵢ} | qᵢ·qⱼ = δᵢⱼ | Simplest possible coordinate math |
| Orthonormal basis | Orthonormal set that spans | Lengths/angles preserved under coordinate transforms |
In an orthonormal set, dot products behave like an identity matrix:
qᵢ·qⱼ = δᵢⱼ.
This is the inner-product equivalent of saying: “basis directions don’t interact.”
Soon, when you represent a basis as a matrix Q with columns qᵢ, you’ll see:
QᵀQ = I,
and δᵢⱼ are exactly the entries of I.
In a general basis {b₁, …, bₙ}, finding coordinates means solving a linear system. The basis vectors might not be perpendicular, so contributions “mix.”
In an orthonormal basis {q₁, …, qₙ}, coordinates fall out by dot products.
Suppose {q₁, …, qₙ} is an orthonormal basis and x ∈ ℝⁿ.
Write
x = ∑ᵢ αᵢ qᵢ.
Dot both sides with qⱼ:
qⱼ·x = qⱼ·(∑ᵢ αᵢ qᵢ)
= ∑ᵢ αᵢ (qⱼ·qᵢ)
= ∑ᵢ αᵢ δⱼᵢ
= αⱼ.
So the coefficient is simply:
αⱼ = qⱼ·x.
This is one of the biggest “payoffs” in linear algebra: orthonormal bases turn coordinate-finding into dot products.
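As a sketch of that payoff in plain Python (the basis here is the one used in a worked example below; helper names are illustrative):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

s = 1 / math.sqrt(2)
Q = [[s, s], [s, -s]]   # orthonormal basis q1, q2 of R^2
x = [3.0, 1.0]

# Coordinates fall out by dot products: alpha_j = q_j · x.
alphas = [dot(q, x) for q in Q]

# Reconstruct x = sum_j alpha_j q_j and compare:
recon = [sum(a * q[i] for a, q in zip(alphas, Q)) for i in range(2)]
print(alphas)  # ≈ [2.8284, 1.4142], i.e. [2√2, √2]
print(recon)   # ≈ [3.0, 1.0]
```

No linear system was solved: each coefficient came from a single dot product.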
If a ⟂ b, then
‖a + b‖² = ‖a‖² + ‖b‖².
‖a + b‖² = (a + b)·(a + b)
= a·a + a·b + b·a + b·b
= ‖a‖² + 0 + 0 + ‖b‖².
This generalizes: if {vᵢ} is an orthogonal set, then
‖∑ᵢ vᵢ‖² = ∑ᵢ ‖vᵢ‖².
If x = ∑ᵢ αᵢ qᵢ in an orthonormal basis, then
‖x‖² = ∑ᵢ αᵢ².
‖x‖² = x·x
= (∑ᵢ αᵢ qᵢ) · (∑ⱼ αⱼ qⱼ)
= ∑ᵢ ∑ⱼ αᵢ αⱼ (qᵢ·qⱼ)
= ∑ᵢ ∑ⱼ αᵢ αⱼ δᵢⱼ
= ∑ᵢ αᵢ².
So an orthonormal basis makes length computation look like ordinary Euclidean length of the coordinate vector (α₁, …, αₙ).
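A quick numerical check of this identity, continuing with the same orthonormal basis of ℝ² (plain Python, illustrative helper names):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

s = 1 / math.sqrt(2)
Q = [[s, s], [s, -s]]           # orthonormal basis of R^2
x = [3.0, 1.0]
alphas = [dot(q, x) for q in Q]  # coordinates of x in this basis

lhs = dot(x, x)                   # ||x||^2 = 9 + 1 = 10
rhs = sum(a * a for a in alphas)  # sum of squared coordinates
print(abs(lhs - rhs) < 1e-12)     # True
```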
If {vᵢ} is orthogonal but not normalized, you still get simplification, but coefficients scale:
If x = ∑ᵢ cᵢ vᵢ with orthogonal {vᵢ}, then dot with vⱼ:
vⱼ·x = ∑ᵢ cᵢ (vⱼ·vᵢ) = cⱼ (vⱼ·vⱼ) = cⱼ ‖vⱼ‖².
So
cⱼ = (vⱼ·x) / ‖vⱼ‖².
With orthonormal vectors ‖qⱼ‖² = 1, giving cⱼ = qⱼ·x.
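The scaled-coefficient formula is just as easy to compute. A minimal sketch (plain Python; `coeff` is an illustrative helper, and the vectors are the ones used in a worked example below):

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def coeff(v, x):
    """c = (v·x) / ||v||^2 for an orthogonal (not normalized) basis vector v."""
    return dot(v, x) / dot(v, v)

u, v = [1.0, 0.0], [0.0, 2.0]   # orthogonal, but not unit length
x = [3.0, 2.0]
c1, c2 = coeff(u, x), coeff(v, x)
print(c1, c2)  # 3.0 1.0

# Check the reconstruction: c1*u + c2*v = x
print([c1 * u[i] + c2 * v[i] for i in range(2)])  # [3.0, 2.0]
```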
Put orthonormal vectors as columns of a matrix Q:
Q = [ q₁ q₂ … qₙ ].
Then orthonormality implies
QᵀQ = I.
If Q is square (n columns in ℝⁿ), then Q is an orthogonal matrix, and also
QQᵀ = I and Q⁻¹ = Qᵀ.
Interpretation: Q represents a rotation/reflection that preserves dot products and lengths:
(Qx)·(Qy) = x·y, and ‖Qx‖ = ‖x‖.
That preservation property is why orthonormal bases are the backbone of stable numerical algorithms.
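Both preservation facts can be verified numerically. A sketch in plain Python, applying an orthogonal 2×2 matrix Q (stored as a list of rows) to two vectors:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

s = 1 / math.sqrt(2)
Q = [[s, s], [s, -s]]   # orthogonal: columns (and rows) are orthonormal

x, y = [3.0, 1.0], [-1.0, 2.0]
Qx, Qy = matvec(Q, x), matvec(Q, y)

# Dot products and lengths are preserved:
print(abs(dot(Qx, Qy) - dot(x, y)) < 1e-12)  # True
print(abs(dot(Qx, Qx) - dot(x, x)) < 1e-12)  # True
```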
A projection is about decomposing a vector into a piece that lies in a chosen subspace plus a leftover (residual) piece.
Orthogonality is the condition that makes the “leftover” the smallest possible error.
Let u be a nonzero vector. The projection of x onto span{u} is
projᵤ(x) = ((x·u) / (u·u)) u.
If u is unit length (‖u‖ = 1), this simplifies to
projᵤ(x) = (x·u) u.
The residual r = x − projᵤ(x) is orthogonal to u:
u·r = 0.
That “residual is orthogonal” fact is the geometric certificate that you chose the best approximation along u.
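Here is the projection formula and its orthogonality certificate as a sketch in plain Python (`proj` is an illustrative helper name; the example vectors are my own):

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def proj(x, u):
    """Projection of x onto span{u}: ((x·u) / (u·u)) u. Assumes u is nonzero."""
    c = dot(x, u) / dot(u, u)
    return [c * ui for ui in u]

x = [3.0, 2.0]
u = [1.0, 1.0]
p = proj(x, u)
r = [xi - pi for xi, pi in zip(x, p)]  # residual

print(p)                       # [2.5, 2.5]
print(abs(dot(u, r)) < 1e-12)  # True: residual ⟂ u
```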
Suppose a subspace W has an orthonormal basis {q₁, …, qₖ}. Then the projection onto W is simply the sum of projections onto each basis vector:
proj_W(x) = ∑ᵢ (x·qᵢ) qᵢ.
No linear system. No matrix inversion. Just dot products.
In matrix form, with Q = [q₁ … qₖ] (n×k, columns orthonormal):
proj_W(x) = QQᵀ x.
Why does this work? Because QᵀQ = Iₖ, so the coordinates of x in that basis are Qᵀx.
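The subspace projection really is “just dot products.” A minimal sketch, projecting a vector in ℝ³ onto a 2-dimensional subspace W with an orthonormal basis (the basis and the vector are my own illustrative choices):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Orthonormal basis of a 2D subspace W of R^3:
q1 = [1.0, 0.0, 0.0]
q2 = [0.0, 1 / math.sqrt(2), 1 / math.sqrt(2)]
x = [3.0, 1.0, 5.0]

# proj_W(x) = sum_i (x·q_i) q_i — no linear system, no inversion.
p = [sum(dot(x, q) * q[k] for q in (q1, q2)) for k in range(3)]
print(p)  # ≈ [3.0, 3.0, 3.0]
```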
In least squares, you try to approximate b by Ax where A’s columns span some subspace (the column space of A). The best-fit residual
r = b − Ax̂
satisfies an orthogonality condition:
Aᵀ r = 0.
That means the error is orthogonal to every column of A. Conceptually: the best approximation is the projection of b onto col(A).
When A has orthonormal columns (A = Q), the solution becomes especially simple:
x̂ = Qᵀ b.
This is a major reason algorithms try to convert general matrices into orthonormal factors.
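A sketch of the orthonormal-columns case (plain Python; the columns and right-hand side are my own illustrative choices), checking that the residual is orthogonal to each column:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Orthonormal columns of Q, spanning a plane in R^3:
q1 = [1.0, 0.0, 0.0]
q2 = [0.0, 1.0, 0.0]
b = [2.0, 3.0, 7.0]

# With orthonormal columns, the least-squares solution is x_hat = Qᵀb:
x_hat = [dot(q1, b), dot(q2, b)]
print(x_hat)  # [2.0, 3.0]

# Residual r = b − Q x_hat is orthogonal to every column of Q:
fit = [x_hat[0] * q1[k] + x_hat[1] * q2[k] for k in range(3)]
r = [b[k] - fit[k] for k in range(3)]
print(dot(q1, r), dot(q2, r))  # 0.0 0.0
```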
The singular value decomposition
A = U Σ Vᵀ
uses U and V with orthonormal columns (often orthogonal matrices). That means the columns of V form an orthonormal basis for the input space, and the columns of U form an orthonormal basis for the output space.
Because these bases are orthonormal, Σ cleanly scales independent directions without mixing them.
In other words, orthogonality is what makes SVD a “diagonalization-like” factorization even when A is not square.
Orthogonality is not just geometry; it’s a computational strategy: coordinates, projections, and least-squares solutions all reduce to dot products, with no linear systems to solve.
Let a = (2, −1, 0) and b = (1, 2, 0) in ℝ³. (1) Are they orthogonal? (2) Create an orthonormal set from them.
Compute the dot product:
a·b = 2·1 + (−1)·2 + 0·0
= 2 − 2 + 0
= 0.
Since a·b = 0 and both vectors are nonzero, a ⟂ b.
Compute norms:
‖a‖ = √(2² + (−1)² + 0²) = √(4 + 1) = √5.
‖b‖ = √(1² + 2² + 0²) = √(1 + 4) = √5.
Normalize each:
q₁ = a/‖a‖ = (2, −1, 0)/√5.
q₂ = b/‖b‖ = (1, 2, 0)/√5.
Verify orthonormality quickly:
q₁·q₂ = (a·b)/(‖a‖‖b‖) = 0/(√5·√5) = 0.
‖q₁‖ = ‖q₂‖ = 1.
Insight: Orthogonality is scale-invariant: multiplying a vector by a scalar doesn’t change perpendicularity. Orthonormality is just orthogonality plus the convenience of unit length.
Let q₁ = (1/√2, 1/√2) and q₂ = (1/√2, −1/√2). These form an orthonormal basis of ℝ². Express x = (3, 1) as x = α₁q₁ + α₂q₂.
Use orthonormal coordinate extraction: αⱼ = qⱼ·x.
Compute α₁:
α₁ = q₁·x
= (1/√2, 1/√2)·(3, 1)
= (1/√2)·3 + (1/√2)·1
= 4/√2
= 2√2.
Compute α₂:
α₂ = q₂·x
= (1/√2, −1/√2)·(3, 1)
= (1/√2)·3 + (−1/√2)·1
= 2/√2
= √2.
Reconstruct to check:
α₁q₁ + α₂q₂
= (2√2)(1/√2, 1/√2) + (√2)(1/√2, −1/√2)
= (2, 2) + (1, −1)
= (3, 1) = x.
Insight: Because qᵢ·qⱼ = δᵢⱼ, dotting with qⱼ “selects” only the αⱼ coefficient—exactly like how multiplying by the identity matrix selects components.
In ℝ², let u = (1, 0) and v = (0, 2). Let x = (3, 2). (1) Write x as a sum of a component along u and a component along v. (2) Verify ‖x‖² equals the sum of squared component lengths.
Check orthogonality:
u·v = (1,0)·(0,2) = 0, so u ⟂ v.
Find coefficients in an orthogonal (not orthonormal) basis using
cⱼ = (vⱼ·x) / ‖vⱼ‖².
Component along u:
‖u‖² = 1.
u·x = (1,0)·(3,2) = 3.
So c₁ = 3/1 = 3 and the component is 3u = (3,0).
Component along v:
‖v‖² = 0² + 2² = 4.
v·x = (0,2)·(3,2) = 4.
So c₂ = 4/4 = 1 and the component is 1v = (0,2).
Thus x = (3,0) + (0,2).
Verify Pythagorean relation:
‖x‖² = 3² + 2² = 13.
‖(3,0)‖² + ‖(0,2)‖² = 9 + 4 = 13.
Insight: Orthogonality is what makes energy add: squared length of a sum becomes a sum of squared lengths. Without orthogonality, cross terms appear.
Orthogonality is defined algebraically: a ⟂ b ⇔ a·b = 0 (for nonzero vectors, this means a 90° angle).
An orthogonal set of nonzero vectors is automatically linearly independent.
Orthonormal means orthogonal + unit length, summarized by qᵢ·qⱼ = δᵢⱼ.
In an orthonormal basis, coordinates are αⱼ = qⱼ·x (dot products directly give coefficients).
Orthogonality yields the Pythagorean identity: if a ⟂ b, then ‖a + b‖² = ‖a‖² + ‖b‖².
With orthogonal (not normalized) basis vectors {vᵢ}, coefficients are cⱼ = (vⱼ·x) / ‖vⱼ‖².
Matrices with orthonormal columns satisfy QᵀQ = I; if square, Q⁻¹ = Qᵀ and lengths/dot products are preserved.
Orthogonality is the backbone of projections, least squares (orthogonal residual), and decompositions like SVD.
Assuming a·b = 0 implies one vector is zero; in fact it usually means they’re perpendicular (unless one is 0).
Confusing “orthogonal” with “orthonormal”: orthogonal vectors can have any length; orthonormal vectors must have length 1.
Using αⱼ = vⱼ·x for an orthogonal-but-not-normalized basis; the correct formula divides by ‖vⱼ‖².
Believing any set of n orthogonal vectors in ℝⁿ is automatically a basis even if one vector is 0 (nonzero is required).
Let a = (1, 2, 2) and b = (2, −1, 0). Are they orthogonal? If not, compute a·b and interpret the sign.
Hint: Compute the dot product and check whether it equals 0. If it’s positive, the angle is acute; if negative, obtuse.
a·b = 1·2 + 2·(−1) + 2·0 = 2 − 2 + 0 = 0. So they are orthogonal (perpendicular).
Given an orthogonal basis {v₁, v₂} in ℝ² with v₁ = (3, 0) and v₂ = (0, 4), write x = (6, 8) as x = c₁v₁ + c₂v₂.
Hint: Use cⱼ = (vⱼ·x) / ‖vⱼ‖² for orthogonal (not orthonormal) vectors.
Compute c₁:
v₁·x = (3,0)·(6,8) = 18.
‖v₁‖² = 3² = 9.
So c₁ = 18/9 = 2.
Compute c₂:
v₂·x = (0,4)·(6,8) = 32.
‖v₂‖² = 4² = 16.
So c₂ = 32/16 = 2.
Thus x = 2v₁ + 2v₂ = 2(3,0) + 2(0,4) = (6,8).
Let Q be a 3×3 matrix whose columns q₁, q₂, q₃ are orthonormal. Show that ‖Qx‖ = ‖x‖ for all x ∈ ℝ³.
Hint: Start from ‖Qx‖² = (Qx)·(Qx) and rewrite using transposes: (Qx)·(Qx) = xᵀ(QᵀQ)x.
Since columns are orthonormal, QᵀQ = I.
Compute squared norm:
‖Qx‖² = (Qx)·(Qx)
= (Qx)ᵀ(Qx)
= xᵀ Qᵀ Q x
= xᵀ I x
= xᵀx
= ‖x‖².
Because norms are nonnegative, taking √ of both sides gives ‖Qx‖ = ‖x‖.