Strategy profile where no player benefits from unilateral deviation.
In many strategic situations—pricing, traffic routing, bidding, or even choosing technologies—your best move depends on what others do. A Nash equilibrium is the “no-regrets if I alone change my mind” point: a stable strategy profile where each player is already doing a best response to everyone else.
A Nash equilibrium is a strategy profile s = (s₁,…,sₙ) such that for every player i, uᵢ(sᵢ, s₋ᵢ) ≥ uᵢ(sᵢ′, s₋ᵢ) for all alternative strategies sᵢ′. Equivalently, everyone is playing a best response to the others (mutual best responses). It can be pure or mixed; it may be inefficient; and multiple equilibria can exist.
In game theory, you rarely choose an action in isolation. Your outcome depends on others’ choices, and theirs depends on yours. So the usual “maximize my payoff” idea becomes intertwined: your best decision is a function of what you expect others to do.
We want a notion of stability: a predicted outcome that, once reached, doesn’t give any single player an incentive to unilaterally change course. Nash equilibrium formalizes this idea.
A (normal-form) game has:
- a set of players i = 1, …, n;
- a strategy set Sᵢ for each player i;
- a payoff function uᵢ(s₁, …, sₙ) for each player i.
A strategy profile is an assignment of one strategy to every player: s = (s₁, …, sₙ). We write s₋ᵢ for the strategies of everyone except player i, so we can write payoffs as uᵢ(sᵢ, s₋ᵢ).
A strategy profile s* is a Nash equilibrium if no player can improve their payoff by changing only their own strategy, holding others fixed.
Formally, s* is a Nash equilibrium if ∀i and ∀sᵢ′ ∈ Sᵢ,
uᵢ(sᵢ*, s₋ᵢ*) ≥ uᵢ(sᵢ′, s₋ᵢ*)
Read it slowly:
- Fix everyone else at their equilibrium strategies s₋ᵢ*.
- Compare player i's equilibrium payoff against every alternative sᵢ′.
- No alternative does strictly better.
A Nash equilibrium is not necessarily:
- unique (many games have several);
- efficient (the equilibrium outcome can be bad for everyone);
- the result of explicit agreement (no communication is assumed).
It’s a self-enforcing outcome: if everyone believes the equilibrium will be played, it is consistent for each person to stick with their equilibrium strategy.
Define player i’s best response set to others’ strategies s₋ᵢ:
BRᵢ(s₋ᵢ) = { sᵢ ∈ Sᵢ : uᵢ(sᵢ, s₋ᵢ) ≥ uᵢ(sᵢ′, s₋ᵢ) for all sᵢ′ ∈ Sᵢ }
Then s* is a Nash equilibrium iff:
sᵢ* ∈ BRᵢ(s₋ᵢ*) for all i
So a Nash equilibrium is exactly a profile of mutual best responses.
Nash equilibrium does not prevent two or more players from coordinating a deviation that benefits them jointly. It only blocks deviations by a single player at a time.
This matters because many “unstable” or “unreasonable” outcomes are eliminated by unilateral deviations, but some inefficient outcomes remain stable because escaping them requires coordination.
Even in simple finite games, a pure-strategy equilibrium might not exist. Nash’s major theorem says that in finite games, if players can randomize (play mixed strategies), at least one Nash equilibrium exists.
We’ll build up to that idea after we understand the mechanics in pure strategies.
The definition is simple, but applying it requires a practical workflow:
1. Fix a candidate profile s.
2. For each player i, hold s₋ᵢ fixed and check whether sᵢ maximizes uᵢ(·, s₋ᵢ).
3. If every player passes the check, s is a Nash equilibrium; if any player has a profitable deviation, it is not.
In small games (2×2, 2×3), this is often easiest with payoff matrices. In larger games, it resembles an optimization problem: each player solves
maximize over sᵢ ∈ Sᵢ: uᵢ(sᵢ, s₋ᵢ)
For two players (Row and Column), we represent payoffs as ordered pairs.
Example structure:
| | C₁ | C₂ |
|---|---|---|
| R₁ | (a, b) | (c, d) |
| R₂ | (e, f) | (g, h) |
A pure Nash equilibrium is a cell where:
A common manual method:
- For each Column action, mark Row’s best response(s) in that column.
- For each Row action, mark Column’s best response(s) in that row.
- Cells where both payoffs are marked are the pure Nash equilibria.
This is just the definition made visual.
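The cell-by-cell check can be sketched in code. This is a minimal Python sketch: the `pure_nash_equilibria` helper and the dict-based payoff format are illustrative choices, not a standard library API.

```python
# Sketch: brute-force search for pure Nash equilibria in a two-player game.
# Payoffs are a dict mapping (row_action, col_action) -> (u_row, u_col).
def pure_nash_equilibria(payoffs, row_actions, col_actions):
    equilibria = []
    for r in row_actions:
        for c in col_actions:
            u_r, u_c = payoffs[(r, c)]
            # Row must not gain by switching rows (Column fixed at c).
            row_ok = all(payoffs[(r2, c)][0] <= u_r for r2 in row_actions)
            # Column must not gain by switching columns (Row fixed at r).
            col_ok = all(payoffs[(r, c2)][1] <= u_c for c2 in col_actions)
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria

# Prisoner's Dilemma payoffs: (D, D) should be the unique pure equilibrium.
pd = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
      ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
print(pure_nash_equilibria(pd, ["C", "D"], ["C", "D"]))  # [('D', 'D')]
```

The two `all(...)` checks are exactly the no-unilateral-deviation condition, one player at a time.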
Suppose the strategy sets are numeric (e.g., choose a quantity qᵢ ≥ 0). Then a best response is literally an optimizer:
BRᵢ(q₋ᵢ) = argmax_{qᵢ ≥ 0} uᵢ(qᵢ, q₋ᵢ)
A Nash equilibrium is a fixed point of best responses:
qᵢ* ∈ BRᵢ(q₋ᵢ*) for all i
This “fixed point of best-response correspondences” viewpoint is the bridge to existence theorems and to computation methods.
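As an illustration of the fixed-point view, here is a minimal sketch using a Cournot duopoly with an assumed linear inverse demand P = a − q₁ − q₂ and unit cost c; the numbers a = 12, c = 0 and the helper names are illustrative.

```python
# Sketch: Nash equilibrium as a fixed point of best responses (Cournot duopoly).
def best_response(q_other, a=12.0, c=0.0):
    # argmax over q >= 0 of q * (a - q - q_other) - c * q, from first-order conditions
    return max(0.0, (a - c - q_other) / 2.0)

def iterate_best_responses(q1=0.0, q2=0.0, steps=100):
    # Repeatedly best-respond; a fixed point of this map is a Nash equilibrium.
    for _ in range(steps):
        q1, q2 = best_response(q2), best_response(q1)
    return q1, q2

q1, q2 = iterate_best_responses()
print(q1, q2)  # converges to (4.0, 4.0), i.e. q* = (a - c)/3 for each firm
```

At the fixed point, each quantity is a best response to the other, which is exactly the equilibrium condition qᵢ ∈ BRᵢ(q₋ᵢ). (Convergence of this simple iteration is a property of this particular game, not a general guarantee.)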
Two key phenomena appear immediately:
1) Multiple equilibria (e.g., coordination games have two pure equilibria)
2) No pure equilibrium (e.g., Matching Pennies)
This motivates mixed strategies.
A strategy sᵢ is strictly dominated if there exists another strategy tᵢ such that
uᵢ(tᵢ, s₋ᵢ) > uᵢ(sᵢ, s₋ᵢ) for all s₋ᵢ
A strictly dominated strategy is never a best response, so it cannot be used in any Nash equilibrium. Eliminating dominated strategies can simplify equilibrium search.
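Iterated elimination of strictly dominated strategies can be sketched as follows; this is an illustrative implementation assuming numpy and two-player payoff matrices, not a library routine.

```python
# Sketch: iterated elimination of strictly dominated (pure) strategies
# for a two-player game given as payoff matrices U_row and U_col.
import numpy as np

def eliminate_dominated(U_row, U_col):
    rows = list(range(U_row.shape[0]))
    cols = list(range(U_row.shape[1]))
    changed = True
    while changed:
        changed = False
        # Drop any row strictly dominated by another surviving row.
        for r in rows[:]:
            if any(np.all(U_row[np.ix_([r2], cols)] > U_row[np.ix_([r], cols)])
                   for r2 in rows if r2 != r):
                rows.remove(r); changed = True
        # Drop any column strictly dominated by another surviving column.
        for c in cols[:]:
            if any(np.all(U_col[np.ix_(rows, [c2])] > U_col[np.ix_(rows, [c])])
                   for c2 in cols if c2 != c):
                cols.remove(c); changed = True
    return rows, cols

# Prisoner's Dilemma: C (index 0) is strictly dominated by D for both players.
U_row = np.array([[3, 0], [5, 1]])
U_col = np.array([[3, 5], [0, 1]])
print(eliminate_dominated(U_row, U_col))  # ([1], [1]) -> only (D, D) survives
```

Since dominated strategies are never best responses, any Nash equilibrium of the original game survives this elimination, which can shrink the search space considerably.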
To avoid confusion, here’s a compact comparison:
| Concept | What can deviate? | Stability claim | Typical use |
|---|---|---|---|
| Nash equilibrium | One player at a time | No profitable unilateral deviation | Baseline prediction in strategic settings |
| Social optimum | A planner chooses | Maximize ∑ᵢ uᵢ(s) | Welfare analysis |
| Pareto efficiency | Any group compares outcomes | No alternative makes someone better without hurting others | Efficiency benchmark |
A Nash equilibrium can be Pareto-inefficient (e.g., Prisoner’s Dilemma). That isn’t a bug—it’s the point: incentives can misalign with collective welfare.
If a game has no pure equilibrium, should we conclude it has no stable outcome? Nash’s insight: allow players to randomize over pure actions.
Randomization can make opponents indifferent among their options, removing profitable deviations.
For player i with k pure strategies, a mixed strategy is a probability vector pᵢ:
pᵢ = (pᵢ1, …, pᵢk) where pᵢj ≥ 0 and ∑ⱼ pᵢj = 1
A mixed-strategy profile is (p₁, …, pₙ).
Payoffs become expected payoffs. In a two-player finite game, the expected payoff to player i is:
E[uᵢ] = ∑_{a∈A} ∑_{b∈B} p₁(a) p₂(b) uᵢ(a,b)
(Generalizes to n players with ∑ over all action profiles.)
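In code, the double sum is just a bilinear form; a minimal sketch (numpy assumed, Matching Pennies used as the example):

```python
# Sketch: expected payoff under mixed strategies as p1ᵀ U p2.
import numpy as np

U_row = np.array([[1, -1], [-1, 1]])   # Row's payoffs in Matching Pennies
p1 = np.array([0.5, 0.5])              # Row's mixed strategy
p2 = np.array([0.5, 0.5])              # Column's mixed strategy
expected = p1 @ U_row @ p2             # sums p1(a) * p2(b) * u(a, b) over all (a, b)
print(expected)  # 0.0
```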
A mixed-strategy profile (p₁,…,pₙ) is a Nash equilibrium if no player can increase expected payoff by changing only their own distribution.
Formally, for every i and every alternative mixed strategy qᵢ,
E[uᵢ(pᵢ, p₋ᵢ)] ≥ E[uᵢ(qᵢ, p₋ᵢ)]
This is the same logic as pure strategies, but with expectations.
If player i mixes among multiple pure strategies with positive probability, then in equilibrium those strategies must yield equal expected payoff (otherwise i would shift probability to the better one).
Let Sᵢ⁺ be the support (strategies played with positive probability). Then for any a, a′ ∈ Sᵢ⁺:
E[uᵢ(a, p₋ᵢ)] = E[uᵢ(a′, p₋ᵢ)]
And any strategy outside the support must do no better:
E[uᵢ(a_out, p₋ᵢ)] ≤ E[uᵢ(a, p₋ᵢ)] for a ∈ Sᵢ⁺
This “equalize payoffs” rule is the workhorse for solving 2×2 mixed equilibria.
Suppose Row has strategies R₁,R₂ and Column has C₁,C₂.
Let Column play C₁ with probability q (and C₂ with 1−q).
Row’s expected payoffs from R₁ and R₂ are linear functions of q:
E[u_R(R₁)] = q·u_R(R₁,C₁) + (1−q)·u_R(R₁,C₂)
E[u_R(R₂)] = q·u_R(R₂,C₁) + (1−q)·u_R(R₂,C₂)
If Row mixes between R₁ and R₂ in equilibrium, then set them equal and solve for q.
Similarly, let Row play R₁ with probability p and solve for p by making Column indifferent.
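The indifference recipe can be packaged as a small solver for 2×2 games. This is a hypothetical helper, assuming the equilibrium is fully mixed (both players use both actions, so the denominators are nonzero):

```python
# Sketch: solve a 2x2 mixed equilibrium via the indifference conditions.
# A and B are Row's and Column's payoff matrices (rows = Row's actions).
def mixed_2x2(A, B):
    # Column's mix q makes Row indifferent between its two rows:
    #   q*A[0][0] + (1-q)*A[0][1] = q*A[1][0] + (1-q)*A[1][1]
    q = (A[1][1] - A[0][1]) / (A[0][0] - A[0][1] - A[1][0] + A[1][1])
    # Row's mix p makes Column indifferent between its two columns:
    #   p*B[0][0] + (1-p)*B[1][0] = p*B[0][1] + (1-p)*B[1][1]
    p = (B[1][1] - B[1][0]) / (B[0][0] - B[1][0] - B[0][1] + B[1][1])
    return p, q  # p = P(Row plays R1), q = P(Column plays C1)

# Matching Pennies: both players should mix 50/50.
A = [[1, -1], [-1, 1]]   # Row's payoffs
B = [[-1, 1], [1, -1]]   # Column's payoffs
print(mixed_2x2(A, B))   # (0.5, 0.5)
```

Note the crossed incentives: each player's probabilities are pinned down by the *other* player's payoffs.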
For finite games, Nash (1950–51) proved at least one mixed equilibrium exists. Intuitively:
- Mixed strategies make each player’s strategy space compact and convex.
- Best-response correspondences are then well-behaved (nonempty, convex-valued, upper hemicontinuous).
- Kakutani’s fixed-point theorem guarantees a fixed point, i.e., a profile of mutual best responses.
You don’t need the full proof here, but you should internalize the consequence:
Even when no pure equilibrium exists, the game still has a stable prediction in mixed strategies.
Be careful with interpretation:
- Mixing can mean literal randomization by a single player.
- It can also represent population frequencies (fractions of a population playing each pure strategy) or opponents’ beliefs about play.
In any case, the equilibrium condition remains: no unilateral change improves expected payoff.
In many domains, Nash equilibrium is the default “solution concept”:
- economics (pricing, oligopoly, auctions);
- computer science (routing, resource allocation, algorithmic game theory);
- machine learning (multi-agent learning, adversarial training).
The practical value is that it turns interactive decision-making into a set of consistency constraints: everyone’s strategy must be optimal given the others.
A central theme in mechanism design and market design is that Nash equilibria can be inefficient.
Let social welfare be W(s) = ∑ᵢ uᵢ(s).
The gap between equilibrium welfare and optimal welfare motivates:
- measures such as the price of anarchy (the worst-case ratio of optimal to equilibrium welfare);
- changing the game itself so equilibrium behavior becomes efficient, i.e., mechanism design.
Mechanism design asks: can we design a game so that desirable outcomes occur in equilibrium?
In a mechanism, the equilibrium concept is often Nash equilibrium (or Bayesian Nash equilibrium when there is private information). The designer wants:
- the intended outcome to arise as an equilibrium of the induced game;
- ideally, that equilibrium to be unique or focal, so players actually coordinate on it.
So Nash equilibrium is the “target” condition: the mechanism is successful if the intended outcome is a Nash equilibrium of the induced game.
Zero-sum games are those where u₁(s) = −u₂(s).
For zero-sum games, the Nash equilibrium value (the game value) is stable and can be found via linear programming.
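A minimal sketch of that LP, assuming scipy is available; the `game_value` helper and the variable layout (Row's probabilities plus the value v) are illustrative choices.

```python
# Sketch: compute a zero-sum game's value by linear programming.
import numpy as np
from scipy.optimize import linprog

def game_value(A):
    """Row maximizes v subject to p @ A[:, j] >= v for every column j,
    where p is a probability vector over Row's actions."""
    m, n = A.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0                                # minimize -v  <=>  maximize v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])   # v - p @ A[:, j] <= 0 per column
    b_ub = np.zeros(n)
    A_eq = np.append(np.ones(m), 0.0).reshape(1, -1)  # probabilities sum to 1
    b_eq = [1.0]
    bounds = [(0, None)] * m + [(None, None)]   # p >= 0, v unbounded
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[-1]

# Matching Pennies: the value is 0 and Row mixes 50/50.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
p, v = game_value(A)
```

Column's optimal mix comes from the dual LP (or from `game_value(-A.T)` by symmetry).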
Nash equilibrium is powerful but not the end of the story.
Common limitations:
- It assumes players correctly anticipate others’ strategies.
- Multiple equilibria leave the prediction ambiguous.
- Computing equilibria can be hard in large games.
- It says nothing about how players reach equilibrium (dynamics and learning).
Still, Nash equilibrium is the foundational equilibrium concept that most refinements build upon.
A good working mental model: a Nash equilibrium is a rest point of mutual adjustment. If play ever arrives there, no individual has a reason to move on their own.
That stable-point idea is what makes Nash equilibrium central across economics, CS, and ML (e.g., multi-agent learning and GAN-like setups).
Two players choose C (Cooperate) or D (Defect). Payoffs (Row, Column):
| | C | D |
|---|---|---|
| C | (3, 3) | (0, 5) |
| D | (5, 0) | (1, 1) |
Find all pure-strategy Nash equilibria and compare to the socially optimal outcome.
Step 1: Compute Row’s best responses.
- If Column plays C: Row gets 3 from C and 5 from D, so D is better.
- If Column plays D: Row gets 0 from C and 1 from D, so D is better.
So BR_R(C)=D and BR_R(D)=D.
Step 2: Compute Column’s best responses.
- If Row plays C: Column gets 3 from C and 5 from D, so D is better.
- If Row plays D: Column gets 0 from C and 1 from D, so D is better.
So BR_C(C)=D and BR_C(D)=D.
Step 3: Identify mutual best responses.
The only profile where both are best-responding is (D, D).
Check the Nash inequalities explicitly:
- Row: u_R(D,D) = 1 ≥ u_R(C,D) = 0.
- Column: u_C(D,D) = 1 ≥ u_C(D,C) = 0.
Therefore (D,D) is a Nash equilibrium.
Step 4: Compare to social welfare.
Compute W(s)=u_R(s)+u_C(s):
- W(C,C) = 6, W(C,D) = W(D,C) = 5, W(D,D) = 2.
So (C,C) is socially better, but it is not a Nash equilibrium because each player can unilaterally deviate to D and increase their own payoff (from 3 to 5).
Insight: Nash equilibrium captures incentive stability, not collective optimality. Prisoner’s Dilemma is the canonical demonstration that individually rational behavior can produce inefficient outcomes.
Two players simultaneously choose H (Heads) or T (Tails). Row wins if the coins match; Column wins if they differ. Payoffs (Row, Column):
| | H | T |
|---|---|---|
| H | (1, −1) | (−1, 1) |
| T | (−1, 1) | (1, −1) |
Find the mixed-strategy Nash equilibrium.
Step 1: Observe no pure equilibrium.
Check each cell for mutual best response. For example, at (H,H) Column gets −1 but would get 1 by switching to T, so Column deviates.
So (H,H) fails. Similar reasoning rules out all four cells. Hence no pure Nash equilibrium.
Step 2: Let Column mix.
Let q = P(Column plays H). Then P(Column plays T)=1−q.
Compute Row’s expected payoff from each pure action:
E[u_R(H)] = q·u_R(H,H) + (1−q)·u_R(H,T)
= q·1 + (1−q)·(−1)
= q − (1−q)
= 2q − 1
E[u_R(T)] = q·u_R(T,H) + (1−q)·u_R(T,T)
= q·(−1) + (1−q)·1
= −q + 1 − q
= 1 − 2q
Step 3: Make Row indifferent (so Row is willing to mix).
Set E[u_R(H)] = E[u_R(T)]:
2q − 1 = 1 − 2q
2q + 2q = 1 + 1
4q = 2
q = 1/2
So Column must play H with probability 1/2.
Step 4: By symmetry, make Column indifferent.
Let p = P(Row plays H). Then:
E[u_C(H)] = p·u_C(H,H) + (1−p)·u_C(T,H)
= p·(−1) + (1−p)·1
= −p + 1 − p
= 1 − 2p
E[u_C(T)] = p·u_C(H,T) + (1−p)·u_C(T,T)
= p·1 + (1−p)·(−1)
= p − (1−p)
= 2p − 1
Set equal:
1 − 2p = 2p − 1
1 + 1 = 2p + 2p
2 = 4p
p = 1/2
So Row plays H with probability 1/2.
Step 5: State the equilibrium.
Mixed Nash equilibrium:
- Row plays H with probability 1/2 (and T with 1/2).
- Column plays H with probability 1/2 (and T with 1/2).
Expected payoff to Row is 0 (and to Column is 0), consistent with a fair zero-sum game.
Insight: In mixed equilibrium, each player chooses probabilities that make the other player indifferent among their pure actions—removing profitable unilateral deviations.
Two players choose A or B. They prefer to match, but each equilibrium favors a different convention. Payoffs (Row, Column):
| | A | B |
|---|---|---|
| A | (2, 2) | (0, 0) |
| B | (0, 0) | (1, 1) |
Find all pure Nash equilibria and interpret them.
Step 1: Best responses for Row.
- If Column plays A: Row gets 2 from A and 0 from B, so BR_R(A)=A.
- If Column plays B: Row gets 0 from A and 1 from B, so BR_R(B)=B.
Step 2: Best responses for Column (symmetric): BR_C(A)=A and BR_C(B)=B.
Step 3: Identify mutual best responses.
(A,A) is mutual best response ⇒ Nash equilibrium.
(B,B) is mutual best response ⇒ Nash equilibrium.
The off-diagonal outcomes (A,B) and (B,A) are not equilibria because each player wants to switch to match the other.
Step 4: Interpretation.
There are two stable conventions:
- (A,A) with payoffs (2,2): the Pareto-superior convention.
- (B,B) with payoffs (1,1): stable, but worse for both players.
Nash equilibrium alone does not select between them; additional criteria (history, expectations, risk dominance, communication) may matter.
Insight: Multiple Nash equilibria create an equilibrium selection problem: the theory predicts stability, but not necessarily which stable convention a system will coordinate on.
A strategy profile s = (s₁,…,sₙ) assigns one strategy to every player; s₋ᵢ denotes all strategies except player i’s.
Nash equilibrium means: ∀i, no unilateral deviation improves payoff: uᵢ(sᵢ*, s₋ᵢ*) ≥ uᵢ(sᵢ′, s₋ᵢ*) for all sᵢ′.
Equivalently, each player’s equilibrium strategy is a best response: sᵢ ∈ BRᵢ(s₋ᵢ).
Pure Nash equilibria can be found by checking mutual best responses in payoff matrices; strictly dominated strategies cannot appear in equilibrium.
Some games have no pure Nash equilibrium; allowing mixed strategies (probability distributions) restores existence in finite games.
In mixed equilibrium, any strategy played with positive probability must yield equal expected payoff (indifference over the support).
Nash equilibrium is a stability concept, not an efficiency guarantee—equilibria may be Pareto-inefficient or multiple.
Nash equilibrium is foundational for later topics like mechanism design (designing games with desired equilibria) and zero-sum games (minimax and LP computation).
Confusing Nash equilibrium with “the best outcome” or “social optimum”; Nash is about unilateral incentives, not collective welfare.
Checking only one player’s incentives; an equilibrium requires the no-deviation condition for every player simultaneously.
Assuming mixed strategies mean players are irrational or ‘guessing’; in equilibrium, mixing is an optimal response that removes profitable deviations.
Forgetting that in a mixed equilibrium, only strategies in the support must be equal-payoff; strategies outside the support can be strictly worse.
Consider the game:
| | L | R |
|---|---|---|
| U | (2, 1) | (0, 0) |
| D | (3, 0) | (1, 2) |
Find all pure-strategy Nash equilibria.
Hint: Compute best responses: for each Column choice, which Row action maximizes Row’s payoff? For each Row choice, which Column action maximizes Column’s payoff? Intersections are Nash equilibria.
Row best responses:
- Against L: D gives 3 > 2 from U.
- Against R: D gives 1 > 0 from U.
So Row’s best response is always D (D strictly dominates U).
Column best responses:
- Against U: L gives 1 > 0 from R.
- Against D: R gives 2 > 0 from L.
Mutual best response occurs at (D,R) only.
Check: given R, Row prefers D (1 > 0). Given D, Column prefers R (2 > 0). Thus (D,R) is the unique pure Nash equilibrium.
In Matching Pennies, suppose Row gets payoff +2 when matching and −1 when not matching (Column gets the negative). The payoff matrix (Row, Column) is:
| | H | T |
|---|---|---|
| H | (2, −2) | (−1, 1) |
| T | (−1, 1) | (2, −2) |
Find the mixed-strategy Nash equilibrium probabilities.
Hint: Let q = P(Column plays H). Compute Row’s expected payoff from H vs T and set them equal to solve for q. By symmetry you can solve for p = P(Row plays H) similarly.
Let q = P(Column plays H).
Row’s expected payoff if playing H:
E[u_R(H)] = q·2 + (1−q)·(−1)
= 2q − 1 + q
= 3q − 1
Row’s expected payoff if playing T:
E[u_R(T)] = q·(−1) + (1−q)·2
= −q + 2 − 2q
= 2 − 3q
Set equal for indifference:
3q − 1 = 2 − 3q
3q + 3q = 2 + 1
6q = 3
q = 1/2
So Column plays H with probability 1/2.
Now let p = P(Row plays H). Column’s payoffs are the negatives of Row’s, so Column is indifferent when p·(−2) + (1−p)·1 = p·1 + (1−p)·(−2), i.e., 1 − 3p = 3p − 2, giving p = 1/2.
Thus the mixed Nash equilibrium is: Row plays H with 1/2, Column plays H with 1/2.
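A quick numeric sanity check of the indifference condition at q = 1/2 (a trivial sketch):

```python
# Check: at q = 1/2, Row's two pure actions earn the same expected payoff
# in the modified Matching Pennies game (+2 for a match, -1 otherwise).
q = 0.5
u_H = q * 2 + (1 - q) * (-1)   # Row plays H
u_T = q * (-1) + (1 - q) * 2   # Row plays T
print(u_H, u_T)  # 0.5 0.5
```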
Show that if a player has a strictly dominant strategy (one that yields a strictly higher payoff than every other strategy no matter what others do), then in any Nash equilibrium that player must play it.
Hint: Use the definition of Nash equilibrium: at equilibrium, a player’s chosen strategy must be a best response to s₋ᵢ. Compare the dominant strategy to any alternative at s₋ᵢ.
Let player i have a strictly dominant strategy dᵢ ∈ Sᵢ. By definition, for every alternative sᵢ′ ≠ dᵢ and every s₋ᵢ,
uᵢ(dᵢ, s₋ᵢ) > uᵢ(sᵢ′, s₋ᵢ).
Consider any Nash equilibrium s* = (sᵢ*, s₋ᵢ*). Suppose for contradiction that sᵢ* ≠ dᵢ. Then applying strict dominance at s₋ᵢ* gives:
uᵢ(dᵢ, s₋ᵢ*) > uᵢ(sᵢ*, s₋ᵢ*).
But that means player i can unilaterally deviate from sᵢ* to dᵢ and strictly increase payoff, contradicting the Nash condition:
uᵢ(sᵢ*, s₋ᵢ*) ≥ uᵢ(sᵢ′, s₋ᵢ*) for all sᵢ′.
Therefore, in any Nash equilibrium, player i must play the strictly dominant strategy dᵢ.