If n+1 items in n boxes, at least one box has 2+ items.
Deep-dive lesson - accessible entry point but dense material. Use worked examples and spaced repetition.
You don’t need probability, algebra, or heavy machinery to prove some surprisingly strong statements—sometimes a simple counting mismatch forces a guaranteed collision. The pigeonhole principle is the canonical example: if you try to squeeze “too many” items into “too few” boxes, at least one box must get crowded.
Model a situation as a function f: Items → Boxes where each item goes to exactly one box. If there are more items than boxes, then f cannot be one-to-one, so ∃ a box containing ≥ 2 items. More generally, with m items and n boxes, some box has at least ⌈m/n⌉ items.
In discrete math, many problems look complicated because they describe real-world objects (people, socks, birthdays, edges in a graph, remainders mod n) rather than clean numbers. The pigeonhole principle is a translation tool: it turns a messy statement into a statement about counting.
The core motivation is simple:
then crowding is unavoidable.
This is not about “likely” outcomes; it’s a certainty.
If n + 1 items are placed into n boxes, then at least one box contains at least 2 items.
Here:
A clean way to express “each item goes to exactly one box” is with a function.
Let:
Then for each item i ∈ I, f(i) is the box it goes into.
If m > n, then f cannot be injective (one-to-one). That means:
∃ i₁, i₂ ∈ I with i₁ ≠ i₂ and f(i₁) = f(i₂)
That equality means two distinct items land in the same box: a collision.
If m > n and f: I → B, then ∃ b ∈ B such that |f⁻¹({b})| ≥ 2.
This reads: there exists a box b whose preimage has at least 2 items.
Pigeonhole arguments are powerful because they:
In many contest and interview settings, the hardest part is not the counting—it’s choosing the right “boxes.” That design choice is the art of pigeonhole proofs.
The simplest pigeonhole principle is the seed from which most applications grow. If you can recognize a situation where “more than n items are forced into n categories,” you immediately get a guaranteed repetition.
If n + 1 items are placed into n boxes, then at least one box contains at least 2 items.
Assume for contradiction that no box contains ≥ 2 items. Then every box contains at most 1 item.
If there are n boxes, the maximum number of items you could place is:
1 + 1 + … + 1 (n times) = n
But we are told there are n + 1 items. So we get:
n + 1 ≤ n
which is impossible.
Therefore, the assumption was false, so at least one box contains ≥ 2 items.
In practice, the statement “items” and “boxes” is metaphorical. You decide what they are.
Common choices for boxes:
What makes a good box choice?
| Viewpoint | What you track | What you conclude | When it helps |
|---|---|---|---|
| “Physical bins” | literal distribution | some bin has ≥ 2 | intuitive starts |
| Function f: Items → Boxes | injective vs not | collision f(i₁)=f(i₂) | clean proofs |
| Counting upper bound | max items with constraint | contradiction | many olympiad proofs |
It does not tell you:
It only guarantees existence.
This is a feature: you often only need “there exists …” to finish a proof.
The basic version answers: “if m = n + 1, then some box has 2.”
But many problems naturally have:
and you want a guaranteed minimum crowding level. The generalized pigeonhole principle gives a tight bound.
If m items are placed into n boxes, then at least one box contains at least ⌈m/n⌉ items.
Here ⌈x⌉ is the ceiling function: the smallest integer ≥ x.
If you spread m items as evenly as possible over n boxes, you aim for about m/n items per box.
So the maximum load is at least the ceiling of the average.
Assume for contradiction that every box contains at most ⌈m/n⌉ − 1 items.
Then the total number of items is at most:
n(⌈m/n⌉ − 1)
We now show this is < m.
Because ⌈m/n⌉ is the smallest integer ≥ m/n, we know:
⌈m/n⌉ − 1 < m/n
Multiply both sides by n (note n > 0):
n(⌈m/n⌉ − 1) < m
So under our assumption, total items < m, contradicting that there are m items.
Therefore, at least one box has ≥ ⌈m/n⌉ items.
You can’t generally guarantee more than ⌈m/n⌉.
Example: m = 10, n = 3.
You can distribute as 4, 3, 3.
No box has 5, so “≥ 5” would be false as a guarantee. Thus ⌈m/n⌉ is the correct bound.
This is the contrapositive style you’ll use constantly:
If every box has at most k items, then total items m is at most nk.
So if you ever have m > nk, you can instantly conclude:
∃ a box with ≥ k + 1 items.
That is often the quickest way to set up a problem.
1) Force a collision
2) Force a heavy box
Both are the same idea; they just target different conclusions.
A good pigeonhole proof often looks like a trick because it introduces a category system (the boxes) that wasn’t explicitly in the problem statement. Once you see that category system, the rest is mechanical.
So the practical skill is: choose boxes so that sharing a box implies the property you want.
Below are several common application templates.
Remainders partition integers into exactly n boxes:
0, 1, 2, …, n − 1
If two integers land in the same remainder box, their difference is divisible by n.
Formally: if a ≡ b (mod n), then n ∣ (a − b).
To prove: “Among many integers, two differ by a multiple of n.”
If you split a range into n disjoint intervals (“boxes”), any set of n + 1 points must place two points into the same interval.
Then you get a bound on distance.
Example pattern:
This is a deterministic cousin of “points are dense somewhere.”
A set of size k has 2ᵏ subsets. If you can map objects into these subsets (or into bit patterns), you can count boxes via powers of 2.
Common box counts:
This is often used to force two objects to share a pattern.
In graphs, you often classify vertices by some discrete feature:
Then pigeonhole yields repeated degrees, repeated colors, or a crowded color class.
Classic quick example: In any simple graph on n ≥ 2 vertices, two vertices have the same degree.
Notice how the key is not just counting boxes, but sometimes tightening the box count (n − 1 instead of n).
1) Identify the objects you can count (potential items).
2) Invent a classification that is:
3) Count boxes.
4) Compare counts (m vs n, or m vs nk).
5) Translate the collision/crowding back into the problem’s language.
You should reach for pigeonhole when you see phrases like:
Pigeonhole is one of the fastest ways to turn “no matter how” into a proof.
Let n be a positive integer. Given n + 1 integers a₁, a₂, …, aₙ₊₁, prove that there exist i ≠ j such that aᵢ − aⱼ is divisible by n.
Define the boxes as the possible remainders modulo n:
B = {0, 1, 2, …, n − 1}.
There are n boxes.
For each integer a_k, compute its remainder r_k where
r_k ≡ a_k (mod n) and r_k ∈ B.
This assigns each item a_k to exactly one box r_k.
We have n + 1 items (the integers) and n boxes (remainders). By the pigeonhole principle, ∃ i ≠ j such that r_i = r_j.
Translate “same remainder” back into divisibility:
r_i = r_j ⇒ a_i ≡ a_j (mod n)
⇒ n ∣ (a_i − a_j).
Insight: The entire proof is: many integers → few remainders. The “box design” (remainders) is what makes the collision meaningful, because equal remainders are exactly the condition for a difference to be divisible by n.
In a group of 100 people, prove that there is a month in which at least 9 of them have birthdays (ignore leap day issues).
Choose boxes: the 12 months of the year. So n = 12.
Choose items: the 100 people. So m = 100.
Apply the generalized pigeonhole principle:
Some month has at least ⌈m/n⌉ = ⌈100/12⌉ people with birthdays in that month.
Compute the ceiling carefully:
100/12 ≈ 8.333...
⌈100/12⌉ = 9.
Conclude: ∃ a month with ≥ 9 birthdays in the group.
Insight: The ceiling ⌈m/n⌉ is the formal version of “at least the average, rounded up.” Even if birthdays were adversarially arranged, you can’t keep every month at 8 or fewer because 12·8 = 96 < 100.
Pick 11 real numbers in the interval [0, 1]. Prove that there exist two of them whose distance is at most 0.1.
Design boxes by partitioning [0, 1] into 10 disjoint subintervals of equal length 0.1:
[0, 0.1), [0.1, 0.2), …, [0.9, 1].
There are n = 10 boxes.
The 11 real numbers are the items, so m = 11.
By the pigeonhole principle (11 items into 10 boxes), two of the chosen numbers, call them x and y, lie in the same subinterval.
Any two numbers in the same subinterval of length 0.1 differ by at most 0.1:
If x, y ∈ [k/10, (k+1)/10), then
|x − y| ≤ (k+1)/10 − k/10 = 0.1.
Conclude ∃ x ≠ y with |x − y| ≤ 0.1.
Insight: Intervals act like “continuous-looking” pigeonholes. The key is choosing the number of intervals so that ‘same interval’ implies the distance bound you want.
Pigeonhole principle is a counting inevitability: too many items forced into too few boxes ⇒ a collision or crowding exists.
Model assignments as a function f: Items → Boxes; if |Items| > |Boxes| then f is not injective, so ∃ two items in the same box.
Generalized form: with m items and n boxes, some box has at least ⌈m/n⌉ items.
A common contrapositive style: if every box has ≤ k items, then total items m ≤ nk; therefore if m > nk, some box has ≥ k + 1.
Most of the difficulty is choosing the right boxes so that “same box” implies a useful shared property (same remainder, same interval, same category).
Pigeonhole is deterministic (guaranteed), not probabilistic (likely).
It often pairs naturally with proof by contradiction: assume no box is crowded, compute a maximum total, and contradict the given count.
Choosing boxes that are not mutually exclusive (an item could belong to multiple boxes) or not exhaustive (some item fits no box), breaking the function/assignment model.
Forgetting the ceiling in the generalized principle (using m/n instead of ⌈m/n⌉), which can make the bound wrong by 1.
Counting the number of boxes incorrectly (e.g., using n possible degrees in a graph when one degree value is impossible, or double-counting intervals with overlapping endpoints).
Concluding more than pigeonhole guarantees (e.g., claiming two collisions, or identifying which box is crowded) without additional argument.
Show that among any 13 integers, there exist two whose difference is divisible by 12.
Hint: Use remainders modulo 12 as boxes.
Let the 12 boxes be the remainders {0,1,…,11} modulo 12. Each of the 13 integers has exactly one remainder, so we assign each integer to one box. With 13 items and 12 boxes, pigeonhole implies two integers a and b share a remainder: a ≡ b (mod 12). Therefore 12 ∣ (a − b).
You choose 51 numbers from the set {1, 2, …, 100}. Prove that at least one chosen pair sums to 101.
Hint: Design boxes as complementary pairs (1,100), (2,99), …
Partition {1,…,100} into 50 boxes, each box a complementary pair that sums to 101:
{1,100}, {2,99}, …, {50,51}.
There are n = 50 boxes. Selecting 51 numbers means m = 51 items. By pigeonhole, some box contains at least 2 chosen numbers. But each box has exactly 2 numbers and they sum to 101, so at least one chosen pair sums to 101.
In any simple graph with 6 vertices, prove that there exist two vertices with the same degree.
Hint: Possible degrees are 0 through 5, but not all can occur simultaneously. Tighten the box count.
Each vertex has degree in {0,1,2,3,4,5}. However, a graph cannot have both a vertex of degree 0 and a vertex of degree 5: if some vertex is connected to all others (degree 5), then no vertex is isolated (degree 0). So the set of degrees that can appear has size at most 5 (either exclude 0 or exclude 5). Thus we have 6 vertices (items) assigned to at most 5 degree-values (boxes). By pigeonhole, at least two vertices share the same degree.