Reductions proving problems equally hard. SAT, 3-SAT.
NP-completeness is the craft of explaining, with surgical precision, why a problem is “as hard as” every problem in NP—by converting any NP problem into it using a polynomial-time reduction. Once you can do reductions, you can transfer hardness like a conserved quantity.
To prove a problem X is NP-complete: (1) show X ∈ NP via a polynomial-time verifier, and (2) show NP-hardness by giving a polynomial-time many-one (Karp) reduction A ≤ₚ X from a known NP-complete problem A (often SAT or 3-SAT). A reduction is a poly-time transformation f such that x ∈ A ⇔ f(x) ∈ X.
When you face a new decision problem—say, a scheduling constraint system or a graph property—you want to know whether it is likely to have a polynomial-time algorithm.
For many problems, we do not know how to solve them efficiently, and we also do not know how to prove they’re impossible to solve efficiently (because proving lower bounds is hard). NP-completeness gives a pragmatic middle ground:
NP-completeness is therefore a classification tool: it lets you explain hardness by relating a new problem to a canonical hard problem.
NP-completeness is fundamentally about decision problems, not optimization problems.
This matters because NP is defined via verifiers for yes-instances.
Let X be a decision problem.
X is NP-complete iff both hold:
1) Membership: X ∈ NP
Meaning: there exists a polynomial p(n) and a polynomial-time verifier V such that x ∈ X ⇔ there is a certificate c with |c| ≤ p(|x|) and V(x, c) accepts.
2) Hardness: For every A ∈ NP, A ≤ₚ X
Meaning: every NP problem reduces to X under polynomial-time many-one reductions.
This “hardness” clause is why NP-complete problems are considered the hardest problems in NP.
A problem X is NP-hard if for every A ∈ NP, A ≤ₚ X, but X may not be in NP.
Examples of why “not in NP” can happen: the halting problem is NP-hard but undecidable, and TQBF is NP-hard but only known to lie in PSPACE.
NP-complete = NP-hard + in NP.
The practical workflow of NP-completeness proofs relies on one key historical theorem: the Cook–Levin theorem, which states that SAT is NP-complete.
SAT (Boolean satisfiability) asks: given a Boolean formula φ, is there an assignment to its variables that makes φ true?
From SAT, we get many reductions. Often we use 3-SAT, a restricted form where φ is in CNF (AND of clauses) and each clause has exactly 3 literals.
Why 3-SAT is so useful: it retains SAT’s full hardness while imposing a rigid, uniform clause structure, which makes gadget design far easier.
We write A ≤ₚ B to mean “A is polynomial-time many-one reducible to B.”
Interpretation: B is at least as hard as A, because a fast algorithm for B would yield a fast algorithm for A.
So reductions are not just technicalities—they are the directional arrows that carry hardness.
Suppose you have two decision problems A and B. If you can solve B quickly, and you can convert instances of A into instances of B quickly, then you can solve A quickly too.
That is the operational meaning of a reduction: it is a translator from A to B.
In NP-completeness, we use this in contrapositive form: if A is known to be hard, then B must be hard as well.
A ≤ₚ B if there exists a function f computable in polynomial time such that for all instances x:
x ∈ A ⇔ f(x) ∈ B.
Key features: f is computed once, deterministically, in polynomial time, with no oracle calls, and it preserves the YES/NO answer exactly.
Think of f as building a “mirror instance” in B that has a YES answer exactly when the original instance in A has a YES answer.
The most common mistake in early NP-completeness proofs is reversing the arrow.
If you want to prove B is NP-hard, you must reduce from a known hard problem A to B:
A ≤ₚ B.
Why? Because this shows: if we could solve B efficiently, we could solve A efficiently.
If you instead show B ≤ₚ A, you only show B is no harder than A (which does not prove hardness of B).
If you have an algorithm ALG_B for B, and a reduction f from A to B, you get an algorithm for A:
Runtime: computing f(x) takes poly(‖x‖) time, so ‖f(x)‖ is polynomial in ‖x‖, and running ALG_B on f(x) takes poly(‖f(x)‖) time.
So the composition is polynomial.
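The composition can be sketched as plain function composition (a minimal sketch; `solve_A`, `reduce_A_to_B`, and `solve_B` are illustrative names, not from any library, and the toy instance below exists only to show the plumbing):

```python
def solve_A(x, reduce_A_to_B, solve_B):
    """Decide A by translating the instance and calling a decider for B.

    reduce_A_to_B: poly-time function f with x in A  <=>  f(x) in B.
    solve_B: decision procedure for B.
    """
    y = reduce_A_to_B(x)   # runs in poly(|x|) time, so |y| <= poly(|x|)
    return solve_B(y)      # runs in poly(|y|) <= poly(poly(|x|)) time

# Toy illustration with a trivial reduction (the identity map):
is_even = lambda n: n % 2 == 0
identity = lambda n: n
assert solve_A(4, identity, is_even) is True
assert solve_A(7, identity, is_even) is False
```

The point is structural: the only algorithmic content of a reduction proof is the translator f; the solver for B is treated as a black box.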
To prove correctness of a reduction, you almost always prove two implications:
1) (→) If x is YES for A, then f(x) is YES for B.
2) (←) If f(x) is YES for B, then x is YES for A.
Write them explicitly. Don’t rely on intuition.
Many reductions, especially from 3-SAT to graph problems, are built from gadgets: small local structures that encode variables, clauses, and the constraints between them.
A gadget is not just a trick: it is the reduction’s way of guaranteeing the equivalence x ∈ A ⇔ f(x) ∈ B.
It can feel paradoxical that 3-SAT is NP-complete even though it’s a special case of SAT.
The key idea is that SAT ≤ₚ 3-SAT: any formula can be converted to an equisatisfiable 3-CNF formula in polynomial time.
So 3-SAT is at least as hard as SAT.
| Notion | What it transforms | Output type | Used for | Typical symbol |
|---|---|---|---|---|
| Many-one reduction | Single instance | Single instance | NP-completeness proofs | ≤ₚ |
| Turing reduction | Instance with oracle calls | Multiple queries | Finer-grained complexity | ≤ᵀ |
| Cook reduction (older usage) | Oracle-based | Multiple queries | Sometimes for NP-hardness | (varies) |
For standard NP-completeness in algorithms courses, ≤ₚ means Karp (many-one) reduction.
When you propose f: define it explicitly, bound its running time, and prove both directions of the equivalence.
A reduction must be constructive and efficient. You are not allowed to “guess a satisfying assignment” during the reduction.
NP-completeness proofs can look creative, but they are remarkably templated. The creativity is mostly in designing the right gadgets; the overall logic is consistent.
To prove a problem X is NP-complete, you usually do:
1) Show X ∈ NP (verification)
2) Show A ≤ₚ X for some known NP-complete A (hardness)
The second step is where reductions happen.
You must identify: what a certificate (YES witness) looks like, and how a verifier checks it in polynomial time.
Examples of certificates: a satisfying assignment for SAT, a vertex subset for CLIQUE or VERTEX COVER, a permutation of the vertices for HAMILTONIAN CYCLE.
Verifier logic: check the claimed structure quickly.
This step is often easy, but it is not optional.
You almost never reduce from “an arbitrary NP problem.” Instead, reduce from something already known NP-complete.
Good sources: SAT, 3-SAT, CLIQUE, INDEPENDENT SET, VERTEX COVER, HAMILTONIAN CYCLE, SUBSET SUM, 3-COLORING.
Pick a source that “resembles” your target.
You are not required to re-prove Cook–Levin each time, but understanding what it says helps you trust SAT as a starting point.
Cook–Levin idea (conceptual): the computation of any polynomial-time verifier on a given input can be encoded as a Boolean formula that is satisfiable exactly when some certificate makes the verifier accept.
So SAT captures the existence of a polynomially verifiable witness.
A common chain:
SAT ≤ₚ 3-SAT ≤ₚ (your problem)
So you can use 3-SAT directly as the NP-complete source.
3-SAT instance format: variables x₁,…,xₙ and clauses C₁,…,Cₘ, each clause an OR of exactly 3 literals, with the whole formula an AND of the clauses.
When proving A ≤ₚ X:
1) Define f: Given an instance I of A, construct instance f(I) of X.
2) Polynomial time: argue construction is O(poly(size(I))).
3) Correctness: prove I ∈ A ⇔ f(I) ∈ X via both implications.
Your correctness argument often relies on mapping: solutions of I to solutions of f(I), and solutions of f(I) back to solutions of I.
Most NP-complete problems have this feel:
3-SAT: choose truth values; satisfy all clauses.
Graph coloring: choose colors; satisfy adjacency constraints.
Clique: choose vertices; satisfy pairwise adjacency.
Reductions translate one set of choices/constraints into another.
Even if f is conceptually correct, it must not blow up the instance exponentially.
If your input has n variables and m clauses, a typical gadget reduction produces: O(n + m) gadgets and a polynomial number of vertices and edges overall.
If you produce something like 2ⁿ vertices, you have not built a polynomial-time reduction.
A subtle but important point: hardness reductions cannot “use nondeterminism.” They are ordinary deterministic polynomial-time algorithms.
If X is NP-complete, then: a polynomial-time algorithm for X would imply P = NP.
In practice, we treat NP-completeness as strong evidence that we should not search for exact poly-time algorithms for the general case.
NP-completeness is about worst-case asymptotic complexity. It does not mean: that every instance is hard, that typical instances are intractable, or that good approximations and heuristics are impossible.
Once you have an NP-completeness classification, you usually pivot to one of these strategies:
| Strategy | Goal | When it works | Example follow-up node |
|---|---|---|---|
| Approximation algorithms | Near-optimal solutions provably close to best | Optimization versions with structure | Approximation Algorithms |
| Parameterized algorithms | Exponential in a small parameter k, poly in n | Small treewidth, small k, etc. | (often separate node) |
| Special cases / restrictions | Exact poly-time on restricted inputs | Planar graphs, bounded degree, etc. | Graph Coloring |
| Heuristics / SAT solvers / ILP | Fast in practice on many instances | Industrial instances, structured constraints | (SAT/ILP tooling) |
SAT is not just theoretical. Modern SAT solvers can handle massive instances and are used in: hardware and software verification, planning and scheduling, and package dependency resolution.
So NP-completeness does not imply “never solvable”—it implies “don’t expect a guaranteed polynomial-time exact algorithm for all instances.”
Even outside NP-completeness, reductions teach you how to: translate between problem representations, preserve solution structure across translations, and account for the resources a translation consumes.
These are core skills in algorithms, complexity, and even systems (e.g., compiling one representation into another).
NP-completeness lives inside a bigger ecosystem: classes such as P, NP, co-NP, PSPACE, and EXPTIME, related by inclusions and open questions.
If you continue into Computational Complexity Theory, you’ll see that the same proof discipline (define reductions carefully; track resources) generalizes to other resources besides time.
Goal: prove CLIQUE is NP-hard by reducing from 3-SAT.
CLIQUE decision problem:
Input: graph G = (V, E) and integer k.
Question: does G contain a clique of size k (a set of k vertices all pairwise adjacent)?
3-SAT instance: φ with m clauses C₁,…,Cₘ, each clause has 3 literals.
We will construct (G, k) such that φ is satisfiable ⇔ G has a clique of size k.
Construction f(φ):
For each clause Cᵢ and each literal ℓ in Cᵢ, create a vertex v(i, ℓ).
So there are 3m vertices total (one per literal occurrence).
Add edges to represent compatibility:
Connect v(i, ℓ) to v(j, ℓ′) iff: i ≠ j (the vertices come from different clauses) and ℓ is not the negation of ℓ′ (the literals are not contradictory).
In other words, do NOT connect contradictory literals like x and ¬x.
Set k = m (we want to pick one literal from each clause).
Polynomial time check:
We create O(m) vertices and compare pairs across clauses.
There are O((3m)²) = O(m²) potential edges, each decided by a constant-time consistency check.
So the construction is polynomial in m (and in the size of φ).
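The construction above can be sketched in code (a sketch, not a canonical implementation; here a literal is encoded as a signed integer, e.g. 2 for x₂ and -2 for ¬x₂):

```python
from itertools import combinations

def sat3_to_clique(clauses):
    """Reduce a 3-SAT instance to a CLIQUE instance (G, k).

    clauses: list of clauses, each a tuple of literals; a literal is a
    nonzero int, where -v denotes the negation of variable v.
    Returns (vertices, edges, k) with k = number of clauses.
    """
    # One vertex per literal occurrence, tagged with its clause index.
    vertices = [(i, lit) for i, clause in enumerate(clauses) for lit in clause]
    edges = set()
    for (i, a), (j, b) in combinations(vertices, 2):
        # Connect vertices from different clauses whose literals
        # are not contradictory (x and ¬x).
        if i != j and a != -b:
            edges.add(((i, a), (j, b)))
    return vertices, edges, len(clauses)

# φ = (x1 ∨ x2 ∨ x3) ∧ (¬x1 ∨ ¬x2 ∨ x3): 6 vertices, k = 2,
# and 7 of the 9 cross-clause pairs are compatible.
V, E, k = sat3_to_clique([(1, 2, 3), (-1, -2, 3)])
assert len(V) == 6 and len(E) == 7 and k == 2
```

Note that the code never looks at truth assignments; it only performs local syntactic checks, which is exactly what keeps the reduction polynomial.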
Correctness (→): assume φ is satisfiable.
Then there exists a truth assignment that makes every clause true.
For each clause Cᵢ, pick one literal ℓᵢ in Cᵢ that is true under the assignment.
Consider the vertices v(i, ℓᵢ) for i = 1…m.
Any two chosen literals are simultaneously true, so they cannot be contradictory.
Thus for i ≠ j, v(i, ℓᵢ) is connected to v(j, ℓⱼ).
Therefore these m vertices form a clique of size m.
Correctness (←): assume G has a clique of size m.
Because edges only connect vertices from different clauses, the clique can contain at most one vertex per clause.
Having size m implies it contains exactly one vertex from each clause: v(1, ℓ₁), …, v(m, ℓₘ).
Because it is a clique, no pair (ℓᵢ, ℓⱼ) is contradictory.
Now define an assignment by setting variables to satisfy all selected literals (this is consistent because no contradictions appear).
Then each clause Cᵢ has its selected literal ℓᵢ satisfied, so every clause is satisfied.
Hence φ is satisfiable.
Insight: This reduction turns “choose one true literal per clause, consistently” into “choose one vertex per clause that are all pairwise compatible.” CLIQUE’s pairwise adjacency condition is doing the global consistency work.
Goal: show VERTEX COVER is NP-complete. We’ll focus on the hardness part using a chain of reductions.
VERTEX COVER decision problem:
Input: graph G = (V, E) and integer k.
Question: is there a set S ⊆ V with |S| ≤ k such that every edge has at least one endpoint in S?
We will use known relationships between: CLIQUE, INDEPENDENT SET, and VERTEX COVER.
Key graph facts:
1) S is an independent set in G ⇔ V \ S is a vertex cover in G.
2) S is a clique in G ⇔ S is an independent set in the complement graph G̅.
We’ll start from the previous example: 3-SAT ≤ₚ CLIQUE.
Step A: CLIQUE ≤ₚ INDEPENDENT SET
Given an instance (G, k) of CLIQUE, construct the complement graph G̅.
Set k′ = k.
Claim:
G has a clique of size k ⇔ G̅ has an independent set of size k.
Proof of the claim:
Let S ⊆ V.
S is a clique in G means:
∀u, v ∈ S with u ≠ v, (u, v) ∈ E(G).
In the complement graph:
(u, v) ∈ E(G̅) ⇔ (u, v) ∉ E(G).
So:
S is a clique in G
⇔ ∀u ≠ v in S, (u, v) ∉ E(G̅)
⇔ S is an independent set in G̅.
Polynomial time:
Constructing G̅ takes O(|V|²) time if using an adjacency matrix (or similar polynomial time in the input size).
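Step A is short enough to write out directly (a sketch; the function name and graph encoding are illustrative):

```python
from itertools import combinations

def clique_to_independent_set(vertices, edges, k):
    """Reduce CLIQUE (G, k) to INDEPENDENT SET (G_bar, k) via the complement.

    vertices: list of hashable vertex labels; edges: iterable of pairs.
    """
    edge_set = {frozenset(e) for e in edges}
    # An edge is in the complement iff it is absent from G.
    comp_edges = {frozenset(p) for p in combinations(vertices, 2)
                  if frozenset(p) not in edge_set}
    return vertices, comp_edges, k

# Triangle on {1, 2, 3}: a 3-clique in G becomes a size-3 independent
# set in G_bar, which here has no edges at all.
V, Ebar, k = clique_to_independent_set([1, 2, 3], [(1, 2), (2, 3), (1, 3)], 3)
assert Ebar == set()
```

The O(|V|²) loop over vertex pairs mirrors the adjacency-matrix bound stated above.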
Step B: INDEPENDENT SET ≤ₚ VERTEX COVER
Given (H, k) for INDEPENDENT SET, output (H, |V(H)| − k) for VERTEX COVER.
Claim:
H has an independent set of size k ⇔ H has a vertex cover of size |V| − k.
Proof of the claim:
(→) Suppose S is an independent set with |S| = k.
Consider T = V \ S.
Any edge cannot have both endpoints in S (since S is independent), so every edge has at least one endpoint in T.
Thus T is a vertex cover.
And |T| = |V| − |S| = |V| − k.
(←) Suppose T is a vertex cover with |T| = |V| − k.
Let S = V \ T.
If there were an edge entirely within S, that edge would have neither endpoint in T, contradicting that T covers all edges.
So S is independent and |S| = k.
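Step B is even simpler, since the graph is unchanged and only the target size flips (a sketch; the helper `is_vertex_cover` is an illustrative checker, not part of the reduction itself):

```python
def independent_set_to_vertex_cover(vertices, edges, k):
    """Reduce INDEPENDENT SET (H, k) to VERTEX COVER (H, |V| - k).

    Correct because S is independent  <=>  V \\ S is a vertex cover.
    """
    return vertices, edges, len(vertices) - k

def is_vertex_cover(vertices, edges, cover):
    """Check that every edge has at least one endpoint in `cover`."""
    return all(u in cover or v in cover for (u, v) in edges)

# Path 1-2-3: S = {1, 3} is independent (k = 2), so the complement
# T = {2} should be a vertex cover of size |V| - k = 1.
V, E, kp = independent_set_to_vertex_cover([1, 2, 3], [(1, 2), (2, 3)], 2)
assert kp == 1
assert is_vertex_cover(V, E, {2})
```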
Chain them:
3-SAT ≤ₚ CLIQUE ≤ₚ INDEPENDENT SET ≤ₚ VERTEX COVER.
By transitivity of ≤ₚ:
3-SAT ≤ₚ VERTEX COVER.
So VERTEX COVER is NP-hard.
Finish NP-completeness with membership:
VERTEX COVER ∈ NP because a certificate is the set S of vertices, and a verifier checks in O(|E|) time that every edge has an endpoint in S.
Insight: Not every NP-completeness proof needs a fresh gadget reduction. Sometimes you can reuse a powerful reduction once (like 3-SAT ≤ₚ CLIQUE) and then move between graph problems using crisp equivalences.
Goal: justify why 3-SAT can be used as a canonical NP-complete source.
We want a polynomial-time transformation that takes an arbitrary Boolean formula φ and outputs a 3-CNF formula ψ such that:
φ is satisfiable ⇔ ψ is satisfiable.
A full construction has multiple stages (e.g., convert to CNF, then to 3-CNF). Here we focus on the core 3-literal clause normalization step that preserves satisfiability.
Assume we already have a CNF formula:
φ = ∧ᵢ Cᵢ
where each clause Cᵢ is a disjunction of literals and may have length rᵢ different from 3.
Case 1: clause length r = 1.
C = (ℓ)
Convert to:
(ℓ ∨ a ∨ b) ∧ (ℓ ∨ a ∨ ¬b) ∧ (ℓ ∨ ¬a ∨ b) ∧ (ℓ ∨ ¬a ∨ ¬b)
for fresh variables a, b.
Reason: for any truth values of a and b, exactly one of the four clauses has both auxiliary literals false, which forces ℓ to be true; conversely, if ℓ is true, all four clauses hold.
So the conjunction is satisfiable ⇔ the unit clause (ℓ) is satisfiable.
Case 2: clause length r = 2.
C = (ℓ₁ ∨ ℓ₂)
Convert to:
(ℓ₁ ∨ ℓ₂ ∨ a) ∧ (ℓ₁ ∨ ℓ₂ ∨ ¬a)
for fresh a.
Reason: for either truth value of a, one of the two clauses reduces to (ℓ₁ ∨ ℓ₂), so the pair forces C; conversely, if C holds, both clauses hold.
So the pair is satisfiable ⇔ C is satisfiable.
Case 3: clause length r > 3.
C = (ℓ₁ ∨ ℓ₂ ∨ … ∨ ℓᵣ)
Introduce fresh variables y₁,…,yᵣ₋₃ and replace C by:
(ℓ₁ ∨ ℓ₂ ∨ y₁) ∧ (¬y₁ ∨ ℓ₃ ∨ y₂) ∧ … ∧ (¬yᵣ₋₄ ∨ ℓᵣ₋₂ ∨ yᵣ₋₃) ∧ (¬yᵣ₋₃ ∨ ℓᵣ₋₁ ∨ ℓᵣ)
Reason (→):
If C is true, pick the first true literal among ℓⱼ and set the y’s to satisfy the chain.
Reason (←):
If the chain is satisfiable, at least one ℓⱼ must be true; otherwise the chain forces contradictions through y variables.
Polynomial size/time:
Each clause of length r becomes O(r) clauses, using O(r) fresh variables.
Total blow-up is linear in the total number of literals across all clauses, hence polynomial.
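All three cases can be collected into one normalization routine (a sketch under the same literal encoding as before: signed integers, with fresh variable indices drawn from an iterator):

```python
from itertools import count

def normalize_clause(clause, fresh):
    """Rewrite one CNF clause into equisatisfiable 3-literal clauses.

    clause: tuple of literals (nonzero ints, -v = negation of v).
    fresh: iterator yielding unused variable indices.
    """
    r = len(clause)
    if r == 3:
        return [clause]
    if r == 2:  # Case 2: pad with a fresh variable in both polarities.
        a = next(fresh)
        return [clause + (a,), clause + (-a,)]
    if r == 1:  # Case 1: pad with two fresh variables, all 4 polarities.
        a, b = next(fresh), next(fresh)
        l = clause[0]
        return [(l, a, b), (l, a, -b), (l, -a, b), (l, -a, -b)]
    # Case 3 (r > 3): thread the clause through a chain of fresh variables.
    ys = [next(fresh) for _ in range(r - 3)]
    out = [(clause[0], clause[1], ys[0])]
    for i in range(r - 4):
        out.append((-ys[i], clause[i + 2], ys[i + 1]))
    out.append((-ys[-1], clause[-2], clause[-1]))
    return out

# A 5-literal clause becomes r - 2 = 3 three-literal clauses
# using r - 3 = 2 fresh variables (numbered from 10 here).
result = normalize_clause((1, 2, 3, 4, 5), count(10))
assert result == [(1, 2, 10), (-10, 3, 11), (-11, 4, 5)]
```

A clause of length r yields at most max(4, r − 2) clauses, confirming the linear blow-up claimed above.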
Insight: The heart of SAT ≤ₚ 3-SAT is the idea of adding fresh variables to ‘thread’ a long clause into a chain of 3-clauses without changing satisfiability. You preserve the existence of a satisfying assignment, not necessarily the exact same set of satisfying assignments.
NP-complete means: (i) the problem is in NP (poly-time verifiable), and (ii) every NP problem reduces to it via ≤ₚ.
A polynomial-time many-one reduction A ≤ₚ B is a poly-time computable function f with x ∈ A ⇔ f(x) ∈ B.
To prove a new problem X is NP-hard, reduce from a known NP-complete problem A to X (A ≤ₚ X), not the other way around.
Correctness of reductions is almost always proven by two implications: YES(A) → YES(B) and YES(B) → YES(A).
SAT is NP-complete by Cook–Levin; 3-SAT is NP-complete and is a convenient hub for gadget reductions.
Reductions must be efficient: both runtime of f and the size ‖f(x)‖ must be polynomial in ‖x‖.
Once a problem is NP-complete, the usual next steps are approximation, parameterization, special cases, or practical solvers—not expecting a general exact poly-time algorithm.
Reversing the reduction direction: proving X ≤ₚ 3-SAT does not show X is NP-hard; you need 3-SAT ≤ₚ X (or another known NP-complete source).
Forgetting to prove X ∈ NP (verifiability) and only proving NP-hardness, which gives NP-hard but not NP-complete.
Using a reduction that implicitly solves the source problem (e.g., ‘pick a satisfying assignment’ during construction), which violates the polynomial-time construction requirement.
Not proving both directions of correctness (→ and ←), leaving a gap where the constructed instance might introduce or lose solutions.
Show that 3-SAT ≤ₚ INDEPENDENT SET by composing known reductions, and state the resulting k parameter in the INDEPENDENT SET instance.
Hint: Use the worked example 3-SAT ≤ₚ CLIQUE and the fact that CLIQUE in G corresponds to INDEPENDENT SET in the complement graph G̅ with the same k.
From the worked example, build (G, k) from φ with k = m (number of clauses) such that φ satisfiable ⇔ G has a clique of size m.
Now map (G, m) to (G̅, m).
Because S is a clique in G ⇔ S is an independent set in G̅, we get:
φ satisfiable ⇔ G̅ has an independent set of size m.
Thus 3-SAT ≤ₚ INDEPENDENT SET with k = m.
Prove that VERTEX COVER ∈ NP by explicitly describing a certificate and a polynomial-time verifier. Give a runtime bound in terms of |V| and |E|.
Hint: The certificate can be the set S of chosen vertices. The verifier checks every edge.
Certificate: a list (or bitmask) indicating a set S ⊆ V with |S| ≤ k.
Verifier V(G, k, S):
1) Check |S| ≤ k.
2) For each edge (u, v) ∈ E, check whether u ∈ S or v ∈ S; if any edge fails, reject.
If all edges pass, accept.
Runtime: step (1) is O(|V|) or O(|S|); step (2) checks each edge once, O(|E|), with O(1) membership tests if using a bitmask/boolean array. Total O(|V| + |E|), polynomial.
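The verifier described above is a few lines of code (a sketch; the function name and graph encoding are illustrative):

```python
def verify_vertex_cover(vertices, edges, k, certificate):
    """Polynomial-time verifier for VERTEX COVER.

    certificate: the claimed cover S. Accept iff |S| <= k and
    every edge has at least one endpoint in S.
    """
    S = set(certificate)
    if len(S) > k:                      # step (1): O(|S|) size check
        return False
    return all(u in S or v in S         # step (2): O(|E|) edge scan,
               for (u, v) in edges)     # O(1) membership tests on the set

# Path 1-2-3: {2} covers both edges; {1} misses the edge (2, 3).
assert verify_vertex_cover([1, 2, 3], [(1, 2), (2, 3)], 1, [2]) is True
assert verify_vertex_cover([1, 2, 3], [(1, 2), (2, 3)], 1, [1]) is False
```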
Let A, B, C be decision problems. Suppose A ≤ₚ B and B ≤ₚ C. Prove that A ≤ₚ C (transitivity) by explicitly constructing the reduction function and showing it is polynomial-time.
Hint: Compose the two reduction functions: f from A to B and g from B to C. Use that poly(poly(n)) is poly(n).
Given A ≤ₚ B via f and B ≤ₚ C via g.
Define h(x) = g(f(x)).
Correctness:
x ∈ A ⇔ f(x) ∈ B (definition of f)
⇔ g(f(x)) ∈ C (definition of g)
⇔ h(x) ∈ C.
Polynomial time:
If f runs in time p(‖x‖) and outputs size ‖f(x)‖ ≤ q(‖x‖), and g runs in time r(‖y‖) on input y, then g(f(x)) runs in time r(‖f(x)‖) ≤ r(q(‖x‖)).
Total time = p(‖x‖) + r(q(‖x‖)), which is polynomial in ‖x‖ because compositions/sums of polynomials are polynomials. Hence A ≤ₚ C.