Space complexity, circuit complexity, derandomization.
Quick unlock: significant prerequisite investment, but a simple final step. Verify prerequisites first.
Why do some problems resist every clever optimization we try? Complexity theory answers this by changing the question from “what’s the fastest algorithm?” to “what are the fundamental limits of computation under a resource budget?” In this node you’ll treat space as a first-class resource, study non-uniform circuits as an alternative computation model, and see how pseudorandomness enables derandomization—replacing random bits with structure without losing power.
Computational complexity theory studies what can be computed under resource bounds. This lesson focuses on (1) space complexity and landmark results like Savitch’s theorem and Immerman–Szelepcsényi, (2) circuit complexity via families {Cₙ} with size/depth as intrinsic measures (P/poly, AC⁰, NC¹), and (3) derandomization via pseudorandom generators PRG: {0,1}ˢ → {0,1}ᵐ that “fool” a class so randomness can be eliminated or reduced.
Computational complexity theory is the study of computation under constraints. Instead of asking “is problem X solvable?”, we ask: can it be solved within a given budget of time, memory, or randomness?
The central idea is that resources are currencies. Different models of computation charge you in different currencies, but the goal is to understand which costs are unavoidable.
If you only measure time, you can miss important structure: two problems with identical time complexity can differ sharply in how much memory, parallelism, or randomness they require.
Complexity theory builds a “map” of classes—sets of languages (decision problems) solvable within given resource bounds—and studies containments, separations, and equivalences.
This node emphasizes three atomic concepts:
Space counts the number of work tape cells a Turing machine uses, as a function of input length n. We write classes like SPACE(f(n)) (and special cases like L, NL, PSPACE).
Instead of a single algorithm, we consider a family of Boolean circuits {Cₙ}, one circuit per input length n. Complexity is measured via circuit size (number of gates) and depth (longest path). This leads to classes like P/poly, AC⁰, NC¹.
Randomness can be treated as a resource (BPP, RP, ZPP). A pseudorandom generator (PRG) is a function
PRG: {0,1}ˢ → {0,1}ᵐ
that stretches a short seed into a longer string that looks random to some computationally bounded observer. If a PRG fools the class that a randomized algorithm uses, we can replace randomness with a loop over seeds, turning randomized computation into deterministic computation.
All three pillars ask variations of the same question:
If I restrict computation (memory, circuit resources, randomness), what power remains—and what tradeoffs exist between resources?
Space can sometimes simulate time (and vice versa) but with nontrivial overhead. Circuits capture non-uniform “advice-like” computation. PRGs connect hardness (things that are difficult for circuits) to randomness elimination.
This lesson will move slowly through these ideas, with the goal of building intuition and a working technical toolkit.
Space is not just “time but smaller.” Memory is a bottleneck in real systems, but more importantly, space has unique mathematical properties: it can be reused during a computation, and a space bound tightly limits the number of reachable configurations.
A deterministic Turing machine M decides a language L in space f(n) if on every input x of length n it uses at most O(f(n)) work tape cells. The input tape is read-only and does not count (standard convention).
Special named classes: L = SPACE(log n), NL = NSPACE(log n), and PSPACE = ⋃ₖ SPACE(nᵏ).
A key detail: for f(n) < log n, you can’t even index the input properly, so most interesting classes start at log n.
Fix an input x of length n. Suppose M runs in space s(n). A configuration encodes: the control state, the positions of the input and work heads, and the contents of the work tape.
So the number of configurations is at most:
|Conf(M, x)| ≤ 2^{O(s(n))} · poly(n)
Often we simplify to:
|Conf(M, x)| ≤ 2^{O(s(n))}
because poly(n) is dominated by 2^{O(s(n))} once s(n) ≥ log n.
Now build a directed graph Gₓ where nodes are configurations, and edges represent one valid transition of M. Then: M accepts x if and only if an accepting configuration is reachable from the start configuration in Gₓ.
This reframes space-bounded computation as graph reachability in an exponentially large (but implicitly defined) graph.
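The reframing above can be sketched concretely. Below, a made-up two-state machine (invented for this illustration, not the text's construction) has configurations represented as tuples (state, head, tape), and acceptance is decided as reachability by BFS over the implicit configuration graph.

```python
from collections import deque

def step(conf):
    """One transition of a made-up machine: flip the current work-tape
    cell, move right, and accept once the head falls off the tape."""
    state, head, tape = conf
    if state == "accept":
        return None  # halted
    tape = tape[:head] + (1 - tape[head],) + tape[head + 1:]
    head += 1
    if head == len(tape):
        return ("accept", head, tape)
    return ("run", head, tape)

def accepts(start):
    """BFS over the (implicit) configuration graph: accept iff an
    accepting configuration is reachable from the start configuration."""
    seen, queue = {start}, deque([start])
    while queue:
        conf = queue.popleft()
        if conf[0] == "accept":
            return True
        nxt = step(conf)
        if nxt is not None and nxt not in seen:
            seen.add(nxt)
            queue.append(nxt)
    return False

print(accepts(("run", 0, (0, 0, 0))))  # → True
```

The point of the sketch: acceptance never mentions the machine's semantics directly, only graph reachability over configurations.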
Savitch’s theorem is one of the crown jewels of space complexity:
For s(n) ≥ log n,
NSPACE(s(n)) ⊆ SPACE(s(n)²).
Meaning: nondeterminism does not buy you exponential power in space; it costs only a quadratic blowup.
In the configuration graph, nondeterministic acceptance is reachability. Reachability can be solved by a recursive “meet-in-the-middle” strategy that uses space to store only the current recursion frame (two endpoint configurations and a path-length bound), never an entire path.
Define a predicate:
REACH(u, v, t) = “is there a path from u to v of length ≤ t?”
We want REACH(start, accept, T) where T is the number of configurations (so any path can be assumed to have length ≤ T).
Use the recurrence:
REACH(u, v, t) is true iff ∃ midpoint m such that
REACH(u, m, ⌈t/2⌉) ∧ REACH(m, v, ⌈t/2⌉)
(ceilings rather than floors, so the two halves cover any path of length ≤ t).
This is a nondeterministic-looking statement, but we can implement it deterministically by looping over all possible midpoints m (all configurations). The recursion depth is O(log t) = O(s(n)). Each level stores O(s(n)) bits for u, v, t. Total space:
O(s(n)) (per level) × O(log t) (levels)
Since t ≤ 2^{O(s(n))}, we have log t = O(s(n)). Thus space is O(s(n)²).
So:
NSPACE(s) ⊆ SPACE(s²).
This is profoundly different from time complexity, where we do not know anything like NTIME(t) ⊆ TIME(t²) in general.
Another striking result:
NL = coNL.
So if a logspace nondeterministic machine can decide a language, it can decide its complement too.
In NP, we do not know whether NP = coNP. The fact that NL does equal coNL is a reminder that space behaves differently than time.
NL-completeness of s-t reachability (is there a path from s to t in a directed graph) suggests that complement asks “no path exists,” which sounds global.
Immerman–Szelepcsényi shows that in logspace nondeterminism, you can count (modestly) the number of reachable nodes layer by layer, and verify that t is not among them, using only O(log n) space for counters.
The key trick is that nondeterminism can be used to certify counts by guessing a set and verifying it consistently, without storing the whole set.
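To make the counted quantities concrete, here is a deterministic illustration of the layer counts cᵢ = |{v : reachable from s within i steps}| that the Immerman–Szelepcsényi proof certifies nondeterministically. The actual logspace protocol never stores the reachable set; this open computation of it is only for intuition.

```python
def layer_counts(n, edges, s):
    """Compute c_i = number of vertices reachable from s within i steps,
    for i = 0 .. n-1, by expanding one edge-step per round."""
    reach = {s}
    counts = [1]  # c_0 = |{s}| = 1
    for _ in range(n - 1):
        reach = reach | {v for (u, v) in edges if u in reach}
        counts.append(len(reach))
    return counts, reach

edges = {(0, 1), (1, 2), (0, 3)}
counts, reach = layer_counts(4, edges, 0)
print(counts)      # → [1, 3, 4, 4]
print(2 in reach)  # is the target t = 2 among the reachable nodes? → True
```

In the real proof, knowing the exact count c_{n-1} is what lets a nondeterministic verifier confirm that a target t is *not* reachable: guess all c_{n-1} reachable nodes one at a time and check t never appears.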
PSPACE includes many “game-like” and “quantified” problems (QBF is PSPACE-complete). You can view PSPACE as problems solvable with polynomial memory—even if time might be exponential.
A useful containment chain (each containment believed strict; at least one must be, since P ≠ EXPTIME by the time hierarchy theorem):
P ⊆ NP ⊆ PSPACE ⊆ EXPTIME
Two meta-intuitions: reusing memory lets polynomial space explore exponentially many possibilities, and a small-space machine has a small configuration graph, so its behavior reduces to reachability.
| Resource | Typical class | Canonical complete problem | Intuition |
|---|---|---|---|
| log space | L / NL | s-t reachability (NL) | tiny memory, streaming-like |
| poly space | PSPACE | QBF | explore exponential possibilities with reuse |
| poly time | P | (varies) | efficient sequential computation |
Space complexity provides tools (configuration graphs, reachability, recursion) that will reappear when we talk about circuits and derandomization.
A Turing machine is uniform: one finite description handles all input lengths. Circuits are non-uniform: you can have a different circuit for each n.
Why is this useful? Non-uniformity decouples computation from any single algorithm: a different, possibly heavily optimized, circuit may exist for each input length, and lower bounds become combinatorial statements about finite objects.
A Boolean circuit Cₙ takes n input bits and outputs 0/1. A circuit family {Cₙ} decides a language L if for all x ∈ {0,1}ⁿ:
Cₙ(x) = 1 ⇔ x ∈ L.
Non-uniformity means there may be no efficient algorithm that, given n, outputs Cₙ. The family can be arbitrary as long as each Cₙ is finite.
To control this, we often study polynomial-size families: P/poly is the class of languages decided by some family {Cₙ} with size(Cₙ) ≤ poly(n).
Equivalently (important viewpoint): P/poly = poly-time computation with polynomial advice. Advice is a string aₙ of length poly(n) that depends only on n, not on the particular input x.
Two primary complexity measures for circuits: size (the number of gates) and depth (the longest input-to-output path).
Depth corresponds to parallel time: if all gates at a level compute simultaneously, depth is time steps.
This yields classes like AC⁰ (constant depth, unbounded fan-in AND/OR gates, polynomial size) and NC¹ (O(log n) depth, fan-in-2 gates, polynomial size).
A central theme: depth is powerful. Allowing depth to grow from O(1) to O(log n) dramatically increases what you can compute.
| Model | Strength | Weakness |
|---|---|---|
| Turing machine (uniform) | single algorithm, realistic | harder to prove lower bounds |
| Circuits (non-uniform) | captures per-n optimization; lower bounds are concrete | may encode “magic” per input length |
Non-uniformity can be surprisingly strong. For instance, the Karp–Lipton theorem shows that if NP ⊆ P/poly then the polynomial hierarchy collapses, which is considered highly unlikely. So proving NP ⊄ P/poly is a prominent goal.
AC⁰ circuits are extremely shallow. They can compute many basic operations, but they cannot compute some simple global properties efficiently.
Classic results (you don’t need proofs here, but you should know the statements): PARITY ∉ AC⁰ (Ajtai; Furst–Saxe–Sipser; Håstad), and MAJORITY ∉ AC⁰ as well.
These theorems are examples of circuit lower bounds, among the few strong lower bounds we can prove unconditionally.
Consider computing XOR of n bits. A balanced tree of fan-in-2 gates computes it in depth O(log n), placing parity in NC¹, yet any constant-depth AC⁰ circuit for parity requires exponential size.
This highlights that depth restrictions create qualitative changes.
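A quick sketch of the NC¹ side of this contrast: a balanced tree of fan-in-2 XOR gates computes parity at depth ⌈log₂ n⌉.

```python
def xor_tree(bits):
    """Evaluate parity with a balanced tree of 2-input XOR gates,
    returning (value, depth), where depth is the longest gate path."""
    if len(bits) == 1:
        return bits[0], 0
    mid = len(bits) // 2
    left, dl = xor_tree(bits[:mid])
    right, dr = xor_tree(bits[mid:])
    return left ^ right, 1 + max(dl, dr)

value, depth = xor_tree([1, 0, 1, 1, 0, 1, 0, 0])
print(value, depth)  # parity 0 at depth 3 = log2(8): logarithmic, not constant, depth
```

Collapsing this O(log n)-depth tree to constant depth is exactly what the AC⁰ lower bounds say cannot be done with polynomial size.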
Circuits and space are not identical currencies, but there are translations: a machine running in time t(n) can be simulated by circuits of size O(t(n) log t(n)), and depth tracks space in the sense that NC¹ ⊆ L ⊆ NL ⊆ NC².
One useful heuristic: circuit size behaves like sequential time, while circuit depth behaves like parallel time and correlates with space.
This is why circuits sit in the middle of this node: they connect space and pseudorandomness.
Randomness is a resource. Many algorithms use random bits to simplify design or speed up computation (e.g., randomized primality testing historically, hashing, sampling).
Complexity theory asks: is randomness ever essential, or can every efficient randomized algorithm be simulated deterministically with only modest overhead?
Derandomization is the program of removing or reducing randomness.
A core belief in complexity theory is:
BPP = P
But this is unproven.
A pseudorandom generator is a deterministic function
PRG: {0,1}ˢ → {0,1}ᵐ
where s ≪ m (seed shorter than output). If you pick a uniform seed z ∈ {0,1}ˢ, the output PRG(z) should “look uniform” to a restricted observer.
The observer is typically a circuit or algorithm from some class 𝒞. Then PRG is said to fool 𝒞 if no member of 𝒞 can distinguish PRG(z) from a truly uniform m-bit string.
Formally, for a class 𝒞 of distinguishers D: {0,1}ᵐ → {0,1}, PRG ε-fools 𝒞 if for all D ∈ 𝒞:
| Pr[D(Uₘ)=1] − Pr[D(PRG(Uₛ))=1] | ≤ ε
where Uₖ denotes the uniform distribution over {0,1}ᵏ.
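The definition can be exercised on a deliberately bad toy generator (invented here): one that just repeats its seed. A distinguisher that checks whether the two halves agree achieves large advantage, so this generator fools almost nothing.

```python
from itertools import product

s, m = 3, 6

def bad_prg(seed):
    """Toy "PRG": repeat the seed, stretching s = 3 bits to m = 6."""
    return seed + seed

def D(x):
    """Distinguisher: output 1 iff the two halves of x agree."""
    return 1 if x[:m // 2] == x[m // 2:] else 0

# Pr[D(U_m) = 1]: exact average over all 2^m strings
p_uniform = sum(D(x) for x in product((0, 1), repeat=m)) / 2 ** m
# Pr[D(PRG(U_s)) = 1]: exact average over all 2^s seeds
p_prg = sum(D(bad_prg(z)) for z in product((0, 1), repeat=s)) / 2 ** s

advantage = abs(p_uniform - p_prg)
print(advantage)  # → 0.875, a huge distinguishing advantage
```

Here p_uniform = 2⁻³ = 0.125 while p_prg = 1, so ε would need to be at least 0.875: the structure that made the output easy to generate is exactly what the distinguisher detects.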
Suppose you have a randomized algorithm A(x; r) that runs in poly(n) time and uses m(n) random bits r.
Assume we have a PRG with seed length s(n) that fools the class of computations that A performs on its random tape.
Then we can define a deterministic algorithm: on input x, run A(x; G(z)) for every seed z ∈ {0,1}ˢ and output the majority answer.
Runtime blowup: 2ˢ times poly(n).
So if s(n) = O(log n), then 2ˢ = poly(n), and we get a polynomial-time deterministic simulation. In other words:
If we can build PRGs with logarithmic seed that fool BPP computations, then BPP ⊆ P.
This is the core “replace randomness with structure” story.
PRGs are not magic; they are usually built from hard functions.
The celebrated intuition (made rigorous in many frameworks):
If there exists a function that is hard for small circuits, then we can construct PRGs that fool small circuits.
This is called a hardness vs randomness connection.
At a high level: the generator’s output bits are evaluations of the hard function on carefully overlapped subsets of the seed (as in the Nisan–Wigderson construction), and any small circuit that distinguished the output from uniform could be converted into a small circuit computing the hard function, contradicting its hardness.
This is why circuit lower bounds are so valuable: even modest lower bounds can imply derandomization results.
There are PRGs tailored to space-bounded computation (e.g., Nisan’s generator). The key phenomenon: these generators are unconditional, requiring no unproven hardness assumption, because space-bounded distinguishers are weak enough to be fooled directly.
One common parameter form you should remember: Nisan’s generator fools logspace computations using a seed of length O(log² n).
This is conceptually powerful: small-space algorithms can often be derandomized more directly than general polynomial-time algorithms.
Derandomization usually trades randomness for one (or more) of: time (enumerating seeds), hardness assumptions (needed to build PRGs), or non-uniformity (advice).
The main lesson: randomness is a resource that can sometimes be “compressed” into a short seed—if your computational model is too weak to notice the difference.
This node’s three topics reinforce each other: configuration graphs turn space bounds into reachability, circuit lower bounds supply the hardness that PRGs need, and space-bounded models admit unconditional PRGs.
A useful mental model is a triangle: space, circuits, and randomness at the corners, with configuration graphs, hardness-vs-randomness, and space-bounded generators along the edges.
The NL-complete problem s-t reachability (directed graph reachability) sits at the intersection of these ideas.
Even if you don’t go deep into specific constructions, the pattern matters:
Space-bounded computation has a small state space ⇒ it is more susceptible to pseudorandom simulation.
P/poly is important as a “ceiling” for many uniform classes: P ⊆ P/poly, and BPP ⊆ P/poly (Adleman’s theorem).
That second containment is worth pausing on. Here is the intuition:
A BPP machine succeeds with probability ≥ 2/3 over random strings r. First amplify by majority-voting over independent runs so the error on each input drops below 2⁻ⁿ. Then, for each input length n, a union bound over all 2ⁿ inputs shows that some single random string is correct on every input of that length. Storing one such string as the advice aₙ yields the non-uniform simulation.
So even without full derandomization (BPP = P), randomness often collapses under non-uniformity.
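The union-bound step can be checked by brute force on a toy example. The predicate correct(x, r) below is made up so that each input has exactly one bad random string; the enumeration then finds strings that are good for every input simultaneously, i.e., valid advice.

```python
from itertools import product

n, m = 3, 4  # toy input length and randomness length

def correct(x, r):
    """Made-up "amplified" predicate: exactly one bad random string per
    input x, mimicking a post-amplification error rate of 2^-m."""
    return r != x + (0,)

# Advice candidates: strings correct on ALL 2^n inputs of length n.
good_rs = [r for r in product((0, 1), repeat=m)
           if all(correct(x, r) for x in product((0, 1), repeat=n))]
print(len(good_rs))  # → 8: half of all 16 strings work as advice
```

With 2ⁿ = 8 inputs and one bad string each, at most 8 of the 16 strings can be ruled out, so good advice must exist; that is the union bound in miniature.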
Many landmark conditional results have the form:
If ∃ explicit f with circuit complexity ≥ 2^{Ω(n)} (or strong enough lower bound)
⇒ strong PRGs exist
⇒ BPP = P (or other derandomization conclusions)
Why this matters pedagogically: the chain splits an intractable goal into reusable steps, so progress on explicit circuit lower bounds translates directly into derandomization results.
You’re aiming for these capabilities: stating and applying Savitch’s theorem via configuration graphs, moving between the circuit and advice views of P/poly, and running the seed-enumeration argument that turns a PRG into a deterministic simulation.
These are foundational tools for advanced nodes on interactive proofs, hardness vs randomness in detail, and fine-grained complexity.
Let M be a nondeterministic Turing machine using space s(n) on inputs of length n (with s(n) ≥ log n). Fix an input x. Consider the configuration graph Gₓ of M on x. Show how a deterministic algorithm can decide whether M accepts x using O(s(n)²) space.
Model nondeterminism as reachability: M accepts x iff an accepting configuration is reachable from the start configuration in Gₓ.
Bound the number of configurations:
A configuration stores: the control state (O(1) bits), the head positions (O(log n) bits, using s(n) ≥ log n), and the work tape contents (O(s(n)) bits).
So total bits = O(s(n)).
Thus the number of configurations N satisfies:
N ≤ 2^{O(s(n))}.
Reduce to bounded-length reachability:
Any accepting path can be taken to have length ≤ N (if a path repeats a configuration, remove the cycle).
So it suffices to decide REACH(c_start, c_acc, N).
Define the recursive predicate REACH(u, v, t):
REACH(u, v, t) = “there exists a path from u to v of length ≤ t”.
Base cases: REACH(u, v, 0) holds iff u = v, and REACH(u, v, 1) holds iff u = v or u → v is a single transition of M.
Recursive step via midpoint:
For t > 1, compute:
REACH(u, v, t) = OR over all midpoints m of
REACH(u, m, ⌈t/2⌉) AND REACH(m, v, ⌈t/2⌉) (the ceilings ensure the two halves cover any path of length ≤ t).
Implement this deterministically by looping m over all configurations.
Space accounting (why it’s s²):
At any recursion level, store: the current triple (u, v, t) and the midpoint counter m.
So per level: O(s + log t).
Depth of recursion: O(log t).
Since t ≤ N ≤ 2^{O(s)}, we have:
log t ≤ O(s).
Therefore:
Per level space = O(s)
Number of levels = O(s)
Total space = O(s · s) = O(s²).
Insight: Savitch’s theorem is really a statement about reachability in an implicitly defined graph with 2^{O(s)} nodes: you can solve reachability with only O(log N) recursion depth and O(log N) bits per stored node, leading to O((log N)²) = O(s²) space.
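The midpoint recursion above can be sketched on a small explicit graph. The theorem applies it to the implicit configuration graph; the space cost is the recursion stack, one (u, v, t) frame plus a midpoint counter per level.

```python
def reach(edges, nodes, u, v, t):
    """Savitch-style test: is there a path from u to v of length <= t?
    Space usage is the recursion stack: O(log t) frames."""
    if t == 0:
        return u == v
    if t == 1:
        return u == v or (u, v) in edges
    half = (t + 1) // 2  # ceil(t/2), so the two halves cover length <= t
    return any(reach(edges, nodes, u, m, half) and
               reach(edges, nodes, m, v, half)
               for m in nodes)  # deterministic loop over all midpoints

nodes = range(5)
edges = {(0, 1), (1, 2), (2, 3), (3, 4)}
print(reach(edges, nodes, 0, 4, len(nodes)))  # → True  (path 0→1→2→3→4)
print(reach(edges, nodes, 4, 0, len(nodes)))  # → False (no reverse edges)
```

Note the tradeoff the proof exploits: time is re-spent lavishly (midpoints are re-explored at every level) precisely so that space stays at one frame per recursion level.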
You have a randomized algorithm A(x; r) that runs in time poly(n) and uses m(n) random bits r ∈ {0,1}ᵐ. Assume there is a PRG G: {0,1}ˢ → {0,1}ᵐ that ε-fools the class of distinguishers induced by A’s acceptance predicate, with ε ≤ 1/10. Show how to build a deterministic algorithm and bound its runtime when s(n) = O(log n).
Interpret A as a boolean function of randomness:
For fixed input x, define fₓ(r) = 1 if A(x; r) accepts, else 0.
Then A’s acceptance probability is Pr[fₓ(Uₘ)=1].
Use the PRG guarantee:
Since G ε-fools the relevant class,
| Pr[fₓ(Uₘ)=1] − Pr[fₓ(G(Uₛ))=1] | ≤ ε.
So the acceptance probability under pseudorandom bits is close to acceptance under truly random bits.
Define a deterministic simulator D:
On input x:
1) For every seed z ∈ {0,1}ˢ: compute r = G(z), run A(x; r), and record the answer.
2) Output the majority value over all seeds.
Why majority works:
Assume A is BPP with error ≤ 1/3, so either Pr[fₓ(Uₘ)=1] ≥ 2/3 (if x ∈ L) or Pr[fₓ(Uₘ)=1] ≤ 1/3 (if x ∉ L).
Under G(Uₛ), these probabilities shift by at most ε ≤ 1/10, leaving them above 2/3 − 1/10 > 1/2 or below 1/3 + 1/10 < 1/2 respectively.
So the majority over seeds matches the correct answer.
Runtime bound:
There are 2ˢ seeds.
Each seed evaluation runs A plus G in poly(n) time.
So total time = 2ˢ · poly(n).
If s(n) = O(log n), then 2ˢ = poly(n), hence total time remains poly(n).
Insight: Derandomization often reduces to one move: replace a random choice over {0,1}ᵐ with a pseudorandom choice generated from {0,1}ˢ, then enumerate all seeds. The entire challenge is constructing G with small s that fools the computation you care about.
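A minimal sketch of the simulator D, with a made-up stand-in for A. For illustration only, the "generator" is the identity on m bits (seed length s = m), i.e., the trivial brute-force simulation; a real PRG with s = O(log n) would shrink the loop from 2ᵐ to poly(n) iterations.

```python
from itertools import product

m = 6  # number of random bits the stand-in algorithm uses

def A_accepts(x, r):
    """Made-up randomized algorithm: decides whether x is nonzero,
    erring only when the random string is all-ones."""
    if r == (1,) * m:
        return x == 0  # the one random string on which A answers wrongly
    return x != 0

def derandomized(x):
    """Enumerate every "seed", run A on the generated bits, take majority."""
    votes = sum(A_accepts(x, r) for r in product((0, 1), repeat=m))
    return votes * 2 > 2 ** m

print(derandomized(7), derandomized(0))  # → True False
```

The structure is exactly the text's: a loop over seeds, one run of A per seed, and a majority vote; only the generator inside the loop changes between brute force and genuine derandomization.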
Show the equivalence between: (1) L ∈ P/poly (polynomial-size circuit family), and (2) L is decidable by a polynomial-time Turing machine with polynomial-length advice strings aₙ depending only on n.
Direction (circuits ⇒ advice):
Assume L has circuits {Cₙ} of size ≤ nᵏ.
Define advice aₙ to be an encoding of Cₙ (gate list and wiring) of length poly(n).
On input x of length n, the TM reads the advice aₙ, decodes the circuit Cₙ, evaluates it gate by gate on x in topological order, and outputs the result; this takes polynomial time.
Direction (advice ⇒ circuits):
Assume there is a poly-time TM M and advice strings aₙ with |aₙ| ≤ nᵏ such that M(x, aₙ) decides L.
For each n, build a circuit Cₙ that hardwires aₙ as constants and simulates M’s computation on n-bit inputs.
Since M runs in poly(n) time, the simulation yields a circuit of poly(n) size.
Thus {Cₙ} is a polynomial-size circuit family deciding L.
Insight: Non-uniformity is “information per input length.” P/poly is exactly polynomial-time computation plus a polynomial-sized hint that depends only on n. This framing is crucial when comparing uniform derandomization (BPP vs P) to non-uniform collapses (BPP ⊆ P/poly).
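The advice-to-decision step can be sketched as a gate-list evaluator. The encoding format below is invented for illustration: each gate names its type and the indices of earlier wires, so left-to-right order is already topological.

```python
def eval_circuit(n_inputs, gates, x):
    """wires[0..n_inputs-1] hold the input bits; each gate appends one
    wire; the last wire is the circuit's output."""
    wires = list(x)
    for op, *args in gates:
        if op == "NOT":
            wires.append(1 - wires[args[0]])
        elif op == "AND":
            wires.append(wires[args[0]] & wires[args[1]])
        elif op == "OR":
            wires.append(wires[args[0]] | wires[args[1]])
    return wires[-1]

# XOR of two bits as (x0 AND NOT x1) OR (NOT x0 AND x1):
xor_gates = [("NOT", 0), ("NOT", 1),        # wires 2, 3
             ("AND", 0, 3), ("AND", 2, 1),  # wires 4, 5
             ("OR", 4, 5)]                  # wire 6 (output)

print([eval_circuit(2, xor_gates, (a, b)) for a in (0, 1) for b in (0, 1)])
# → [0, 1, 1, 0]
```

A gate list like xor_gates is exactly the kind of object the advice string aₙ encodes: evaluation is mechanical, so all the non-uniform power lives in the choice of the list.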
Space is a distinct computational resource with powerful structural tools (configuration graphs, reachability) that enable theorems unlike those in time complexity.
For s(n) ≥ log n, NSPACE(s(n)) ⊆ SPACE(s(n)²) (Savitch): nondeterministic space can be simulated deterministically with only quadratic space overhead.
NL = coNL (Immerman–Szelepcsényi): logspace nondeterminism is closed under complement, contrasting with the open NP vs coNP question.
Circuit families {Cₙ} provide a non-uniform computation model; size and depth measure intrinsic complexity (hardware cost and parallel time).
P/poly can be understood as polynomial-size circuits or equivalently as polynomial-time algorithms with polynomial advice aₙ.
A PRG G: {0,1}ˢ → {0,1}ᵐ fools a class if bounded distinguishers cannot tell G(Uₛ) from Uₘ; small seed length enables derandomization by enumerating seeds.
Derandomization is a resource tradeoff: you typically exchange randomness for time (seed enumeration) plus either structure (PRGs) or assumptions (hardness).
Hardness vs randomness links circuit lower bounds to PRGs and derandomization, explaining why progress on either side can unlock the other.
Confusing time and space: polynomial time implies polynomial space (a machine cannot use more cells than steps), but the converse fails; PSPACE problems may require exponential time.
Forgetting that circuit families are non-uniform: assuming there must be an efficient procedure to generate Cₙ from n (uniformity is an extra condition).
Assuming “PRG exists” automatically implies P = BPP: the PRG must fool the right class with sufficiently small seed length (typically O(log n)) and small error.
Treating “BPP ⊆ P/poly” as equivalent to “BPP = P”: non-uniform containments can hold even when uniform derandomization is unknown.
Let M be a deterministic TM that uses space s(n) on inputs of length n, with s(n) ≥ log n. Argue that on any fixed input x, either M halts within at most 2^{O(s(n))} steps or it loops forever.
Hint: Count the number of distinct configurations possible under space s(n). What happens if a configuration repeats?
A configuration is determined by (state, head positions, work tape contents). With space s(n), the total description length is O(s(n)), so the number of configurations is N ≤ 2^{O(s(n))}. If M runs longer than N steps without halting, by the pigeonhole principle some configuration repeats. Since M is deterministic, repeating a configuration implies the future evolution repeats as well, so M loops forever. Therefore if it halts, it must do so within at most N = 2^{O(s(n))} steps.
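The bound turns directly into a loop-detection rule for a simulator: cut off after N steps. A sketch with two made-up transition functions on N = 8 configurations:

```python
def runs_forever(step, start, n_confs):
    """step(c) -> next configuration, or None once halted. By pigeonhole,
    surviving n_confs steps means a configuration repeated, so the
    deterministic machine loops forever."""
    c = start
    for _ in range(n_confs):
        if c is None:
            return False  # halted within the step budget
        c = step(c)
    return c is not None  # still running after N steps ⇒ loops

def halting(c):
    """Made-up machine: counts down to 0, then halts."""
    return None if c == 0 else c - 1

def looping(c):
    """Made-up machine: cycles through all 8 configurations forever."""
    return (c + 1) % 8

print(runs_forever(halting, 5, 8))  # → False (halts in time)
print(runs_forever(looping, 5, 8))  # → True  (provably loops)
```

This is why space-bounded classes only contain decidable languages: the 2^{O(s(n))} step budget converts "might not halt" into a finite, checkable condition.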
Suppose a randomized algorithm A(x; r) for inputs of length n uses m(n) random bits and runs in time poly(n). You have a PRG G: {0,1}ˢ → {0,1}ᵐ that fools A’s acceptance predicate with error ε = 1/100. Show that enumerating all seeds yields a deterministic algorithm with runtime 2ˢ · poly(n). Under what condition on s(n) is this runtime polynomial?
Hint: Write the deterministic algorithm explicitly and count the number of seeds.
Deterministic algorithm: for all seeds z ∈ {0,1}ˢ, compute r = G(z), run A(x; r), and output majority. There are 2ˢ seeds, each run costs poly(n), so runtime is 2ˢ · poly(n). This is polynomial when 2ˢ ≤ poly(n), i.e., when s(n) = O(log n).
Prove that if a language L has a polynomial-size circuit family {Cₙ}, then L ∈ P/poly (advice formulation). Explicitly describe what the advice string aₙ contains and how the TM uses it.
Hint: Advice can be an encoding of the circuit’s gate list and wiring; evaluation is straightforward.
Let size(Cₙ) ≤ nᵏ. Define advice aₙ to be a canonical encoding of Cₙ: for each gate, store its type (AND/OR/NOT), its input wire indices, and identify which wires correspond to the n input bits; also identify the output wire. The encoding length is O(size(Cₙ) · log size(Cₙ)) = poly(n). A polynomial-time TM on input x of length n reads aₙ, reconstructs the circuit description, evaluates gates in topological order, and outputs the value on the output wire. Thus L is decidable in polynomial time with polynomial advice, meaning L ∈ P/poly.