Maximal sets where every vertex reaches every other.
Self-serve tutorial - low prerequisites, straightforward concepts.
In a directed graph, “being connected” is asymmetric: u might reach v without v being able to get back. Strongly Connected Components (SCCs) are the clean way to carve a directed graph into maximal “mutual reachability” regions—subgraphs where every vertex can get to every other vertex.
An SCC is a maximal set of vertices where ∀u, v in the set, u → v and v → u. SCCs partition the vertices of a directed graph. You can find them in linear time O(V + E) using Kosaraju’s or Tarjan’s algorithm, and collapsing SCCs produces a DAG (the “condensation graph”).
In undirected graphs, “connected” naturally means there’s a path between any two nodes, and that relationship is symmetric. Directed graphs break that symmetry: a link u → v lets you go from u to v, but not necessarily back.
This matters in real systems:
We often want to find regions of the graph that behave like undirected connectivity—places where movement is possible in both directions (via directed paths). SCCs are exactly that.
We’ll use the notation from the node profile:
A directed path is a sequence of vertices:
Reachability is:
Two vertices u and v are strongly connected if they can reach each other:
This is symmetric by definition. Intuitively, u and v are in a “mutual reachability loop,” possibly through many intermediate vertices.
An SCC is a maximal set C of vertices such that every pair in C is strongly connected:
“Maximal” means you can’t add any other vertex to C without breaking the property.
We’ll justify these facts carefully in the next section.
Suppose we have edges:
Then:
That “cycle vs one-way bridge” feeling is the core mental model: SCCs are the maximal “mutual reachability islands,” and edges between islands are one-way.
Partitioning is powerful: if SCCs split V into disjoint groups, you can solve many directed-graph problems by working on the smaller “component graph.” To trust that reduction, we need to know SCCs are well-defined and don’t overlap in messy ways.
Define u ~ v (“u is strongly connected to v”) iff:
We’ll show ~ is an equivalence relation, meaning it is:
Once we prove that, the equivalence classes of ~ are exactly the SCCs.
For any vertex u, u →* u via the length-0 path. So:
Therefore u ~ u.
If u ~ v, then by definition:
Swapping u and v gives:
So v ~ u.
Assume:
Then:
We want to show u ~ w, i.e. u → w and w → u.
Use transitivity of reachability:
So u ~ w.
Equivalence relations partition a set into disjoint equivalence classes.
These equivalence classes are precisely SCCs:
Now contract each SCC into a super-node. Add a directed edge between SCC A and SCC B if there exists an edge u → v in the original graph with u ∈ A and v ∈ B, A ≠ B.
Claim: the condensation graph has no directed cycles.
Proof sketch (by contradiction):
More explicitly: take vertices a ∈ C₁ and b ∈ C₂.
So a and b are strongly connected, meaning they should be in the same SCC—contradiction.
Therefore the condensation graph is a DAG.
This gives a standard workflow:
Many tasks—deadlock detection, cycle detection in directed graphs, program analysis (mutually recursive functions), dependency resolution—reduce cleanly once you have SCCs.
You already know BFS/DFS explores reachability from a starting node. But SCCs ask for mutual reachability and maximal sets. A single DFS tree doesn’t directly reveal where the “return paths” exist.
Efficient SCC algorithms exploit a key idea:
We’ll focus on two classic linear-time algorithms:
Both run in O(V + E) time.
If you collapse SCCs, you get a DAG. In a DAG, there are “sources” with no incoming edges. Kosaraju’s algorithm uses DFS finishing times to find SCCs in an order that effectively peels off those components.
A key tool is the transpose graph Gᵀ, where every edge is reversed.
SCCs are the same in G and Gᵀ (reversing all edges doesn’t change mutual reachability).
Given a directed graph G = (V, E):
1) Run DFS on G, recording vertices in order of decreasing finish time.
2) Build Gᵀ (reverse all edges).
3) Process vertices in the order from step (1) (largest finish time first). For each unvisited vertex v in Gᵀ:
The finish-time order from step (1) tends to place vertices from “upstream” SCCs (in the condensation DAG) later, so they are processed first in step (3). When you reverse edges, DFS from a component can’t leak into components that were “downstream” in the original condensation DAG, so each DFS cleanly captures exactly one SCC.
Total: O(V + E)
Tarjan discovers SCCs during a single DFS by tracking how far back in the current DFS recursion you can reach.
It maintains:
When low[u] equals index[u], u is the root of an SCC: pop from the stack until u.
Let index[u] be the time u is first discovered.
Initialize:
For each neighbor v of u:
When done exploring u:
Each edge is examined a constant number of times.
| Feature | Kosaraju | Tarjan |
|---|---|---|
| DFS passes | 2 | 1 |
| Needs transpose graph Gᵀ | Yes | No |
| Implementation complexity | Low–medium | Medium |
| Typical use | Teaching, quick implementation | Production, memory-sensitive |
| Outputs SCCs in | Reverse finish-time order (via pass 2) | Order of completion of SCC roots |
Even though SCCs are defined via u →* v (existence of paths), the algorithms never enumerate all paths. They infer SCC structure using DFS properties—this is the power of graph traversal plus the right invariants.
SCCs turn a messy directed graph into a DAG of components. DAGs are much easier to reason about: they support topological sorting, dynamic programming, and clear notions of “upstream/downstream.”
Below are common patterns where SCCs are the missing piece.
A directed graph has a directed cycle iff it has an SCC with:
Reason: in an SCC, every vertex reaches itself through others, which implies cyclic structure.
So SCCs do more than answer “is there a cycle?”—they locate the maximal cyclic regions.
Once you contract SCCs, you get a DAG. Many tasks become:
Examples:
In operating systems and databases, a wait-for graph is directed: process A → process B if A is waiting for a resource held by B.
In 2-SAT, you build an implication graph. SCCs provide the satisfiability test:
(You don’t need the full 2-SAT method here, just note SCCs are the core subroutine.)
When you solve problems with SCCs, you usually want more than just the sets. You often want:
A typical post-processing step:
Then you can run standard DAG algorithms.
SCCs are not “extra structure” beyond reachability—they are the equivalence classes of mutual reachability.
You can think of them as compressing reachability into two layers:
That mental model is the bridge between definition and algorithm: DFS is a reachability engine, and SCC algorithms are careful ways of running DFS so that these equivalence classes fall out.
Consider the directed graph with vertices {1,2,3,4,5,6} and edges:
1 → 2
2 → 3
3 → 1
3 → 4
4 → 5
5 → 4
5 → 6
Find all SCCs and the condensation DAG.
Step 1: Find obvious cycles (strong connectivity candidates).
The edges 1 → 2 → 3 → 1 form a directed cycle.
So 1 → 2, 2 → 3, 3 →* 1, and by chaining, each of {1,2,3} reaches each other.
Candidate SCC: {1,2,3}.
Step 2: Check maximality for {1,2,3}.
Vertex 4 is reachable from 3 (since 3 → 4), so {1,2,3} can reach 4.
But can 4 reach back to 1 (or 2 or 3)?
From 4 we can go 4 → 5 → 6 (and also 4 ↔ 5), but there is no edge leading back to 1,2,3.
So 4 ↛* 1.
Therefore 4 is not strongly connected with 1.
So SCC {1,2,3} is maximal as is.
Step 3: Identify SCC containing 4 and 5.
We have 4 → 5 and 5 → 4, so 4 → 5 and 5 → 4.
Thus 4 ~ 5.
Check if 6 joins them:
5 → 6, so 5 →* 6.
But 6 has no outgoing edges listed, so 6 ↛* 5.
Thus 6 is not strongly connected with 4/5.
So SCC {4,5} stands.
Step 4: Determine SCC for 6.
6 reaches itself (6 →* 6) by the length-0 path.
No other vertex is mutually reachable with 6.
So SCC {6}.
Step 5: Build condensation DAG.
Contract SCC A = {1,2,3}, SCC B = {4,5}, SCC C = {6}.
Original edges crossing SCCs:
No edges go backward.
So the condensation graph is A → B → C, which is a DAG (a simple chain).
Insight: SCCs capture “islands of cycles.” Once you compress each island into a node, the remaining structure becomes acyclic and much easier to analyze.
Let G have vertices {A,B,C,D,E,F} and edges:
A → B
B → C
C → A
B → D
D → E
E → D
E → F
Use Kosaraju’s algorithm to find SCCs (show the two passes conceptually).
Step 1: DFS on G and record finish times.
One valid DFS order: start at A.
Explore A → B → C.
From C we can go to A (already discovered), so C finishes first.
Then back to B: explore B → D → E.
From E explore E → F (F finishes), then E finishes, then D finishes, then B finishes, then A finishes.
Finish order (first finished to last finished) could be:
C, F, E, D, B, A.
So decreasing finish time order is:
A, B, D, E, F, C.
Step 2: Build transpose graph Gᵀ by reversing edges.
Edges in Gᵀ:
B → A
C → B
A → C
D → B
E → D
D → E
F → E
Step 3: DFS on Gᵀ in decreasing finish time order.
Process A first (unvisited):
From A we go to C, from C to B, from B to A.
So we reach {A,B,C} in this DFS.
That is SCC₁ = {A,B,C}.
Step 4: Continue the order.
Next B is already visited.
Next D is unvisited: DFS from D in Gᵀ reaches E (via D → E) and back to D (via E → D).
It can also reach B via D → B, but B is already assigned to SCC₁ and is visited, so it won’t be included in this new DFS tree.
So SCC₂ = {D,E}.
Step 5: Next E is visited.
Next F is unvisited: DFS from F reaches only F (since F → E but E is already visited).
So SCC₃ = {F}.
Step 6: C is visited.
We are done.
SCCs found: {A,B,C}, {D,E}, {F}.
Insight: The first pass computes an order that respects the SCC DAG. The second pass on the reversed graph ensures each DFS “fills up” exactly one SCC without spilling into others.
Let G be any directed graph. Let its SCCs be contracted to form the condensation graph G_SCC. Prove G_SCC is acyclic.
Step 1: Assume for contradiction that G_SCC contains a directed cycle:
C₁ → C₂ → … → Cₖ → C₁,
where each Cᵢ is a distinct SCC.
Step 2: Pick vertices u ∈ C₁ and v ∈ C₂.
Because there is an edge C₁ → C₂ in G_SCC, there exists an edge (or path) in G from some vertex in C₁ to some vertex in C₂.
In particular, u →* v (reachability across components along the condensation edges).
Step 3: Because the SCC cycle returns to C₁, there is also a path from C₂ back to C₁ in G_SCC.
Therefore, in G there is a directed path from v to u:
v →* u.
Step 4: Combine the two reachability statements:
u → v and v → u.
So u and v are strongly connected, meaning u ~ v.
Step 5: If u ~ v, then u and v must be in the same SCC (equivalence class).
But u ∈ C₁ and v ∈ C₂ with C₁ ≠ C₂, contradicting that SCCs are disjoint.
Step 6: Therefore, the assumption was false, and G_SCC has no directed cycles.
So the condensation graph is a DAG.
Insight: Any cycle between components would create mutual reachability across them, forcing them to merge into a single SCC. The only way SCCs stay separate is if the component graph is acyclic.
Reachability u →* v means there exists a directed path from u to v; it is reflexive and transitive but not symmetric.
Strong connectivity between u and v is mutual reachability: u → v and v → u.
An SCC is a maximal set of vertices that are pairwise strongly connected; SCCs are the equivalence classes of the relation u ~ v.
SCCs partition the vertices: every vertex belongs to exactly one SCC.
Contracting SCCs yields the condensation graph, which is always a DAG.
Kosaraju finds SCCs with two DFS passes (on G and Gᵀ) in O(V + E).
Tarjan finds SCCs in one DFS pass using a stack and low-link values, also in O(V + E).
Many directed-graph problems become simpler by solving them on the SCC DAG rather than on the original graph.
Confusing a directed cycle with an SCC: an SCC can contain many cycles and paths; the definition is mutual reachability among all vertices, not just being on one cycle.
Forgetting maximality: a set may have mutual reachability but still not be an SCC if it can be expanded by adding more mutually reachable vertices.
Assuming SCC edges can form cycles in the condensation graph: they cannot; if they did, those SCCs would actually be one SCC.
Implementing Kosaraju but processing vertices in the wrong order on the second pass (it must be decreasing finish time from the first pass).
Given vertices {1,2,3,4} and edges 1 → 2, 2 → 3, 3 → 2, 3 → 4. List the SCCs and draw the condensation DAG.
Hint: Look for mutual reachability pairs. If 2 → 3 and 3 → 2, they must be in the same SCC. Check whether 1 can be reached back from that SCC, and whether 4 can reach back.
SCCs: {1}, {2,3}, {4}.
Reason: 2 ↔ 3 via edges 2 → 3 and 3 → 2, so they are strongly connected. 1 reaches 2 but 2 cannot reach 1, so 1 is alone. 3 reaches 4 but 4 cannot reach back, so 4 is alone.
Condensation DAG: {1} → {2,3} → {4}.
Prove that if u and v are in the same SCC and v and w are in the same SCC, then u and w are in the same SCC.
Hint: Translate “same SCC” into mutual reachability (→). Use transitivity of reachability to show u → w and w →* u.
If u and v are in the same SCC, then u → v and v → u.
If v and w are in the same SCC, then v → w and w → v.
By transitivity: u → v and v → w ⇒ u →* w.
Also w → v and v → u ⇒ w →* u.
So u and w are mutually reachable and thus in the same SCC.
You are given SCC labels comp[u] for every vertex u in a directed graph G. Describe how to build the condensation graph in O(V + E), and explain why the result is a DAG.
Hint: Scan each edge u → v. If comp[u] ≠ comp[v], add an edge comp[u] → comp[v]. To argue it’s a DAG, use the fact that a cycle among components would imply mutual reachability across components.
Algorithm (O(V + E)):
This is O(V + E) because each edge is processed once.
Why it’s a DAG:
So no cycles exist; it is a DAG.
Prerequisite refreshers: Depth-First Search (DFS), Breadth-First Search (BFS), Reachability in Directed Graphs
Next-step nodes that commonly use SCCs: Topological Sorting, Directed Acyclic Graphs (DAGs), 2-SAT via Implication Graph, Deadlock Detection, Graph Condensation