Ordered binary trees. O(log n) search, insert, delete.
Self-serve tutorial - low prerequisites, straightforward concepts.
A binary tree becomes dramatically more useful the moment you add one rule: everything smaller goes left, everything larger goes right. That single ordering invariant is what turns “walk the tree” into fast search, insert, and delete—often in O(log n) time.
A Binary Search Tree (BST) is a binary tree where, for every node n, all keys in its left subtree are < key(n) and all keys in its right subtree are > key(n). This invariant lets you search by comparing the target with key(n) and choosing exactly one subtree each step. Insertion and deletion are local pointer updates that preserve the invariant; performance is O(h) where h is the tree height (≈ log₂ n when reasonably balanced, but can degrade to O(n) if the tree becomes a chain).
A plain binary tree gives you structure, but not speed: to find a value, you might have to visit many nodes because there’s no guiding rule about where values live.
A Binary Search Tree (BST) adds a global promise about ordering. With that promise, every comparison tells you which half of the remaining search space to ignore—similar in spirit to binary search on an array, but in a pointer-based tree.
Each node stores a key, written as key(node).
A binary tree is a BST if for every node n:
- every key in n's left subtree is < key(n), and
- every key in n's right subtree is > key(n).
This is the ordering invariant.
A common way to say it more formally:
For all nodes n: keys(left(n)) < key(n) < keys(right(n)), where the comparisons are vacuously true for empty subtrees.
It’s tempting to think the rule is only about the direct children (left child < parent < right child). But the BST invariant is stronger: it’s about entire subtrees.
Example: if key(n) = 10, then every key anywhere in the left subtree (left child, left-left grandchild, etc.) must be < 10.
If the invariant holds, then search, insert, and delete each follow a single root-to-leaf path, so BST operations take time proportional to the height h (the longest root-to-leaf path).
If the tree is balanced-ish, h ≈ log₂ n, giving O(log n). If it becomes skewed (like a linked list), h ≈ n, giving O(n).
So BSTs are “fast when shaped well,” and this is the main reason balanced BST variants (AVL / Red-Black) exist.
The invariant allows a decision at each node: after comparing the target key k with key(n), you know which subtree could possibly contain k.
If k < key(n), then k cannot be in the right subtree (all keys there are > key(n)).
If k > key(n), then k cannot be in the left subtree (all keys there are < key(n)).
So you follow exactly one pointer per level.
Given a node n (starting at the root) and target k:
1) If n is null, k is not in the tree.
2) If k = key(n), return n.
3) If k < key(n), continue in the left subtree.
4) If k > key(n), continue in the right subtree.
Iterative search is often preferred in practice to avoid deep recursion:
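The steps above can be sketched as a small iterative loop. This is an illustrative sketch in Python; `Node` is a minimal hypothetical helper class, not a library type:

```python
# Minimal sketch of iterative BST search.
# `Node` is a hypothetical helper class for illustration.
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def bst_search(root, k):
    """Return the node holding key k, or None if absent. Cost: O(h)."""
    n = root
    while n is not None:
        if k < n.key:
            n = n.left      # all right-subtree keys are > n.key, skip them
        elif k > n.key:
            n = n.right     # all left-subtree keys are < n.key, skip them
        else:
            return n        # k == key(n): found
    return None

# Small example tree: 8 with children 3 and 10; 3 has children 1 and 6.
root = Node(8, Node(3, Node(1), Node(6)), Node(10))
```

Note that exactly one pointer is followed per level, so the loop body runs at most h + 1 times.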
Let h be the tree height. Each comparison descends one level, so at most h + 1 nodes are visited. So search cost is O(h).
When the BST is balanced, h ≈ log₂ n, so search is O(log n).
The strict form of the invariant uses < and >, implying no duplicates.
Real implementations must decide what to do with equal keys. Common policies:
| Duplicate Policy | Invariant tweak | Resulting behavior |
|---|---|---|
| Disallow duplicates | keep strict < and > | insert may reject if key exists |
| Put equals on right | left < node ≤ right | duplicates cluster in right subtree |
| Count duplicates | store (key, count) | structure unchanged; counts tracked |
Whatever you choose, the key is consistency: the search rule must match the insertion rule, or you’ll “lose” keys.
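As one concrete (hypothetical) sketch, the "count duplicates" row of the table can be implemented by storing a count per node, leaving the tree's shape unchanged:

```python
# Sketch of the "count duplicates" policy: the structure is unchanged,
# each node carries a count. Hypothetical helper, not a library API.
class Node:
    def __init__(self, key):
        self.key = key
        self.count = 1
        self.left = None
        self.right = None

def insert(root, k):
    """Insert k; an equal key bumps the existing node's count."""
    if root is None:
        return Node(k)
    if k < root.key:
        root.left = insert(root.left, k)
    elif k > root.key:
        root.right = insert(root.right, k)
    else:
        root.count += 1     # duplicate: no structural change
    return root

root = None
for k in [5, 3, 5, 8, 5]:
    root = insert(root, k)
```

Because the search rule never routes equal keys into a subtree, lookups stay consistent with insertion automatically under this policy.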
The BST invariant implies that an in-order traversal (Left, Node, Right) visits keys in increasing order.
Why? Because at each node n:
- the left subtree (visited first) contains only keys < key(n),
- then key(n) itself is visited,
- the right subtree (visited last) contains only keys > key(n),
and the same holds recursively inside each subtree.
This property is a big reason BSTs are used for ordered sets/maps.
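A short sketch of this traversal (hypothetical `Node` class, Python for illustration):

```python
# In-order traversal (Left, Node, Right) of a BST yields sorted keys.
# Minimal hypothetical Node class for illustration.
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def inorder(n, out=None):
    """Append keys in increasing order; return the list."""
    if out is None:
        out = []
    if n is not None:
        inorder(n.left, out)    # everything smaller than n.key
        out.append(n.key)       # then n.key itself
        inorder(n.right, out)   # everything larger than n.key
    return out

# Example tree: 8 with children 3 and 10; 3 has children 1 and 6.
root = Node(8, Node(3, Node(1), Node(6)), Node(10))
print(inorder(root))  # [1, 3, 6, 8, 10]
```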
BST operations feel “global” (you’re maintaining a sorted structure), but the magic is that you can preserve the global invariant with local pointer changes.
You do not reshuffle the entire tree on every insert/delete. You:
- walk a single root-to-leaf path,
- change a small, constant number of pointers (or copy one key), and
- rely on the search path itself to guarantee all ancestor constraints still hold.
That’s the key atomic concept: local updates preserve the invariant.
To insert key k, you search for where k would be found. When the search hits a null child pointer, that null is the correct place to attach a new node.
Because you followed the BST search rule the whole way down, the final position automatically respects all ancestor constraints.
Insertion walks a root-to-leaf path: O(h).
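The attach-at-the-first-null idea can be sketched as follows (hypothetical `Node` class; duplicates rejected, matching the strict invariant):

```python
# Sketch of BST insertion: follow the search path and attach the new
# node at the first null pointer. Hypothetical Node class.
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, k):
    """Return the (possibly new) root after inserting k. Cost: O(h)."""
    if root is None:
        return Node(k)          # the null slot found by the search
    if k < root.key:
        root.left = insert(root.left, k)
    elif k > root.key:
        root.right = insert(root.right, k)
    # k == root.key: strict invariant, so the duplicate is rejected
    return root

# Build the worked-example tree by repeated insertion.
root = None
for k in [8, 3, 10, 1, 6, 4, 7]:
    root = insert(root, k)
```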
Deletion is the “hard” BST operation because removing a node can disconnect subtrees. The trick is to replace the deleted node with a nearby key that keeps the ordering invariant.
Suppose we want to delete a node n with key(n) = k. There are three cases:

Case 1: n is a leaf. Remove it by nulling its parent's pointer to it.

Case 2: n has one child. Splice n out by linking its parent directly to that child.

Case 3: n has two children. This is the interesting case. If you remove n outright, you'd have two subtrees to reconnect.
Standard solution: replace key(n) with either:
- the in-order successor of n (the smallest key in n's right subtree), or
- the in-order predecessor of n (the largest key in n's left subtree).
Then delete that successor/predecessor node (which will fall into Case 1 or Case 2).
Let s be the in-order successor of n. Then:
- key(s) > key(n), because s lives in n's right subtree, and
- key(s) is the smallest key in that subtree.
So if you replace key(n) with key(s), you maintain:
- every key in the left subtree is still smaller (key(s) > key(n) > all left keys), and
- every remaining key in the right subtree is still larger (key(s) was the minimum there).
And since s is the leftmost node in the right subtree, s has no left child (it might have a right child). That makes deleting s easy (Case 1 or Case 2).
To find the successor of node n (here n has a right child): step to n.right, then follow left pointers until you reach a node with no left child; that node is s.
Finding the successor walks down a single path, so it costs O(h) in the worst case.
Delete does:
1) an O(h) search to find the node,
2) possibly an O(h) walk to find the successor, and
3) O(1) pointer updates.
Overall: O(h).
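The three cases and the successor trick fit in one routine. This is an illustrative sketch with a hypothetical `Node` class and a recursive helper:

```python
# Sketch of BST deletion using the in-order successor method.
# Hypothetical Node class; every operation here is O(h).
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, k):
    if root is None:
        return Node(k)
    if k < root.key:
        root.left = insert(root.left, k)
    elif k > root.key:
        root.right = insert(root.right, k)
    return root

def delete(root, k):
    """Return the new subtree root after removing key k (if present)."""
    if root is None:
        return None                       # key not found
    if k < root.key:
        root.left = delete(root.left, k)
    elif k > root.key:
        root.right = delete(root.right, k)
    else:
        if root.left is None:             # Cases 1 and 2: splice out
            return root.right
        if root.right is None:
            return root.left
        s = root.right                    # Case 3: successor is the
        while s.left is not None:         # leftmost node in the
            s = s.left                    # right subtree
        root.key = s.key                  # copy the successor's key up,
        root.right = delete(root.right, s.key)  # then delete it below
    return root

root = None
for k in [10, 5, 15, 3, 7, 12, 18]:
    root = insert(root, k)
root = delete(root, 10)   # two-child root: successor is 12
```

The recursive call `delete(root.right, s.key)` is guaranteed to hit Case 1 or Case 2, since the successor has no left child.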
A common debugging tool is to verify the invariant by tracking allowable key ranges.
At node n, maintain an interval (low, high) such that low < key(n) < high must hold (initially (−∞, +∞) at the root).
When recursing:
- into the left child, tighten the upper bound to (low, key(n));
- into the right child, tighten the lower bound to (key(n), high).
This catches subtle violations that child-only checks miss.
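A compact sketch of this check, with an example tree that passes a child-only check but violates the full invariant (hypothetical `Node` class):

```python
# Range-based invariant check: pass (low, high) bounds down and verify
# low < key < high at every node. Hypothetical Node class.
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def is_bst(n, low=float("-inf"), high=float("inf")):
    """True iff the subtree rooted at n satisfies the full invariant."""
    if n is None:
        return True
    if not (low < n.key < high):
        return False
    return (is_bst(n.left, low, n.key) and    # tighten upper bound
            is_bst(n.right, n.key, high))     # tighten lower bound

# `bad` satisfies every parent-child comparison (3 < 5, 1 < 3, 6 > 3,
# 8 > 5) yet 6 sits in the left subtree of 5, breaking the invariant.
bad = Node(5, Node(3, Node(1), Node(6)), Node(8))
good = Node(5, Node(3, Node(1), Node(4)), Node(8))
```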
BSTs naturally implement:
- ordered sets and maps,
- min/max and predecessor/successor queries,
- range queries over [a, b].
Compared to hash tables: a hash table offers expected O(1) lookup but keeps no ordering, so it cannot answer ordered queries; a BST pays O(h) per lookup but supports all of the above.
Suppose you want all keys in [a, b]. A BST can do this by:
- descending into a node's left subtree only when key(n) > a,
- reporting key(n) whenever a ≤ key(n) ≤ b, and
- descending into the right subtree only when key(n) < b.
A hash table can’t do this without scanning everything.
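The pruned traversal described above can be sketched like this (hypothetical `Node` class; output comes back in sorted order for free):

```python
# Pruned range query: report all keys in [a, b] without visiting
# subtrees the bounds rule out. Hypothetical Node class.
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def range_query(n, a, b, out=None):
    """Collect keys k with a <= k <= b, in increasing order."""
    if out is None:
        out = []
    if n is None:
        return out
    if n.key > a:                  # left subtree may hold keys >= a
        range_query(n.left, a, b, out)
    if a <= n.key <= b:
        out.append(n.key)
    if n.key < b:                  # right subtree may hold keys <= b
        range_query(n.right, a, b, out)
    return out

root = Node(10, Node(5, Node(3), Node(7)), Node(15, Node(12), Node(18)))
print(range_query(root, 6, 14))  # [7, 10, 12]
```

The cost is O(h + m) where m is the number of reported keys, since only the boundary paths plus the reported nodes are visited.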
BSTs promise O(log n) only when height h ≈ log₂ n.
But if you insert already-sorted keys into a plain BST, you get a chain: each new key is larger than every existing key, so every insert goes right, producing a right-leaning list.
Then search/insert/delete degrade to O(n).
This is exactly what balanced trees solve: they add rules (and rotations) to keep h bounded by O(log n) regardless of insertion order.
Plain BSTs are still worth learning because:
- the balanced variants are plain BSTs plus extra balancing rules, and
- the invariant and its local-update operations carry over unchanged.
Start with a BST containing keys: 8 (root), left child 3, right child 10, and 3 has children 1 and 6. Node 6 has children 4 and 7. (This is a classic example BST.)
Tasks:
1) Search for k = 7.
2) Insert k = 5 using the standard BST insertion rule (no duplicates).
1) Search for k = 7

- At 8: 7 < 8 ⇒ go left.
- At 3: 7 > 3 ⇒ go right.
- At 6: 7 > 6 ⇒ go right.
- At 7: found. Path: 8 → 3 → 6 → 7.

2) Insert k = 5

- At 8: 5 < 8 ⇒ go left.
- At 3: 5 > 3 ⇒ go right.
- At 6: 5 < 6 ⇒ go left.
- At 4: 5 > 4 ⇒ go right, which is null ⇒ attach 5 as the right child of 4.

Check the invariant locally: along the path, 5 < 8, 5 > 3, 5 < 6, and 5 > 4 all hold.

So the global invariant is preserved.
Insight: Insertion doesn’t require rearranging existing nodes. The search path itself encodes all the constraints from ancestors, so attaching at the first null pointer automatically preserves the BST ordering invariant.
Use the same BST as before (including the inserted key 5). Delete the node with key 3, which has two children (1 and 6). Use the in-order successor method.
Find the node to delete: at 8, 3 < 8 ⇒ go left; the left child holds 3. Found.

Identify deletion case: node 3 has two children (1 and 6) ⇒ the two-child case.

Find in-order successor s of node 3: go to its right child (6), then follow left pointers: 6 → 4, and 4 has no left child.

So successor s has key(s) = 4.

Replace key(n) with key(s): the node that held 3 now holds 4.

Delete the successor node (original key 4): it has exactly one child, its right child 5.

So successor deletion is Case 2 (one child): splice it out by setting 6.left = 5.

Sanity check of ordering around the edited area: under the root 8, the left subtree is now 4 with children 1 and 6, and 6 has children 5 and 7. All checks hold: 1 < 4 < 6, 4 < 5 < 6 < 7, and every key there is < 8.

Invariant preserved.
Insight: Deleting a two-child node is reduced to deleting a one-child/leaf node by swapping with the in-order successor (or predecessor). The successor is chosen specifically because it sits at the boundary between “still bigger than the node” and “as small as possible,” keeping the BST invariant intact.
Consider inserting the same keys {1, 2, 3, 4, 5} into a plain BST in two different orders:
A) 1, 2, 3, 4, 5
B) 3, 1, 4, 2, 5
Compare resulting height h and search cost for key 5.
A) Insert in sorted order: 1,2,3,4,5
Result: a chain leaning right
Height h = 4 (edges) or 5 levels (nodes), depending on definition
Search for 5 visits every node: 1 → 2 → 3 → 4 → 5 ⇒ Θ(n)
B) Insert in mixed order: 3,1,4,2,5
Result: roughly balanced
Longest path: 3 → 1 → 2 or 3 → 4 → 5
Height h = 2 (edges) or 3 levels
Search for 5 visits: 3 → 4 → 5 ⇒ Θ(log n) here
Insight: BST time is O(h), not automatically O(log n). The same set of keys can produce either a fast tree or a slow chain depending on insertion order—this is the motivation for self-balancing BSTs.
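The comparison above can be reproduced with a few lines of code (hypothetical helpers; height measured in edges, with −1 for the empty tree):

```python
# The same keys give different heights depending on insertion order.
# Hypothetical Node/insert helpers, sketched for this demonstration.
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, k):
    if root is None:
        return Node(k)
    if k < root.key:
        root.left = insert(root.left, k)
    elif k > root.key:
        root.right = insert(root.right, k)
    return root

def height(n):
    """Height in edges; -1 for the empty tree."""
    if n is None:
        return -1
    return 1 + max(height(n.left), height(n.right))

def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root

print(height(build([1, 2, 3, 4, 5])))   # 4: a right-leaning chain
print(height(build([3, 1, 4, 2, 5])))   # 2: roughly balanced
```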
A BST is a binary tree with the invariant: ∀n, keys(left subtree) < key(n) < keys(right subtree).
Search follows one path determined by comparisons: k < key(n) ⇒ left, k > key(n) ⇒ right.
Search, insert, and delete are O(h), where h is the tree height; this is ≈ log₂ n only when the tree is balanced-ish.
Insertion attaches a new node at the first null pointer encountered along the search path, preserving the invariant automatically.
Deletion has three cases: leaf (remove), one child (splice), two children (swap with in-order successor/predecessor, then delete that node).
In-order traversal of a BST outputs keys in sorted order.
Handling duplicates requires a consistent policy (reject, route equals to one side, or count duplicates).
Poor insertion order can skew a BST into a chain, motivating balanced BSTs (AVL/Red-Black).
Checking only parent-child ordering (left child < node < right child) instead of enforcing the invariant over entire subtrees.
Implementing duplicates inconsistently (e.g., insertion puts equals left but search assumes equals right), causing failed lookups.
Forgetting to handle the “delete root” case carefully (updating the root pointer when the deleted node is the root).
During two-child deletion, copying the successor key but forgetting to actually delete the successor node (creating a duplicate key).
Given a BST with keys inserted in this order: 10, 5, 15, 3, 7, 12, 18. Trace the search path (list visited keys) for k = 12 and for k = 6.
Hint: At each node, compare k with key(node) and choose exactly one direction. Stop at null if not found.
For k = 12: 12 > 10 ⇒ right to 15; 12 < 15 ⇒ left to 12; found.
Path: 10 → 15 → 12
For k = 6: 6 < 10 ⇒ left to 5; 6 > 5 ⇒ right to 7; 6 < 7 ⇒ left, which is null ⇒ not found.
Path: 10 → 5 → 7 → null
Delete key 10 (the root) from the BST built by inserting: 10, 5, 15, 3, 7, 12, 18. Use the in-order successor method. What key becomes the new root key, and which node is physically removed in the final step?
Hint: The in-order successor of the root is the leftmost node in the root’s right subtree.
Root key is 10 with two children ⇒ use successor.
Successor is the smallest in right subtree (root.right is 15; leftmost under 15 is 12).
Replace root key 10 with 12. Then delete the successor node (the original node containing key 12).
That successor node is a leaf in this tree, so it is physically removed by setting 15.left = null.
New root key: 12. Physically removed node in final step: the original node with key 12.
Prove (informally) that an in-order traversal of a BST outputs keys in increasing order.
Hint: Use the invariant at a node n: all left keys < key(n) and all right keys > key(n). Then apply the same idea recursively to subtrees.
Consider any node n.
By the BST invariant, every key in the left subtree is < key(n), and every key in the right subtree is > key(n).
An in-order traversal visits: (in-order of left subtree), then n, then (in-order of right subtree).
By recursion, the left subtree’s in-order output is sorted and consists only of keys < key(n).
Similarly, the right subtree’s in-order output is sorted and consists only of keys > key(n).
Concatenating these three sequences yields a globally increasing sequence: all left keys (sorted) < key(n) < all right keys (sorted).