Tree Fundamentals
A tree is a connected, acyclic graph with a distinguished root node. Every node except the root has exactly one parent, and zero or more children. A node with no children is called a leaf.
[A] <-- root (depth 0)
/ \
[B] [C] <-- depth 1
/ \ \
[D] [E] [F] <-- depth 2 (D, E, F are leaves)
| Term | Definition |
|---|---|
| Root | The topmost node with no parent. |
| Parent / Child | A node directly above/below another in the hierarchy. |
| Leaf | A node with zero children. |
| Depth | Number of edges from the root to a given node. |
| Height | Number of edges on the longest path from a node down to a leaf. Height of the tree = height of root. |
| Degree | Number of children a node has. |
| Subtree | A node and all its descendants. |
n nodes always has exactly n - 1 edges. This invariant is useful for quick sanity checks.
BST: Search, Insert, Delete
A Binary Search Tree enforces the invariant: for every node, all values in the left subtree are smaller and all values in the right subtree are larger. This gives O(log n) average-case search, insert, and delete.
[8]
/ \
[3] [10]
/ \ \
[1] [6] [14]
/ \ /
[4] [7] [13]
In-order traversal: 1, 3, 4, 6, 7, 8, 10, 13, 14 (sorted!)
| Operation | Average | Worst (skewed) |
|---|---|---|
| Search | O(log n) | O(n) |
| Insert | O(log n) | O(n) |
| Delete | O(log n) | O(n) |
AVL Trees & Red-Black Trees
Self-balancing BSTs automatically maintain a bounded height after insertions and deletions, guaranteeing O(log n) worst-case operations.
AVL Trees
Strict balance: for every node, the heights of left and right subtrees differ by at most 1. Rebalancing uses rotations (single or double) after each modification. Lookups are slightly faster than Red-Black trees because the tree is more tightly balanced.
Red-Black Trees
Relaxed balance: nodes are colored red or black with rules that ensure the longest path is at most 2x the shortest. Fewer rotations on insert/delete compared to AVL, making writes faster. Used extensively in practice.
| Property | AVL | Red-Black |
|---|---|---|
| Balance strictness | Height diff ≤ 1 | Longest path ≤ 2x shortest |
| Max height | ~1.44 log n | ~2 log n |
| Lookup speed | Slightly faster | Slightly slower |
| Insert/Delete cost | More rotations | Fewer rotations |
| Typical use | Read-heavy workloads | General purpose |
TreeMap / TreeSet, C++ std::map, and the Linux kernel's CFS (Completely Fair Scheduler) which uses an RB-tree to schedule processes by virtual runtime.
Before: After:
[C] [B]
/ / \
[B] [A] [C]
/
[A]
Multi-Way Trees for Disk
B-Trees generalize BSTs by allowing each node to hold multiple keys and have multiple children. They are optimized for systems that read/write large blocks of data — particularly databases and file systems.
B-Tree
A B-Tree of order m allows each node up to m - 1 keys and m children. All leaves are at the same depth, guaranteeing O(log n) operations. The wide branching factor minimizes disk I/O by reducing tree height.
B+ Tree
A variant where all actual data resides in the leaves, and internal nodes store only keys for routing. Leaf nodes are linked together in a doubly-linked list, enabling efficient range scans.
| Property | B-Tree | B+ Tree |
|---|---|---|
| Data location | All nodes | Leaves only |
| Leaf linking | No | Yes (linked list) |
| Range queries | Slower (tree traversal) | Fast (scan linked leaves) |
| Used in | MongoDB (WiredTiger) | MySQL InnoDB, PostgreSQL, NTFS, ext4 |
SELECT ... WHERE x BETWEEN a AND b) sequential reads instead of random seeks. Internal nodes holding only keys means higher fanout and fewer disk pages to traverse.
Heaps & Priority Queues
A heap is a complete binary tree satisfying the heap property: in a min-heap, every parent is smaller than its children; in a max-heap, every parent is larger. Heaps are typically stored as arrays with no pointers needed.
[1]
/ \
[3] [5]
/ \ /
[7] [9] [8]
Parent of i: (i-1)/2
Left child: 2i + 1
Right child: 2i + 2
| Operation | Time | Notes |
|---|---|---|
| Insert | O(log n) | Add at end, bubble up |
| Extract min/max | O(log n) | Swap root with last, sift down |
| Peek min/max | O(1) | Always at index 0 |
| Build heap | O(n) | Bottom-up heapify (Floyd's) |
Applications
The standard implementation for priority queues. Used in Dijkstra's algorithm, task scheduling, and event-driven simulation.
O(n log n) in-place sorting. Build a max-heap, then repeatedly extract the max. Not stable, but guaranteed worst-case performance.
Maintain a min-heap of size K while streaming data. Each new element is compared to the heap root — O(n log K) total.
Specialized Trees
Trie (Prefix Tree)
A trie stores strings character-by-character, with each edge representing a character. Lookup time is O(L) where L is the length of the query string — independent of how many strings are stored.
(root)
/ \
c d
| |
a o
/ \ |
t* r* g*
|
d*
* = end of word
Walk the trie to the prefix node, then enumerate all descendants. Powers search suggestions in IDEs and search engines.
A compressed trie (radix tree) that stores IP prefixes. Routers use it for longest-prefix matching in O(W) time, where W = address width.
Segment Tree
A segment tree builds on an array to answer range queries (sum, min, max, GCD) and perform point/range updates in O(log n). Each node stores the aggregate for a range of the original array.
Fenwick Tree (Binary Indexed Tree)
A more space-efficient alternative to segment trees for prefix sums and point updates. Uses clever bit manipulation — O(log n) for both query and update, but limited to operations with an inverse (like addition).
| Property | Segment Tree | Fenwick Tree |
|---|---|---|
| Range query | O(log n) | O(log n) |
| Point update | O(log n) | O(log n) |
| Range update | O(log n) with lazy propagation | Limited |
| Space | 4n | n |
| Complexity to implement | Moderate | Simple |
When to Use What
| Need | Use | Why |
|---|---|---|
| Sorted data + ordered iteration | BST (balanced) | In-order traversal, predecessor/successor queries |
| Disk-backed sorted index | B+ Tree | Minimizes disk I/O, linked leaves for range scans |
| Fast min/max extraction | Heap | O(1) peek, O(log n) extract, array-backed |
| Prefix search / autocomplete | Trie | O(L) lookup independent of dataset size |
| Key-value with O(1) lookup | Hash Table | Fastest average lookup, but no ordering |
| Range aggregate queries | Segment / Fenwick Tree | O(log n) query and update on array ranges |