Graphs & Graph Algorithms — learn.surkar.in

01 / Fundamentals

Graph Representations & Terminology

A graph G = (V, E) consists of vertices (nodes) and edges (connections). Graphs can be directed (edges have direction) or undirected, and weighted (edges carry a cost) or unweighted. A DAG (Directed Acyclic Graph) is a directed graph with no cycles -- critical for dependency resolution.

Degree

In an undirected graph, the degree of a vertex is the number of edges incident to it. In directed graphs, we distinguish in-degree (incoming edges) and out-degree (outgoing edges). The sum of all degrees always equals 2|E|.

Representation	Space	Edge Lookup	Iterate Neighbors	Best For
Adjacency List	O(V + E)	O(degree)	O(degree)	Sparse graphs (most real-world)
Adjacency Matrix	O(V²)	O(1)	O(V)	Dense graphs, quick edge queries
Edge List	O(E)	O(E)	O(E)	Kruskal's, simple storage

# Adjacency List (dict of lists)
graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D'],
    'C': ['A', 'D', 'E'],
    'D': ['B', 'C'],
    'E': ['C']
}

# Adjacency Matrix (2D array)
#     A  B  C  D  E
# A [ 0, 1, 1, 0, 0 ]
# B [ 1, 0, 0, 1, 0 ]
# C [ 1, 0, 0, 1, 1 ]
# D [ 0, 1, 1, 0, 0 ]
# E [ 0, 0, 1, 0, 0 ]

# Edge List
edges = [('A','B'), ('A','C'), ('B','D'), ('C','D'), ('C','E')]

Key Insight

Use adjacency lists by default. Most real-world graphs are sparse (E is much less than V²), so adjacency lists save memory and iterate neighbors faster. Reach for a matrix only when you need O(1) edge existence checks on dense graphs.

02 / BFS

Breadth-First Search

BFS explores a graph level by level using a queue. It visits all neighbors of the current node before moving deeper. Time complexity is O(V + E).

BFS Traversal Order

Start Node

→

All depth-1 neighbors

→

All depth-2 neighbors

→

...

Shortest Path in Unweighted Graphs

BFS naturally finds the shortest path (fewest edges) from a source to any reachable vertex. The first time BFS reaches a node, that path is guaranteed to be shortest.

from collections import deque

def bfs_shortest_path(graph, start, target):
    queue = deque([(start, [start])])
    visited = {start}

    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [neighbor]))
    return None  # no path exists

Use Cases

Shortest path in unweighted graphs, level-order traversal, finding connected components, web crawlers, social network "degrees of separation."

03 / DFS

Depth-First Search

DFS explores as deep as possible along each branch before backtracking. It uses a stack (or recursion). Time complexity is O(V + E), same as BFS, but the traversal order and applications differ significantly.

def dfs(graph, node, visited=None):
    if visited is None:
        visited = set()
    visited.add(node)
    for neighbor in graph[node]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)
    return visited

Cycle Detection

In a directed graph, a cycle exists if DFS encounters a node that is currently on the recursion stack (a "back edge"). In undirected graphs, a cycle exists if DFS visits an already-visited node that is not the parent.

Connected Components

Run DFS (or BFS) from each unvisited node. Each run discovers one connected component. For directed graphs, finding strongly connected components requires Tarjan's or Kosaraju's algorithm.

DFS vs BFS

DFS uses less memory on wide graphs (stack depth = longest path), while BFS uses less memory on deep, narrow graphs. DFS is preferred for topological sorting, cycle detection, and pathfinding in mazes. BFS is preferred for shortest paths and level-order work.

04 / Shortest Paths

Dijkstra & Bellman-Ford

Dijkstra's Algorithm

Single-source shortest path for graphs with non-negative edge weights. Uses a min-heap (priority queue) to greedily select the next closest unvisited vertex. Time complexity: O((V + E) log V) with a binary heap.

import heapq

def dijkstra(graph, start):
    dist = {node: float('inf') for node in graph}
    dist[start] = 0
    heap = [(0, start)]

    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale entry
        for v, weight in graph[u]:
            if dist[u] + weight < dist[v]:
                dist[v] = dist[u] + weight
                heapq.heappush(heap, (dist[v], v))
    return dist

Warning

Dijkstra fails with negative edge weights. It may settle a node too early, missing a cheaper path through a negative edge discovered later. Use Bellman-Ford instead.

Bellman-Ford Algorithm

Handles negative edge weights and can detect negative-weight cycles. Relaxes all edges V-1 times. Time complexity: O(V * E), slower than Dijkstra but more general.

def bellman_ford(vertices, edges, start):
    dist = {v: float('inf') for v in vertices}
    dist[start] = 0

    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w

    # Detect negative-weight cycles
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("Negative-weight cycle detected")
    return dist

Property	Dijkstra	Bellman-Ford
Time	O((V+E) log V)	O(V * E)
Negative weights	No	Yes
Negative cycle detection	No	Yes
Approach	Greedy (min-heap)	Dynamic programming
Best for	Sparse, non-negative	Graphs with negative edges

05 / Minimum Spanning Trees

Kruskal's & Prim's

A Minimum Spanning Tree (MST) connects all vertices in an undirected weighted graph with the minimum total edge weight, using exactly V-1 edges and no cycles.

Kruskal's Algorithm (Union-Find)

Sort all edges by weight. Greedily add the cheapest edge that does not form a cycle (checked via Union-Find / Disjoint Set). Time: O(E log E).

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        px, py = self.find(x), self.find(y)
        if px == py:
            return False  # already connected
        if self.rank[px] < self.rank[py]:
            px, py = py, px
        self.parent[py] = px
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1
        return True

def kruskal(n, edges):
    edges.sort(key=lambda e: e[2])  # sort by weight
    uf = UnionFind(n)
    mst = []
    for u, v, w in edges:
        if uf.union(u, v):
            mst.append((u, v, w))
            if len(mst) == n - 1:
                break
    return mst

Prim's Algorithm (Min-Heap)

Start from any vertex, greedily add the cheapest edge connecting the MST to a non-MST vertex. Uses a min-heap. Time: O((V + E) log V).

Property	Kruskal's	Prim's
Approach	Edge-centric, global sort	Vertex-centric, grow from seed
Data structure	Union-Find	Min-Heap
Time	O(E log E)	O((V+E) log V)
Better for	Sparse graphs	Dense graphs

Practical Note

Both Kruskal's and Prim's produce the same MST if all edge weights are distinct. When weights are not unique, multiple valid MSTs may exist. The Union-Find data structure from Kruskal's is also invaluable for detecting connected components efficiently.

06 / Topological Sort & Advanced

Topological Sort & Beyond

Topological Sort

A linear ordering of vertices in a DAG such that for every directed edge u → v, u comes before v. Used in build systems, task scheduling, course prerequisites, and dependency resolution. Two approaches: DFS-based (reverse post-order) and Kahn's algorithm (BFS with in-degree tracking).

# Kahn's Algorithm (BFS-based)
from collections import deque

def topological_sort(graph, in_degree):
    queue = deque([v for v in graph if in_degree[v] == 0])
    result = []

    while queue:
        node = queue.popleft()
        result.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

    if len(result) != len(graph):
        raise ValueError("Cycle detected -- not a DAG")
    return result

Topological Sort Example (Build Dependencies)

utils.c

→

parser.c

→

compiler.c

→

main.c

Advanced Algorithms

A* Search

Dijkstra + heuristic function h(n) estimating cost to goal. Explores fewer nodes when the heuristic is admissible (never overestimates). Used in game pathfinding and GPS navigation.

Floyd-Warshall

All-pairs shortest paths in O(V³). Works with negative weights (no negative cycles). Simple triple-nested loop using dynamic programming. Good for small, dense graphs.

Tarjan's SCC

Finds all Strongly Connected Components in O(V+E) using a single DFS pass. Each SCC is a maximal set of vertices where every vertex is reachable from every other.

Max Flow (Ford-Fulkerson)

Finds maximum flow through a network from source to sink. Applications: bipartite matching, network capacity, image segmentation. Edmonds-Karp variant runs in O(V * E²).

When to Use What

Single-source, non-negative weights → Dijkstra. Negative weights → Bellman-Ford. All-pairs on small graphs → Floyd-Warshall. Heuristic available → A*. Dependencies → Topological Sort. Network capacity → Max Flow.

Test Yourself

Score: 0 / 10

Question 01

What is the space complexity of an adjacency matrix for a graph with V vertices?

An adjacency matrix stores a V x V grid regardless of the number of edges, so it always uses O(V²) space.

Question 02

Which data structure does BFS use to determine the order of node exploration?

BFS uses a queue (FIFO) to process nodes level by level. DFS uses a stack (or recursion). Dijkstra uses a min-heap.

Question 03

What does DFS detect when it encounters a node still on the recursion stack in a directed graph?

In a directed graph, encountering a node that is currently on the recursion stack means we have found a back edge, which proves a cycle exists.

Question 04

Why does Dijkstra's algorithm fail on graphs with negative edge weights?

Dijkstra greedily finalizes the shortest distance to a node once it is popped from the heap. A negative edge could later provide a shorter path, but the node has already been settled, so the cheaper path is missed.

Question 05

What is the time complexity of the Bellman-Ford algorithm?

Bellman-Ford relaxes all E edges exactly V-1 times, giving O(V * E). This is slower than Dijkstra but handles negative weights.

Question 06

In Kruskal's algorithm, what data structure is used to efficiently detect whether adding an edge would create a cycle?

Union-Find supports near O(1) find and union operations (with path compression and union by rank), making cycle detection in Kruskal's very efficient.

Question 07

A topological sort is only possible on which type of graph?

Topological ordering requires that for every edge u → v, u appears before v. This is impossible if there is a cycle, so the graph must be a DAG (Directed Acyclic Graph).

Question 08

What is the time complexity of Floyd-Warshall for all-pairs shortest paths?

Floyd-Warshall uses three nested loops over all V vertices, giving O(V³). It computes shortest paths between every pair of vertices using dynamic programming.

Question 09

How many edges does a Minimum Spanning Tree have for a connected graph with V vertices?

A spanning tree of a graph with V vertices is a tree, and any tree with V vertices has exactly V - 1 edges. Adding any more would create a cycle.

Question 10

What makes A* search more efficient than Dijkstra's algorithm?

A* adds a heuristic function h(n) that estimates the remaining cost to the goal. When this heuristic is admissible (never overestimates), A* explores fewer nodes than Dijkstra while still finding the optimal path.