Overlapping Subproblems + Optimal Substructure
Dynamic programming (DP) applies when a problem has two properties: overlapping subproblems (the same smaller problems recur many times) and optimal substructure (an optimal solution can be built from optimal solutions to subproblems).
Without overlapping subproblems, you just have divide-and-conquer (e.g., merge sort). Without optimal substructure, DP cannot guarantee correctness (e.g., longest simple path in a general graph).
Consider Fibonacci: a naive recursion recomputes fib(3) hundreds of times for large inputs. DP computes it once and reuses the result, dropping the running time from exponential to linear.
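To make the overlap concrete, here is a small sketch (the `calls` counter and the name `fib_naive` are just for illustration) that counts how often the naive recursion revisits the same argument:

```python
from collections import Counter

calls = Counter()

def fib_naive(n):
    calls[n] += 1
    if n <= 1:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)

fib_naive(20)
print(calls[3])  # fib(3) is computed 2584 times just to evaluate fib(20)
```

Each count is itself a Fibonacci number: fib(j) is reached once from every path down the call tree, which is exactly the redundancy memoization removes.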
Fibonacci: The Canonical Example
The naive recursive Fibonacci runs in O(2^n) because it recomputes the same values. With DP, we store each result and achieve O(n) time.
```python
# Naive recursive - O(2^n)
def fib(n):
    if n <= 1: return n
    return fib(n-1) + fib(n-2)

# DP bottom-up - O(n) time, O(n) space
def fib_dp(n):
    if n <= 1: return n  # guard: dp[1] below would be out of range for n = 0
    dp = [0] * (n + 1)
    dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i-1] + dp[i-2]
    return dp[n]

# Space-optimized - O(n) time, O(1) space
def fib_opt(n):
    if n <= 1: return n
    a, b = 0, 1
    for _ in range(2, n + 1):
        a, b = b, a + b
    return b
```
Top-Down vs Bottom-Up
There are two ways to implement DP. Top-down (memoization) starts from the original problem and recurses, caching results. Bottom-up (tabulation) fills a table iteratively from the smallest subproblems upward.
| Property | Top-Down (Memoization) | Bottom-Up (Tabulation) |
|---|---|---|
| Implementation | Recursive + cache (dict/array) | Iterative loops filling a table |
| Subproblems solved | Only reachable ones (lazy) | All subproblems (eager) |
| Stack overhead | Recursion depth can cause stack overflow | No recursion; constant stack |
| Space optimization | Hard to apply rolling arrays | Easy to reduce to O(n) or O(1) |
| Ease of writing | Often more natural; mirrors recurrence | Requires careful ordering of loops |
| Best when | Sparse subproblem space; tree-shaped | Dense subproblem space; need space optimization |
```python
# Top-down: memoization with decorator
from functools import lru_cache

@lru_cache(maxsize=None)
def climb_stairs(n):
    """Number of ways to climb n stairs (1 or 2 steps at a time)."""
    if n <= 2: return n
    return climb_stairs(n-1) + climb_stairs(n-2)

# Bottom-up: tabulation
def climb_stairs_bu(n):
    if n <= 2: return n
    dp = [0] * (n + 1)
    dp[1], dp[2] = 1, 2
    for i in range(3, n + 1):
        dp[i] = dp[i-1] + dp[i-2]
    return dp[n]
```
Coin Change, Knapsack, and More
- **Coin Change (minimum coins):** Minimum coins to make an amount. Unbounded: each coin is reusable. Recurrence: `dp[i] = min(dp[i], dp[i - coin] + 1)`.
- **0/1 Knapsack:** Max value within a weight limit. Each item used at most once. 2D state: `dp[i][w]`.
- **Unbounded Knapsack:** Like 0/1 but items are reusable. Inner loop direction changes from right-to-left to left-to-right.
- **Climbing Stairs:** Count ways to reach step n taking 1 or 2 steps. Identical structure to Fibonacci.
Coin Change (Minimum Coins)
```python
def coin_change(coins, amount):
    # O(len(coins) * amount) time, O(amount) space
    dp = [float('inf')] * (amount + 1)
    dp[0] = 0
    for coin in coins:
        for i in range(coin, amount + 1):
            dp[i] = min(dp[i], dp[i - coin] + 1)
    return dp[amount] if dp[amount] != float('inf') else -1
```
0/1 Knapsack
```python
def knapsack_01(weights, values, capacity):
    n = len(weights)
    dp = [0] * (capacity + 1)
    for i in range(n):
        # Traverse RIGHT to LEFT to ensure each item is used at most once
        for w in range(capacity, weights[i] - 1, -1):
            dp[w] = max(dp[w], dp[w - weights[i]] + values[i])
    return dp[capacity]
```
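For contrast, here is a sketch of the unbounded variant described above (the name `knapsack_unbounded` is mine); the only change from 0/1 is the inner loop running left to right, which lets an item be picked again:

```python
def knapsack_unbounded(weights, values, capacity):
    dp = [0] * (capacity + 1)
    for i in range(len(weights)):
        # Traverse LEFT to RIGHT so an item can be reused:
        # dp[w - weights[i]] may already include item i
        for w in range(weights[i], capacity + 1):
            dp[w] = max(dp[w], dp[w - weights[i]] + values[i])
    return dp[capacity]
```

For example, with weights [2, 3], values [3, 4], and capacity 7, the best pack is two copies of the first item plus one of the second (2+2+3 = 7, value 10).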
LCS, Edit Distance, Palindromes
String DP problems typically define states on indices into one or two strings. The table is usually 2D, with transitions that match, insert, delete, or replace characters.
Longest Common Subsequence (LCS)
Given strings X and Y, find the longest subsequence common to both. A subsequence need not be contiguous.
```python
def lcs(X, Y):
    m, n = len(X), len(Y)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i-1] == Y[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    return dp[m][n]

# Time: O(m*n), Space: O(m*n) — reducible to O(n)
```
Edit Distance (Levenshtein)
Minimum insertions, deletions, and replacements to transform one string into another.
```python
def edit_distance(s1, s2):
    m, n = len(s1), len(s2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1): dp[i][0] = i
    for j in range(n + 1): dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i-1] == s2[j-1]:
                dp[i][j] = dp[i-1][j-1]
            else:
                dp[i][j] = 1 + min(
                    dp[i-1][j],    # delete
                    dp[i][j-1],    # insert
                    dp[i-1][j-1]   # replace
                )
    return dp[m][n]
```
Longest Palindromic Subsequence
The LPS of a string is the LCS of the string with its reverse. Alternatively, define dp[i][j] as the LPS length for substring s[i..j].
```python
def longest_palindromic_subseq(s):
    n = len(s)
    dp = [[0] * n for _ in range(n)]
    for i in range(n): dp[i][i] = 1
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            if s[i] == s[j]:
                dp[i][j] = dp[i+1][j-1] + 2
            else:
                dp[i][j] = max(dp[i+1][j], dp[i][j-1])
    return dp[0][n-1]
```
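The reduction mentioned above (the LPS of s equals the LCS of s and its reverse) can be sanity-checked directly; this sketch inlines a minimal LCS so it stands alone:

```python
def lcs_len(X, Y):
    m, n = len(X), len(Y)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i-1] == Y[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    return dp[m][n]

def lps_via_lcs(s):
    # A palindromic subsequence of s reads the same in s and reversed(s)
    return lcs_len(s, s[::-1])

# "bbbab" -> "bbbb" has length 4
```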
Unique Paths and Minimum Cost
Grid problems define states as dp[r][c] representing a value at row r, column c. Transitions usually come from adjacent cells (above and left for top-left-to-bottom-right traversals).
Unique Paths
Count paths from top-left to bottom-right in an m x n grid, moving only right or down.
```python
def unique_paths(m, n):
    dp = [1] * n  # first row: only one way to reach any cell
    for r in range(1, m):
        for c in range(1, n):
            dp[c] += dp[c-1]  # dp[c] (from above) + dp[c-1] (from left)
    return dp[n-1]

# O(m*n) time, O(n) space using rolling array
```
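As a cross-check on the recurrence: every path is a sequence of exactly m-1 down-moves and n-1 right-moves, so the count also has the closed form C(m+n-2, m-1). A self-contained sketch comparing the two:

```python
from math import comb

def unique_paths(m, n):
    dp = [1] * n
    for _ in range(1, m):
        for c in range(1, n):
            dp[c] += dp[c-1]
    return dp[n-1]

# The DP and the binomial closed form agree, e.g. a 3 x 7 grid has 28 paths
print(unique_paths(3, 7), comb(3 + 7 - 2, 3 - 1))
```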
Minimum Path Sum
Find the path from top-left to bottom-right that minimizes the sum of values along the way.
```python
def min_path_sum(grid):
    m, n = len(grid), len(grid[0])
    dp = [float('inf')] * n
    dp[0] = 0
    for r in range(m):
        dp[0] += grid[r][0]  # first column: only reachable from above
        for c in range(1, n):
            # First row falls out naturally: dp[c] is still inf there,
            # so min() picks the path from the left
            dp[c] = min(dp[c], dp[c-1]) + grid[r][c]
    return dp[n-1]
```
Space Optimization & DP vs Greedy
State Design
The hardest part of DP is defining the state. Ask: what information do I need to make a decision at each step? Common state dimensions include index, remaining capacity, boolean flags, and partial results. Minimize state dimensions to keep the table small.
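As a small illustration of state design, consider the classic House Robber problem (maximize loot from non-adjacent houses): the natural state is an index plus one boolean of context ("was the previous house robbed?"), which collapses to two running values:

```python
def rob(nums):
    # State: best total so far, split on whether the current house was robbed
    take, skip = 0, 0
    for x in nums:
        # Rob this house (previous must be skipped), or skip it (keep the better past)
        take, skip = skip + x, max(take, skip)
    return max(take, skip)

# rob([2, 7, 9, 3, 1]) -> 12 (rob houses 2 + 9 + 1)
```

The boolean dimension never needs a table at all once you notice each step only reads the previous step's two values.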
Rolling Array (Space Optimization)
When dp[i] only depends on dp[i-1] (the previous row), you can drop the first dimension and reuse a single array, reducing O(m*n) space to O(n).
```python
# LCS with O(n) space instead of O(m*n)
def lcs_optimized(X, Y):
    m, n = len(X), len(Y)
    prev = [0] * (n + 1)
    for i in range(1, m + 1):
        curr = [0] * (n + 1)
        for j in range(1, n + 1):
            if X[i-1] == Y[j-1]:
                curr[j] = prev[j-1] + 1
            else:
                curr[j] = max(prev[j], curr[j-1])
        prev = curr
    return prev[n]
```
When DP vs Greedy
Greedy makes a locally optimal choice and never reconsiders. It works when the problem has the greedy choice property: a locally optimal choice leads to a globally optimal solution. DP is needed when local choices interact and you must consider all combinations.
| Problem | Greedy Works? | Why |
|---|---|---|
| Activity selection | Yes | Choosing earliest finish time is provably optimal |
| Fractional knapsack | Yes | Take highest value/weight ratio first |
| 0/1 Knapsack | No | Can't take fractions; greedy by ratio fails |
| Coin change (arbitrary denominations) | No | Greedy (largest first) fails for e.g. coins=[1,3,4], amount=6 |
| Coin change (canonical systems like USD) | Yes | Denomination structure guarantees greedy works |
| Huffman coding | Yes | Greedy merge of two smallest frequencies is optimal |
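The table's coin counterexample is easy to verify directly. With the two hypothetical helpers below, greedy (largest coin first) spends 4+1+1 while DP finds 3+3:

```python
def greedy_coins(coins, amount):
    # Largest-denomination-first; optimal only for canonical coin systems
    count = 0
    for coin in sorted(coins, reverse=True):
        count += amount // coin
        amount %= coin
    return count if amount == 0 else -1

def min_coins(coins, amount):
    INF = float('inf')
    dp = [0] + [INF] * amount
    for i in range(1, amount + 1):
        dp[i] = min((dp[i - c] + 1 for c in coins if c <= i), default=INF)
    return dp[amount] if dp[amount] != INF else -1

print(greedy_coins([1, 3, 4], 6))  # 3 coins: 4 + 1 + 1
print(min_coins([1, 3, 4], 6))     # 2 coins: 3 + 3
```

For US denominations the two agree, which is exactly the "canonical system" row in the table.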