Divide-and-Conquer (Ch. 4)
- Recurrences. Typical \(T(n)=aT(n/b)+f(n)\). Methods: substitution, recursion tree, master, Akra–Bazzi.
- Akra–Bazzi example. \(T(n)=T(n/2)+T(n/3)+n\) has tight bound \(\Theta(n)\).
- Master example. \(T(n)=4T(n/2)+n^2\log n \Rightarrow \Theta(n^2\log^2 n)\) (Case 2 with \(k=1\)).
- Master Theorem. Compare \(f(n)\) vs \(n^{\log_b a}\).
- Case 1: \(f(n)=O(n^{\log_b a-\varepsilon})\Rightarrow T(n)=\Theta(n^{\log_b a})\).
- Case 2: \(f(n)=\Theta(n^{\log_b a}\log^k n)\Rightarrow T(n)=\Theta(n^{\log_b a}\log^{k+1}\!n)\).
- Case 3: \(f(n)=\Omega(n^{\log_b a+\varepsilon})\) & regularity \(a f(n/b)\le c f(n)\) (some \(c<1\)) \(\Rightarrow T(n)=\Theta(f(n))\).
- Recursion Tree. Sum the cost per level over \(\approx \log_b n\) levels; compare the leaf total against the internal-node total.
- Akra–Bazzi (handy form). If \(\sum a_i b_i^{-p}=1\) in \(T(n)=\sum a_i T(n/b_i)+g(n)\), then
\(T(n)=\Theta\!\big(n^p\big(1+\int_1^n g(u)\,u^{-p-1}\,du\big)\big)\).
- Classic results. Merge sort \(=\Theta(n\log n)\); Strassen \(=\Theta(n^{\log_2 7})\); binary search \(=\Theta(\log n)\).
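As a concrete instance of the divide-and-conquer template, here is a minimal merge-sort sketch (illustrative only); its recurrence \(T(n)=2T(n/2)+\Theta(n)\) resolves to \(\Theta(n\log n)\) by Case 2 of the Master Theorem.

```python
# Merge sort: split in half, sort recursively, merge in linear time.
# Recurrence T(n) = 2T(n/2) + Theta(n) => Theta(n log n) (Master Case 2).
def merge_sort(a):
    if len(a) <= 1:
        return a[:]
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    # Merge the two sorted halves in Theta(n) time.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:]); out.extend(right[j:])
    return out
```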
Quicksort (7.1–7.4) & Sorting LB (8.1)
- Partition invariants. After PARTITION, \(A[p..q\!-\!1]\le A[q]\le A[q\!+\!1..r]\).
- Complexities. Worst \( \Theta(n^2)\) (bad splits); expected \( \Theta(n\log n)\) (random pivot); in-place; usually not stable.
- Randomized quicksort. Choose a random pivot; expected good balance gives \(\mathbb{E}[T(n)]=\Theta(n\log n)\).
- With 3-way partitioning, quicksort runs in \(\Theta(n)\) when all keys are equal.
- Decision-tree lower bound. Any comparison sort needs \(\Omega(n\log n)\) comparisons.
- Linear-time sorts context. Counting/radix/bucket escape the comparison LB when keys are structured/bounded.
- Multiset sorting. Info-theoretic lower bound \(\Omega\!\big(\log \frac{n!}{\prod_i m_i!}\big)\), where \(m_i\) is the multiplicity of key \(i\).
- Tips. Recurse on the smaller side and tail-call-eliminate the larger side (stack depth \(O(\log n)\)); median-of-three pivoting; 3-way partition to handle equal keys.
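The tips above combine into one routine; a sketch under those conventions (random pivot, 3-way Dutch-flag partition, loop on the larger side):

```python
import random

# Randomized quicksort with 3-way partitioning: equal keys collapse into the
# middle band, so all-equal inputs finish in Theta(n). The while loop plays
# the role of tail-call elimination on the larger side.
def quicksort3(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        pivot = a[random.randint(lo, hi)]
        lt, i, gt = lo, lo, hi  # a[lo:lt] < p, a[lt:i] == p, a[gt+1:hi+1] > p
        while i <= gt:
            if a[i] < pivot:
                a[lt], a[i] = a[i], a[lt]; lt += 1; i += 1
            elif a[i] > pivot:
                a[i], a[gt] = a[gt], a[i]; gt -= 1
            else:
                i += 1
        # Recurse on the smaller segment, iterate on the larger one.
        if lt - lo < hi - gt:
            quicksort3(a, lo, lt - 1); lo = gt + 1
        else:
            quicksort3(a, gt + 1, hi); hi = lt - 1
    return a
```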
Order Statistics (9.1–9.3)
- RANDOMIZED-SELECT. Partition; recurse only one side. Avg. \(T(n)=T(n/2)+\Theta(n)=\Theta(n)\); worst \( \Theta(n^2)\).
- SELECT (Median-of-Medians). Groups of 5 → medians array → pivot = median of medians; guarantees \(3n/10\) vs \(7n/10\) split ⇒
\(T(n)=T(\lceil n/5\rceil)+T(7n/10)+\Theta(n)=\Theta(n)\).
- Deterministic SELECT with groups of 3 gives only \(\Theta(n\log n)\) (the recurrence no longer telescopes to linear).
- RANDOMIZED-SELECT. About half of all pivots land in the middle half of the keys, so a bad split (larger side \(\ge 3n/4\)) occurs with probability \(\approx 1/2\).
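A minimal RANDOMIZED-SELECT sketch (not the in-place CLRS version; it copies into sublists for clarity, but has the same expected \(\Theta(n)\) behavior):

```python
import random

# Randomized selection: partition around a random pivot, then recurse into
# exactly one side. Expected Theta(n); worst case Theta(n^2).
def randomized_select(a, k):
    """Return the k-th smallest element of a (k is 0-based)."""
    a = list(a)
    while True:
        pivot = random.choice(a)
        lt = [x for x in a if x < pivot]
        eq = [x for x in a if x == pivot]
        if k < len(lt):
            a = lt                       # answer is in the < side
        elif k < len(lt) + len(eq):
            return pivot                 # answer equals the pivot
        else:
            k -= len(lt) + len(eq)       # shift rank into the > side
            a = [x for x in a if x > pivot]
```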
Dynamic Programming (14.1–14.5)
- Template: define subproblems → recurrence → eval order (DAG) → fill → reconstruct.
- Rod cutting. \(R(n)=\max_{1\le i\le n}\{p_i+R(n-i)\}\).
- Matrix-chain. \(m[i,j]=\min_{i\le k<j}\{m[i,k]+m[k+1,j]+p_{i-1}p_kp_j\}\) with split table.
- LCS. \(c[i,j]=\begin{cases}c[i-1,j-1]+1,&x_i=y_j\\ \max(c[i-1,j],c[i,j-1]),&\text{else}\end{cases}\).
- 0/1 Knapsack. \(DP[i,w]=\max(DP[i-1,w],\,v_i+DP[i-1,w-w_i])\).
- LCS properties. Computing one LCS is \(O(mn)\); counting all distinct LCSs may be exponential.
- Matrix-chain complexity. Time \(O(n^3)\), space \(O(n^2)\).
- Rod cutting with cost \(c\) per cut. \(R(n)=\max_{1\le i\le n}\{p_i+R(n-i)-c\cdot\mathbf{1}[i<n]\}\).
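The LCS recurrence above translates directly into an \(O(mn)\) table plus a backtracking pass to reconstruct one optimal subsequence (a sketch of the standard approach):

```python
# Bottom-up LCS: c[i][j] = LCS length of x[:i] and y[:j], then backtrack
# from c[m][n] to recover one LCS. O(mn) time and space.
def lcs(x, y):
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    # Reconstruct by retracing which case produced each cell.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1]); i -= 1; j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))
```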
Greedy (15.1–15.3)
- Greedy-choice property. A locally optimal first step can be extended to a global optimum.
- Exchange arguments. Transform an optimal solution to match greedy’s first choice without hurting optimality.
- Activity selection. Sort by earliest finish; iteratively take the first compatible.
- Huffman coding. Combine two least frequencies repeatedly → optimal prefix code.
- Activity selection — correct greedy rule: sort by finish times (earliest-start and shortest-duration rules fail).
- Huffman coding correctness uses both the greedy-choice and optimal-substructure lemmas, each proved via exchange arguments.
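Huffman's merge loop is a few lines with a min-heap; this sketch returns only the code lengths (depths) per symbol, which is what the optimality argument is really about (symbol names here are illustrative):

```python
import heapq

# Huffman coding: repeatedly merge the two least-frequent subtrees.
# Heap entries are (frequency, tiebreak, {symbol: depth}); merging adds
# one level to every symbol in both subtrees. O(n log n) overall.
def huffman_code_lengths(freqs):
    heap = [(f, i, {s: 0}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]
```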
Amortized Analysis (16.1–16.4)
- Aggregate. Total cost of \(n\) ops → average.
- Accounting. Overcharge cheap ops to save “credits”.
- Potential. \(\hat c_i=c_i+\Phi(D_i)-\Phi(D_{i-1})\) with \(\Phi\ge0,\ \Phi(D_0)=0\).
- Examples. Binary counter (amort. \(O(1)\)); dynamic tables with expand/contract thresholds (amort. \(O(1)\)).
- Binary counter. Total bit flips over \(n\) increments is \(\le 2n\).
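The aggregate bound can be checked empirically with a toy counter (a sketch; the flip count per INCREMENT is 1 plus the number of trailing ones):

```python
# Binary counter, little-endian bit list. Each INCREMENT clears the run of
# trailing 1s and sets the next 0; total flips over n increments is <= 2n.
def increment(bits):
    """Perform one INCREMENT; return the number of bit flips."""
    i, flips = 0, 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0; flips += 1; i += 1
    if i < len(bits):
        bits[i] = 1; flips += 1
    return flips
```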
Disjoint Sets (19.1)
- Ops: MAKE-SET(x), FIND-SET(x), UNION(x,y).
- Union-by-rank (or size) + path compression ⇒ amortized \(O(\alpha(n))\).
- Use in Kruskal, connectivity, offline queries.
Elementary Graph Algorithms (20.1–20.5)
- Representations. Adjacency list (sparse), adjacency matrix (dense).
- BFS. Queue; layers; shortest paths for unweighted; colors white→gray→black; \(O(V+E)\).
- DFS. Discovery/finish times \(d[u],f[u]\); edge types (tree/back/forward/cross); cycle detection; \(O(V+E)\).
- Topological sort. DFS postorder (reverse finish) on DAG.
- SCCs. DFS on \(G^T\) in decreasing \(f[u]\) order (Kosaraju); or Tarjan’s stack algorithm.
- DFS edge types in undirected graphs: tree and back edges only.
- A directed graph is a DAG iff DFS finds no back edges.
- Kosaraju recap: DFS on \(G\) for finish order; DFS on \(G^T\) in decreasing finish order.
Minimum Spanning Trees (21.1–21.2)
- Cut property. For any cut, the light edge crossing it is safe.
- Kruskal. Sort edges; add if endpoints in different sets (DSU). \(O(E\log E)\).
- Prim. Grow from a root using min-key edges; binary heap \(O(E\log V)\); Fibonacci heap \(O(E+V\log V)\).
- MST uniqueness. If every cut has a unique light edge, the MST is unique.
- An edge that is the unique lightest edge across some cut is in every MST.
Single-Source Shortest Paths (22.1–22.4)
- Relaxation. if \(d[v]>d[u]+w(u,v)\) then update \(d[v],\pi[v]\); distances monotonically decrease.
- BFS for unit weights; DAG-SSSP via topological order.
- Bellman–Ford. \(V-1\) passes; detects negative cycles; works with negative edges.
- Dijkstra. Nonnegative edges; min-priority queue; \(O(E\log V)\).
- Difference constraints. Build constraint graph; BF detects infeasibility (neg. cycle).
- When does Dijkstra fail? When there is a negative-weight edge.
- Dijkstra with a Fibonacci heap: \(O(E+V\log V)\).
- Difference constraints are feasible iff the constraint graph has no negative cycle.
All-Pairs Shortest Paths (22.5 proofs; 23.1–23.2)
- Floyd–Warshall. \(D^{(k)}[i,j]=\min\{D^{(k-1)}[i,j],\,D^{(k-1)}[i,k]+D^{(k-1)}[k,j]\}\). \(O(V^3)\).
- Johnson’s. Add super-source, run BF to get potential \(h\); reweight \(w'(u,v)=w(u,v)+h(u)-h(v)\ge0\); then \(V\) runs of Dijkstra; restore distances.
- Floyd–Warshall detects negative cycles when some \(d[i][i]<0\) at the end.
- Johnson’s algorithm uses potentials \(h\) computed by Bellman–Ford from a super-source.
Maximum Flow (24.1–24.3)
- Flow network. Capacity constraints; conservation at \(V\setminus\{s,t\}\).
- Residual graph. \(c_f(u,v)=c(u,v)-f(u,v)\); include back-edges of capacity \(f(u,v)\).
- Ford–Fulkerson. Augment along any residual \(s\!\to\!t\) path until none exist.
- Edmonds–Karp. BFS (shortest in edges) augmenting paths ⇒ \(O(VE^2)\).
- Max-Flow Min-Cut. Max flow value = min cut capacity.
- Edmonds–Karp running time: \(O(VE^2)\).
- At termination, the vertices reachable from \(s\) in the residual graph give a minimum \(s\)–\(t\) cut.
- Key monotonicity in the E–K analysis: shortest augmenting-path length never decreases.
Linear Programming (29.1–29.3)
- Standard form. \(\max c^Tx\) s.t. \(Ax\le b,\ x\ge0\). Equalities/≥ via slack/surplus and sign-splits.
- Geometry. Optima at vertices (basic feasible solutions); simplex walks vertices.
- Duality. Dual of \(\max\{c^Tx:Ax\le b,x\ge0\}\) is \(\min\{b^Ty:A^Ty\ge c,y\ge0\}\). Strong duality & complementary slackness.
- Dual of \(\max\{c^\top x: Ax\le b,\ x\ge0\}\) is \(\min\{b^\top y: A^\top y \ge c,\ y\ge0\}\).
- Complementary slackness: if \(x_j>0\) then \((A^\top y)_j=c_j\); if \(y_i>0\) then \((Ax)_i=b_i\).
String Matching (32.1–32.2)
- Naïve. Slide pattern; worst \(O((n-m+1)m)\).
- Rabin–Karp. Rolling hash: \(H_{i+1}=(d(H_i - s_i d^{m-1}) + s_{i+m}) \bmod q\). Verify on hash match. Avg. near \(O(n+m)\), worst \(O(nm)\).
- Rabin–Karp expected running time (with verification): \(O(n+m)\).
NP-Completeness (34.1–34.5)
- Classes. \(P\): poly-time solvable; \(NP\): poly-time verifiable; \(coNP\): complements of \(NP\).
- Reductions. \(A\le_p B\): poly-time \(f\) with \(x\in A \iff f(x)\in B\). If \(B\in P\) and \(A\le_p B\) then \(A\in P\).
- NP-hard / NP-complete. NP-hard if all \(NP\) reduces to it; NP-complete if also in \(NP\).
- Cook-Levin & friends. SAT NP-complete → reductions to 3SAT, CLIQUE, Vertex-Cover, Independent-Set, Hamiltonian-Cycle, Subset-Sum, etc.
- Proof recipe. Choose known NP-complete \(A\) → poly-time map to your problem \(B\) → prove iff → conclude.
- If \(A \le_p B\) and \(A\) is NP-hard, then \(B\) is NP-hard.
- Since CLIQUE is NP-complete: if CLIQUE \(\in P\), then \(P=NP\).
Mini Derivations & Identities
- Decision-tree LB. Leaves ≥ \(n!\); \(2^h\ge n!\Rightarrow h\ge \log_2(n!)=\Theta(n\log n)\).
- Median-of-Medians split. At least half of the \(\lceil n/5\rceil\) group medians are \(\ge\) the pivot, and each such group contributes \(\ge 3\) elements \(\ge\) the pivot ⇒ \(\ge 3n/10\) elements \(\ge\) pivot (and symmetrically \(\le\)).
- F-W. Either best path avoids \(k\) or uses \(i\to k\to j\) through \(k\).
- Johnson reweighting. \(w'(u,v)=w(u,v)+h(u)-h(v)\ge0\); shortest paths preserved; recover \(d(u,v)=d'(u,v)-h(u)+h(v)\).
- LP duality snapshot. Primal “≤” constraints ↔ dual “≥” with \(A^T\), \(b\leftrightarrow c\).
Disjoint Sets (19.1)
- ADT. MAKE-SET(x), FIND-SET(x) (return rep), UNION(x,y) (merge by reps).
- Forest impl. Parent pointers; each set as a rooted tree.
- Heuristics. Union-by-rank/size (attach shorter to taller) + path compression on FIND-SET.
- Complexity. Any sequence of \(m\) ops on \(n\) elements: \(O(m\,\alpha(n))\), where \(\alpha\) is the inverse Ackermann function (\(\le 4\) in practice).
- Use cases. Kruskal’s MST; connectivity queries; offline dynamic connectivity; checking cycle creation.
- Proof sketch. Tarjan’s analysis with rank invariant: rank nondecreasing; #nodes of rank \(r\) ≤ \(n/2^r\); compression charges to ranks ⇒ \(\alpha(n)\).
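The forest implementation with both heuristics fits in a short class (a sketch of the standard structure):

```python
# Disjoint-set forest with union by rank and path compression:
# amortized O(alpha(n)) per operation.
class DSU:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            # Path compression: point x directly at the root.
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False                 # already in the same set
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx             # attach lower rank under higher
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True
```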
Elementary Graph Algorithms (20.1–20.5)
- Representations. List: \(O(V+E)\) space; Matrix: \(O(V^2)\). Iteration cost: neighbors \(O(\deg v)\) vs \(O(V)\).
- BFS. Queue; discovers layers; computes shortest paths for unit-weight graphs. Time \(O(V+E)\).
- Invariant. When a vertex \(u\) is dequeued, \(d[u]\) is final shortest distance.
- Tree edges. Go to newly discovered white vertices; the gray/black coloring prevents re-enqueueing.
- DFS. Stack/recursion; timestamps \(d[u], f[u]\); classifies directed edges:
tree, back (to ancestor), forward (to descendant), cross (other). Time \(O(V+E)\).
- Topological sort. Valid iff DAG; order = decreasing \(f[u]\) (postorder).
- SCCs. Kosaraju: DFS on \(G\) to get stack by finish times; DFS on \(G^T\) popping order. Tarjan: one pass with lowlink; stack membership.
- Connectivity/Bipartiteness. BFS layers parity; odd cycle iff non-bipartite.
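The BFS invariant above is easy to see in code; a minimal sketch on an adjacency list, returning unit-weight shortest distances:

```python
from collections import deque

# BFS on an adjacency list: O(V+E). dist[v] == -1 marks white (undiscovered)
# vertices; each vertex is enqueued at most once, when first discovered.
def bfs(adj, s):
    dist = [-1] * len(adj)
    dist[s] = 0
    q = deque([s])
    while q:
        u = q.popleft()                  # d[u] is final when dequeued
        for v in adj[u]:
            if dist[v] == -1:            # tree edge to a white vertex
                dist[v] = dist[u] + 1
                q.append(v)
    return dist
```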
Minimum Spanning Trees (21.1–21.2)
- Goal. Connect all vertices minimizing total edge weight in connected, undirected, weighted graph.
- Cut Property. For any cut, the minimum-weight edge crossing it is safe (belongs to some MST).
- Cycle Property. For any cycle, the maximum-weight edge on that cycle is not in any MST.
- Uniqueness. If every cut has a unique light edge ⇒ unique MST. Distinct weights suffice (but not necessary).
- Kruskal. Sort edges ↑ weight; add if endpoints in different DSU sets. \(O(E\log E)\). Proof: safe edges via cut property.
- Prim. Grow a tree from root; maintain crossing edges in PQ.
- Binary heap: \(O(E\log V)\); Fibonacci heap: \(O(E+V\log V)\).
- Dense graphs: adjacency matrix + array gives \(O(V^2)\).
- Pitfalls. Graph must be connected (else compute MSF per component). Negative weights okay.
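Kruskal's loop is short once a union-find is available; a self-contained sketch (with an inline DSU using path halving) that returns the MST weight:

```python
# Kruskal: sort edges by weight, add an edge iff its endpoints lie in
# different DSU components. O(E log E) dominated by the sort.
def kruskal(n, edges):
    """edges: list of (weight, u, v). Returns total MST (or MSF) weight."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    total = 0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:                        # safe edge by the cut property
            parent[ru] = rv
            total += w
    return total
```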
Single‑Source Shortest Paths (22.1–22.4)
- Relaxation. If \(d[v] > d[u]+w(u,v)\): set \(d[v]=d[u]+w(u,v)\), \(\pi[v]=u\). Distances monotonically decrease.
- BFS (unweighted). Unit weights ⇒ breadth layers give shortest paths in \(O(V+E)\).
- DAG‑SSSP. Topological order; relax edges once. \(O(V+E)\). Works with negative edges (no cycles).
- Bellman–Ford. Repeat \(V-1\) full relaxations; detect negative cycles if any edge relaxes in \(V\)-th pass. \(O(VE)\).
- Dijkstra. Nonnegative weights; PQ extract‑min settles vertex with final distance.
- Binary heap: \(O(E\log V)\); Fibonacci: \(O(E+V\log V)\); array: \(O(V^2)\) (dense).
- Counterexample. Fails with negative edges (even if no neg cycles).
- Difference constraints. Constraints \(x_v-x_u \le c_{uv}\). Build edges \(u\to v\) with weight \(c_{uv}\).
Feasible iff no negative cycle; potentials from BF give one feasible solution.
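Dijkstra with a binary heap and lazy deletion is the usual implementation of the relaxation scheme above (a sketch, nonnegative weights assumed):

```python
import heapq

# Dijkstra with a binary heap and lazy deletion: O(E log V).
# Requires nonnegative edge weights.
def dijkstra(adj, s):
    """adj[u] = list of (v, w). Returns distances (inf if unreachable)."""
    dist = [float("inf")] * len(adj)
    dist[s] = 0
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue                     # stale heap entry: skip
        for v, w in adj[u]:
            if dist[v] > d + w:          # relaxation step
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist
```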
All‑Pairs Shortest Paths (22.5; 23.1–23.2)
- Floyd–Warshall. DP over allowed intermediates \(k\):
\(D^{(k)}[i,j]=\min\{D^{(k-1)}[i,j],\,D^{(k-1)}[i,k]+D^{(k-1)}[k,j]\}\). Runtime \(O(V^3)\), space \(O(V^2)\).
- Negative cycle detection: some \(D[i,i] < 0\) at end.
- Johnson’s algorithm. Add super‑source \(s\) to all \(v\); run BF to compute \(h(v)=\delta(s,v)\); reweight
\(w'(u,v)=w(u,v)+h(u)-h(v)\ge 0\). Then Dijkstra from each source; recover original distances \(d(u,v)=d'(u,v)-h(u)+h(v)\).
- Runtime \(O(VE + V E\log V)\) with heaps; handles negative edges, no neg cycles.
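Floyd–Warshall is a direct transcription of the recurrence, updated in place over \(k\) (a sketch on a dense distance matrix):

```python
# Floyd-Warshall: allow intermediates {0..k} one k at a time.
# O(V^3) time, O(V^2) space; d[i][i] < 0 afterwards signals a
# negative cycle through i.
def floyd_warshall(w):
    """w: n x n matrix; w[i][j] = edge weight or float('inf'), w[i][i] = 0."""
    n = len(w)
    d = [row[:] for row in w]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```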
Maximum Flow (24.1–24.3)
- Flow network. Directed graph with capacities \(c(u,v)\); flow \(f\) satisfies capacity and conservation.
- Residual graph. \(c_f(u,v)=c(u,v)-f(u,v)\) (forward) and \(c_f(v,u)=f(u,v)\) (backward).
- Ford–Fulkerson. While there exists augmenting path \(P\) in residual, augment by \(\min\) residual on \(P\).
- Edmonds–Karp. Choose shortest (in edges) augmenting path via BFS ⇒ each edge becomes critical \(O(V)\) times ⇒ \(O(VE^2)\).
- Max‑Flow Min‑Cut Theorem. Value of max flow equals capacity of min \(s\)–\(t\) cut. At termination, vertices reachable from \(s\) in residual form the min‑cut side \(S\).
- Common reductions. Edge/vertex disjoint paths; bipartite matching (convert to flow); project selection; circulation with demands.
- Pitfalls. With irrational capacities and arbitrary augmenting-path choices, Ford–Fulkerson may fail to terminate; Edmonds–Karp avoids this via its BFS path choice.
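An Edmonds–Karp sketch on a dense capacity matrix (BFS picks the shortest augmenting path; back edges live in the same residual matrix):

```python
from collections import deque

# Edmonds-Karp: augment along BFS-shortest residual paths until none
# remain. O(VE^2). cap is an n x n capacity matrix.
def edmonds_karp(cap, s, t):
    n = len(cap)
    res = [row[:] for row in cap]        # residual capacities (incl. back edges)
    flow = 0
    while True:
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:     # BFS for a shortest augmenting path
            u = q.popleft()
            for v in range(n):
                if res[u][v] > 0 and parent[v] == -1:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:              # no augmenting path: flow is maximum
            return flow
        b, v = float("inf"), t           # bottleneck residual capacity
        while v != s:
            b = min(b, res[parent[v]][v]); v = parent[v]
        v = t                            # augment: push b along the path
        while v != s:
            u = parent[v]
            res[u][v] -= b
            res[v][u] += b
            v = u
        flow += b
```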
Linear Programming (29.1–29.3)
- Standard form. \(\max c^\top x\) s.t. \(Ax \le b,\ x\ge 0\). Convert \(=\) or \(\ge\) via slack/surplus and sign splits.
- Geometry. Feasible region is a polyhedron; optima occur at vertices (basic feasible solutions, BFSs).
- Duality. Dual of standard max: \(\min\{b^\top y: A^\top y \ge c,\ y\ge0\}\). Strong duality holds; weak duality always.
- Complementary slackness. For primal \(x^*\), dual \(y^*\):
- If \(x_j^*>0\) then \((A^\top y^*)_j=c_j\).
- If \(y_i^*>0\) then \((Ax^*)_i=b_i\).
- Modeling motifs. Assignment & bipartite matching as LPs (totally unimodular ⇒ integral BFS); min‑cut dual to max‑flow in certain formulations.
- Algorithms. Simplex (often fast, worst‑case exponential); Ellipsoid/Interior‑Point (polynomial time).
String Matching (32.1–32.2)
- Naïve. Slide pattern \(P\) over text \(T\); worst \(O((n-m+1)m)\).
- Rabin–Karp. Rolling hash with base \(d\) and modulus \(q\):
\(H_{i+1}=(d(H_i - T_i d^{m-1}) + T_{i+m}) \bmod q\). Verify on hash match.
Expected \(O(n+m)\); worst \(O(nm)\) (adversarial collisions).
- Use cases. Multiple pattern search with sum of \(m\) over patterns; plagiarism detection; fast filters.
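A Rabin–Karp sketch using the rolling-hash update above (base \(d\) and prime modulus \(q\) here are illustrative choices), verifying each hash match against the text:

```python
# Rabin-Karp: compare window hashes, verify candidates character-by-character.
# Expected O(n+m); worst O(nm) under adversarial hash collisions.
def rabin_karp(text, pattern, d=256, q=1_000_003):
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    h = pow(d, m - 1, q)                 # d^(m-1) mod q, used when rolling
    hp = ht = 0
    for i in range(m):                   # initial hashes of pattern and window
        hp = (d * hp + ord(pattern[i])) % q
        ht = (d * ht + ord(text[i])) % q
    hits = []
    for i in range(n - m + 1):
        if hp == ht and text[i:i + m] == pattern:  # verify on hash match
            hits.append(i)
        if i < n - m:                    # roll: drop text[i], add text[i+m]
            ht = (d * (ht - ord(text[i]) * h) + ord(text[i + m])) % q
    return hits
```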
NP‑Completeness (34.1–34.5)
- Classes. \(P\): solvable in poly‑time. \(NP\): verifiable in poly‑time. \(coNP\): complements of \(NP\).
- Reductions. \(A \le_p B\): poly‑time map \(f\) with \(x\in A \iff f(x)\in B\). If \(A\le_p B\) and \(B\in P\) ⇒ \(A\in P\).
- NP‑hard / NP‑complete. \(L\) NP‑hard if every \(L'\in NP\) reduces to \(L\). NP‑complete if NP‑hard and \(L\in NP\).
- Canonical NP‑complete problems. SAT → 3SAT → CLIQUE / Vertex‑Cover / Independent‑Set; Hamiltonian‑Cycle; Subset‑Sum; Partition; 3‑Coloring.
- Proof recipe (exam routine).
- Choose known NP‑complete problem \(A\).
- Give poly‑time mapping \(f\) to your target \(B\).
- Prove correctness: \(x\in A \iff f(x)\in B\).
- Conclude \(B\) NP‑complete (and \(B\in NP\)).
- Implications. If any NP‑complete problem is in \(P\), then \(P=NP\). Otherwise unknown.
Exam Patterns & Proof Sketches
- MST uniqueness. Show every cut has unique light edge ⇒ that edge must be in every MST ⇒ uniqueness.
- Dijkstra correctness. Loop invariant: extracted \(u\) has \(d[u]=\delta(s,u)\); proof by contradiction using nonnegativity.
- EK runtime. Distances in residual (in edge count) to \(t\) are nondecreasing; each edge saturates \(O(V)\) times.
- FW recurrence. “Allow intermediates among \(\{1..k\}\)”: either route avoids \(k\) or goes \(i\to k\to j\).
- LP CS conditions. Use to certify optimality: provide primal/dual feasible pair meeting CS.
- NP‑C proof. Always check membership in \(NP\) + reduction direction is from known NP‑C to your problem.
Complexities Quick Table
| Topic | Algorithm | Time | Notes |
|---|---|---|---|
| Quicksort | Randomized | \(\mathbb{E}\,\Theta(n\log n)\) | In-place; worst \(n^2\) |
| Order stat. | Randomized-Select | \(\mathbb{E}\,\Theta(n)\) | Partition-based |
| Order stat. | Median-of-Medians | \(\Theta(n)\) | Worst-case linear |
| DP | LCS | \(O(mn)\) | Backtrack to reconstruct |
| Greedy | Huffman | \(O(n\log n)\) | Min-heap |
| Amortized | Binary counter | \(O(1)\) amort. | Bit flips total \(O(n)\) |
| Union-Find | Rank+Compression | \(O(\alpha(n))\) amort. | Almost constant |
| Graphs | BFS/DFS | \(O(V+E)\) | Adjacency lists |
| MST | Kruskal | \(O(E\log V)\) | Sort + DSU |
| SSSP | Dijkstra | \(O(E\log V)\) | Nonneg. weights |
| APSP | Floyd–Warshall | \(O(V^3)\) | Dense graphs |
| Flow | Edmonds–Karp | \(O(VE^2)\) | BFS augment paths |
| LP | Simplex | (Often fast) | Worst-case exponential |
| Strings | Rabin–Karp | Avg. \(O(n+m)\) | Rolling hash; verify |