1. The incidence matrix of a directed graph \(G=(V,E)\) with no self-loops is a \(|V|\times|E|\) matrix \(B=(b_{ij})\) such that \[ b_{ij} = \begin{cases} -1 & \text{if edge } j \text{ leaves vertex } i,\\ +1 & \text{if edge } j \text{ enters vertex } i,\\ 0 & \text{otherwise.} \end{cases} \] Describe what the entries of the matrix product \(B B^{\mathsf T}\) represent, where \(B^{\mathsf T}\) is the transpose of \(B\).

The \((i,j)\)-entry of the product is \[ (BB^{\!T})_{ij}\;=\;\sum_{e\in E}b_{ie}\,b_{je}. \]

If \(i=j\):

\[ (BB^{\!T})_{ii} \;=\;\sum_{e\in E} b_{ie}^2. \] Every edge incident on \(v_i\) contributes \(1\) to this sum; thus the diagonal entry equals the total degree of vertex \(v_i\).

If \(i\neq j\):

Each edge \(e\) connecting \(v_i\) and \(v_j\) contributes \(-1\) (its two entries in column \(e\) are \(-1\) and \(+1\)); all other edges contribute \(0\). Hence \[ (BB^{\!T})_{ij} \;=\;-\,m_{ij}, \] where \(m_{ij}\) is the number of edges between \(v_i\) and \(v_j\); for a simple graph this entry is \(-1\) if they are adjacent and \(0\) otherwise.

Altogether, \[ (BB^{\!T})_{ij} \;=\; \begin{cases} \text{degree of } v_i \;(\text{in-degree} + \text{out-degree}) & \text{if } i=j,\\[6pt] -\,(\text{\# edges between }v_i\text{ and }v_j) & \text{if } i\neq j. \end{cases} \]
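As a sanity check, here is a small Python sketch (graph and names are my own) that builds \(B\) for a three-vertex digraph and forms \(BB^{\mathsf T}\) directly:

```python
# Directed graph on vertices 0..2 with edges given as (tail, head).
edges = [(0, 1), (1, 2), (0, 2)]
n = 3

# Incidence matrix B: -1 where the edge leaves, +1 where it enters.
B = [[0] * len(edges) for _ in range(n)]
for e, (u, v) in enumerate(edges):
    B[u][e] = -1
    B[v][e] = +1

# L = B·Bᵀ computed entry by entry.
L = [[sum(B[i][e] * B[j][e] for e in range(len(edges))) for j in range(n)]
     for i in range(n)]

# Diagonal: total degree (in + out); off-diagonal: -(# edges between i and j).
print(L)  # [[2, -1, -1], [-1, 2, -1], [-1, -1, 2]]
```

Every vertex here touches two edges, and each pair of vertices is joined by one edge, matching the formula above.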

2. Show how to use a depth-first search of an undirected graph \(G\) to identify the connected components of \(G\), so that the depth-first forest contains as many trees as \(G\) has connected components. More precisely, show how to modify depth-first search so that it assigns to each vertex \(v\) an integer label \(v.\mathrm{cc}\) between \(1\) and \(k\), where \(k\) is the number of connected components of \(G\), such that \(u.\mathrm{cc} = v.\mathrm{cc}\) iff \(u\) and \(v\) belong to the same connected component.

Algorithm

DFS-CC(G)
  for each vertex v ∈ V[G] do
      v.color ← WHITE   
      v.π ← NIL
  cc ← 0                       
  for each vertex v ∈ V[G] do
      if v.color = WHITE then
          cc ← cc + 1          
          DFS-VISIT-CC(G, v, cc)

DFS-VISIT-CC(G, v, cc)
  v.color ← GRAY
  v.cc    ← cc                
  for each (v, u) ∈ E[G] do
      if u.color = WHITE then
          u.π ← v
          DFS-VISIT-CC(G, u, cc)
  v.color ← BLACK

This modified DFS passes the current component identifier cc down the recursion and assigns it to every vertex reached during that call. Each time the outer loop in DFS-CC encounters a previously unseen vertex, it increments cc, making that vertex the root of a new DFS tree corresponding to a different connected component.

The running time is the same as ordinary DFS: \[ \Theta(|V|+|E|). \]
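A minimal runnable sketch of DFS-CC in Python (iterative, to avoid recursion limits on large components; the representation and names are mine):

```python
def connected_components(adj):
    """adj: dict vertex -> list of neighbours (undirected graph).
    Returns a dict mapping each vertex to its component label 1..k."""
    cc_of = {}
    cc = 0
    for s in adj:
        if s in cc_of:          # already labelled: not a new component
            continue
        cc += 1                 # new DFS tree = new connected component
        stack = [s]
        cc_of[s] = cc
        while stack:
            v = stack.pop()
            for u in adj[v]:
                if u not in cc_of:
                    cc_of[u] = cc
                    stack.append(u)
    return cc_of

adj = {1: [2], 2: [1], 3: [4], 4: [3], 5: []}
labels = connected_components(adj)
print(labels[1] == labels[2], labels[3] == labels[4], labels[1] != labels[5])
# True True True
```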

3. Show that a graph has a unique minimum spanning tree if, for every cut of the graph, there is a unique light edge crossing the cut. Show that the converse is not true by giving a counterexample.

If every cut in a connected weighted graph has exactly one lightest edge, then the minimum spanning tree is forced to be unique:
assume two different MSTs \(T_1\) and \(T_2\) exist, and choose an edge \(e\) that lies in \(T_1\) but not in \(T_2\). Deleting \(e\) splits \(T_1\) into two parts, forming a cut. Since \(T_1\) is an MST, \(e\) must be the unique light edge across this cut (otherwise exchanging \(e\) for the lighter crossing edge would give a spanning tree cheaper than \(T_1\)). Adding \(e\) to \(T_2\) creates a cycle, and that cycle must contain some other edge \(f\) crossing the same cut; because \(e\) is the sole lightest edge there, \(w(f) > w(e)\), so \(T_2 - f + e\) is a spanning tree lighter than \(T_2\). This is a contradiction, so only one MST can exist.

A counter-example.

A triangle on vertices \(A, B, C\) with
\(w(AC)=w(BC)=1,\;w(AB)=2\).
The only MST is \(\{AC,\,BC\}\) with total weight \(1+1=2\). Any other spanning tree must include the heavier edge \(AB\), giving weight \(\ge 3\). Consider the cut \((\{C\},\{A,B\})\). Both edges \(AC\) and \(BC\) cross it, each with minimum weight \(1\). Hence the cut’s light edge is not unique.

Thus uniqueness of the MST does not force every cut to have a single light edge, proving that the converse implication fails.
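The counterexample can be checked by brute force; the enumeration helpers below are my own:

```python
from itertools import combinations

# The triangle: w(AB)=2, w(AC)=w(BC)=1.
edges = {('A', 'B'): 2, ('A', 'C'): 1, ('B', 'C'): 1}
vertices = {'A', 'B', 'C'}

def spans(tree):
    # A 2-edge subset of a triangle is a spanning tree iff it touches all vertices.
    return {v for e in tree for v in e} == vertices

# Enumerate all spanning trees (|V|-1 = 2 edges) and keep the minimum-weight ones.
trees = [t for t in combinations(edges, 2) if spans(t)]
best = min(sum(edges[e] for e in t) for t in trees)
msts = [t for t in trees if sum(edges[e] for e in t) == best]
print(len(msts))  # 1  -> the MST {AC, BC} is unique

# Light edges crossing the cut ({C}, {A, B}): every edge with endpoint C crosses.
crossing = [e for e in edges if 'C' in e]
wmin = min(edges[e] for e in crossing)
light = [e for e in crossing if edges[e] == wmin]
print(len(light))  # 2  -> the light edge across this cut is not unique
```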

4. Given a weighted, directed graph \(G=(V,E)\) with no negative-weight cycles, let \(m\) be the maximum over all vertices \(v\in V\) of the minimum number of edges in a shortest path from the source \(s\) to \(v\) (shortest by total weight, not by number of edges). Suggest a simple change to the Bellman–Ford algorithm that allows it to terminate in \(m+1\) passes, even if \(m\) is not known in advance.

Pseudocode modification

EARLY-STOP-BELLMAN-FORD(G, w, s)
  for each vertex v ∈ V[G]          
      d[v] ← +∞
  d[s] ← 0

  repeat
      relaxed ← false              
      for each edge (u, v) ∈ E[G]   
          if d[u] + w(u,v) < d[v] then
              d[v] ← d[u] + w(u,v)
              relaxed ← true
  until relaxed = false
After the \(k^{\text{th}}\) pass the labels are correct for all vertices whose shortest path uses at most \(k\) edges. Consequently, after \(m\) passes all vertices have their final distances; during pass \(m+1\) the inner loop never updates any label, so relaxed remains false. The repeat–until loop therefore exits right after completing pass \(m+1\).

The algorithm performs exactly \(m+1\) full edge scans even though \(m\) is not known in advance, and returns the same distances as the standard \((|V|-1)\)-pass Bellman–Ford procedure.
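A runnable sketch of EARLY-STOP-BELLMAN-FORD (the edge-list representation is my own):

```python
import math

def early_stop_bellman_ford(n, edges, s):
    """n vertices 0..n-1, edges = [(u, v, w)], source s. Returns distances."""
    d = [math.inf] * n
    d[s] = 0
    while True:
        relaxed = False
        for u, v, w in edges:        # one full relaxation pass
            if d[u] + w < d[v]:
                d[v] = d[u] + w
                relaxed = True
        if not relaxed:              # the (m+1)-th pass changes nothing: stop
            return d

edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 1)]
print(early_stop_bellman_ford(4, edges, 0))  # [0, 3, 1, 4]
```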

5. Consider a directed graph \(G=(V,E)\) in which each edge \((u,v)\in E\) has an associated value \(r(u,v)\in[0,1]\) representing the reliability (probability the channel from \(u\) to \(v\) does not fail). Assume independence. Give an efficient algorithm to find the most reliable path between two given vertices.

Turn the product into a sum using logs. Set edge weight \(w(u,v) = -\ln r(u,v)\). Then the most reliable path maximizes \(\prod r\), which is equivalent to minimizing \(\sum w\). Since \(w \ge 0\), Dijkstra's algorithm applies.

Algorithm

Algorithm LogDijkstra(G=(V,E), r, s, t):
    for each edge (u, v) in E do
        if r(u,v) == 0 then
            w(u,v) ← +∞          
        else
            w(u,v) ← -ln(r(u,v))

    for each v in V do
        dist[v] ← +∞
        parent[v] ← NIL
    dist[s] ← 0

    Q ← priority queue keyed by dist[*]
    insert all vertices into Q

    while Q is not empty do
        u ← Extract-Min(Q)
        if u = t then break
        for each edge (u, v) in outgoing(u) do
            if dist[u] + w(u,v) < dist[v] then
                dist[v] ← dist[u] + w(u,v)
                parent[v] ← u
                Decrease-Key(Q, v)

    P ← PathFromParents(parent, s, t)
    reliability ← exp( - dist[t] )  
    return (P, reliability)
    

Complexity

With a binary heap, Dijkstra runs in \[ O\big((|V|+|E|)\log |V|\big). \]
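A compact Python version of LogDijkstra using `heapq` (this is a lazy-deletion variant of the pseudocode above; the graph representation and names are mine):

```python
import heapq, math

def most_reliable_path(adj, s, t):
    """adj: dict u -> list of (v, r) with r in [0,1]. Returns (path, reliability)."""
    dist = {u: math.inf for u in adj}
    parent = {u: None for u in adj}
    dist[s] = 0.0
    pq = [(0.0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue                  # stale heap entry: skip
        if u == t:
            break
        for v, r in adj[u]:
            w = math.inf if r == 0 else -math.log(r)
            if d + w < dist[v]:
                dist[v] = d + w
                parent[v] = u
                heapq.heappush(pq, (dist[v], v))
    path, v = [], t
    while v is not None:              # walk the parent pointers back to s
        path.append(v)
        v = parent[v]
    return path[::-1], math.exp(-dist[t])

adj = {'s': [('a', 0.9), ('b', 0.5)], 'a': [('t', 0.8)], 'b': [('t', 0.9)], 't': []}
path, rel = most_reliable_path(adj, 's', 't')
print(path, round(rel, 2))  # ['s', 'a', 't'] 0.72
```

The path s→a→t has reliability \(0.9 \times 0.8 = 0.72\), beating s→b→t with \(0.5 \times 0.9 = 0.45\).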

6. Show how to express the single-source shortest-paths problem as a product of matrices and a vector. Describe how evaluating this product corresponds to a Bellman–Ford-like algorithm.

Represent the graph’s edge weights in a matrix \(A\) and work in the min-plus semiring, where addition is \(\min\) and multiplication is ordinary \(+\), so that a matrix–vector product \[ (d \otimes A)_j = \min_{i}\!\bigl(d_i + w(i,j)\bigr) \] performs a full relaxation pass over all incoming edges of every vertex. Beginning with the distance vector \(d^{(0)}\) (0 at the source, \(\infty\) elsewhere) and multiplying by \(A\) exactly \(n-1\) times produces vectors \(d^{(k)}\) that encode the best paths of at most \(k\) edges; after the final pass, \(d^{(n-1)}\) equals the Bellman–Ford result because no simple path in an \(n\)-vertex graph has more than \(n-1\) edges. Hence the single-source shortest-paths solution is \[ d = d^{(0)} \otimes A^{\otimes (n-1)}, \] showing that Bellman–Ford is repeated min-plus matrix–vector multiplication.
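A small demonstration (helper names mine) that repeated min-plus products reproduce Bellman–Ford distances; note the zero diagonal, which lets a vertex keep its current label:

```python
INF = float('inf')

def minplus(d, A):
    n = len(A)
    # (d ⊗ A)_j = min_i (d_i + A[i][j]): one full relaxation pass.
    return [min(d[i] + A[i][j] for i in range(n)) for j in range(n)]

# A[i][j] = w(i, j); 0 on the diagonal so min_i includes "keep d_j as is".
A = [[0,   4,   1,   INF],
     [INF, 0,   INF, 1],
     [INF, 2,   0,   INF],
     [INF, INF, INF, 0]]

d = [0, INF, INF, INF]          # d^(0): 0 at source 0, ∞ elsewhere
for _ in range(len(A) - 1):     # n-1 min-plus multiplications
    d = minplus(d, A)
print(d)  # [0, 3, 1, 4]
```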

7. The Floyd–Warshall algorithm appears to require \(\Theta(n^3)\) space, since it creates \(d_{ij}^{(k)}\) for \(i,j,k=1,2,\dots,n\). Show that the procedure \(\text{Floyd–Warshall}'\), which simply drops all the superscripts, is correct, and thus only \(\Theta(n^2)\) space is required.

FLOYD-WARSHALL'(W, n)
  D ← W
  for k ← 1 to n
      for i ← 1 to n
          for j ← 1 to n
              D[i,j] ← min( D[i,j], D[i,k] + D[k,j] )
  return D
This variant of the Floyd–Warshall algorithm is correct while storing all distances in a single \(n \times n\) matrix \(D\). Initially \(D = W\), the edge-weight matrix, so \(D_{ij}\) is the weight of the direct edge \(i \rightarrow j\), or \(\infty\) if none exists.
During the outer loop, for each \(k = 1,\dots,n\) we update every pair \((i,j)\) with D[i,j] ← min(D[i,j], D[i,k] + D[k,j]). A loop invariant shows correctness: after finishing the pass for a given \(k\), the entry \(D_{ij}\) equals the weight of a shortest path from \(i\) to \(j\) whose intermediate vertices are drawn only from the set \(\{1,\dots,k\}\). Dropping the superscripts is safe because pass \(k\) never changes the entries it reads: a shortest path through \(k\) gains nothing from visiting \(k\) twice, so \(d_{ik}^{(k)} = d_{ik}^{(k-1)}\) and \(d_{kj}^{(k)} = d_{kj}^{(k-1)}\), and it does not matter whether the old or the new values are used.
The invariant holds initially and is preserved because the assignment considers precisely the best new route that uses vertex \(k\). When \(k = n\) every vertex is allowed, so \(D_{ij}\) is the true all-pairs shortest-path distance.
The three nested loops still perform \(\Theta(n^{3})\) work, but the memory use is \(\Theta(n^{2})\) instead of the original \(\Theta(n^{3})\).
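The in-place variant translates directly into runnable Python (the matrix representation is my own):

```python
INF = float('inf')

def floyd_warshall(W):
    n = len(W)
    D = [row[:] for row in W]           # Θ(n²) space: one matrix, copied from W
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Relax through intermediate vertex k, in place.
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

W = [[0,   3,   INF],
     [INF, 0,   1],
     [2,   INF, 0]]
print(floyd_warshall(W))  # [[0, 3, 4], [3, 0, 1], [2, 5, 0]]
```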

8. The edge connectivity of an undirected graph is the minimum number \(k\) of edges that must be removed to disconnect the graph. For example, the edge connectivity of a tree is \(1\), and that of a cyclic chain of vertices is \(2\). Show how to determine the edge connectivity of an undirected graph \(G=(V,E)\) by running a maximum-flow algorithm on at most \(|V|\) flow networks, each having \(O(V+E)\) vertices and \(O(E)\) edges.

The edge connectivity of an undirected graph \(G=(V,E)\) is the smallest number \(k\) of edges whose removal disconnects the graph. We can compute \(k\) using at most \(|V|\) runs of a maximum-flow algorithm.

Start with the undirected graph \(G=(V,E)\). For every undirected edge \(\{u,v\}\in E\), add two directed arcs \((u,v)\) and \((v,u)\), each with capacity \(1\). The resulting network has exactly \(|V|\) vertices and \(2|E|\) arcs, which is within the required \(O(V+E)\) size.

Algorithm

EdgeConnectivity(G):
    Build the capacitated network N as above
    pick an arbitrary vertex s
    λ ← +∞
    for each vertex t ≠ s:
         run Max-Flow(N, s, t)          
         λ ← min(λ, value returned)
         if λ == 1: break               
    return λ

This runs at most \(|V|-1\) max-flow computations, each on a network with \(O(V+E)\) vertices, \(O(E)\) arcs, and unit capacities. Correctness follows because a minimum edge cut separates \(s\) from at least one vertex \(t\), and by the max-flow min-cut theorem the run for that \(t\) returns exactly the size of this cut.
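A runnable sketch of EdgeConnectivity, using Edmonds–Karp as the max-flow routine (my choice; all names are mine and the code targets small graphs):

```python
from collections import deque

def max_flow(cap, s, t):
    """cap: dict u -> dict v -> residual capacity; unit-capacity network."""
    flow = 0
    while True:
        # BFS for an augmenting path in the residual network.
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        v = t                         # unit capacities: each path adds 1
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= 1
            cap[v][u] += 1
            v = u
        flow += 1

def edge_connectivity(vertices, edges):
    s = vertices[0]                   # fixed source; vary only the sink
    best = float('inf')
    for t in vertices[1:]:
        cap = {v: {} for v in vertices}
        for u, v in edges:            # each undirected edge -> two unit arcs
            cap[u][v] = cap[u].get(v, 0) + 1
            cap[v][u] = cap[v].get(u, 0) + 1
        best = min(best, max_flow(cap, s, t))
    return best

# A 4-cycle: removing any 2 edges disconnects it.
print(edge_connectivity([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0)]))  # 2
```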

9. Suppose pattern \(P\) and text \(T\) are randomly chosen strings of length \(m\) and \(n\), respectively, from the \(d\)-ary alphabet \(\Sigma_d=\{0,1,\dots,d-1\}\) with \(d\ge 2\). Show that the expected number of character-to-character comparisons made by the implicit loop in line 2 of the naive algorithm is \[ (n-m+1)\,\frac{1 - d^{-m}}{1 - d^{-1}} \;\le\; 2(n-m+1) \] over all executions of this loop. (Assume the naive algorithm stops comparing characters for a given shift once it finds a mismatch or matches the entire pattern.) Thus, for randomly chosen strings, the naive algorithm is quite efficient.

Let the probability of matching a single character be \(p=\tfrac1d\).

During one shift the algorithm compares

\[ X = \min\bigl\{\,k\in\{1,\dots,m\}\mid P_k\ne T_{s+k}\bigr\} \;\; \text{or }X=m\text{ if all match}. \]

The expected value is

\[ \begin{aligned} \mathbb{E}[X] &= \sum_{k=1}^{m}\Pr(X\ge k) \;=\; \sum_{k=1}^{m}\Pr\!\bigl(\text{first }k-1\text{ characters match}\bigr) \\[4pt] &= \sum_{k=1}^{m} p^{\,k-1} \;=\; \frac{1-p^{\,m}}{1-p}. \end{aligned} \]

There are \(n-m+1\) shifts, so by linearity of expectation (which needs no independence between shifts)

\[ \; \mathbb{E}[\text{total}] \;=\; (n-m+1)\, \frac{1-d^{-m}}{1-d^{-1}} \; \] as claimed.

Because \(d\ge 2\), we have \(1-d^{-1}\ge \tfrac12\) and \(0<1-d^{-m}\le 1\). Hence

\[ \frac{1-d^{-m}}{1-d^{-1}} \;\le\; \frac{1}{\tfrac12} \;=\;2, \] so \[ \; \mathbb{E}[\text{total}] \;\le\; 2\,(n-m+1). \; \]

On average the naive string-matching algorithm performs at most two character comparisons per shift when pattern and text are random: the expected time is \(O(n-m+1)\), while the worst case remains \(O(m(n-m+1))\).
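For tiny parameters the expectation can be verified exhaustively; the parameter choices below are mine:

```python
from itertools import product
from fractions import Fraction

d, m, n = 2, 2, 4

def comparisons(P, T):
    """Comparison count of the naive matcher (stop at first mismatch)."""
    total = 0
    for s in range(n - m + 1):
        for k in range(m):
            total += 1                   # one character comparison
            if P[k] != T[s + k]:
                break
    return total

# Average over every pattern/text pair (exact, via Fractions).
grand = sum(comparisons(P, T)
            for P in product(range(d), repeat=m)
            for T in product(range(d), repeat=n))
avg = Fraction(grand, d ** m * d ** n)
formula = (n - m + 1) * (1 - Fraction(1, d) ** m) / (1 - Fraction(1, d))
print(avg == formula, float(avg))  # True 4.5
```

With \(d=2, m=2, n=4\) both the enumeration and the formula give \(3 \cdot \tfrac{3/4}{1/2} = 4.5 \le 2(n-m+1) = 6\).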

10. Show how to extend the Rabin–Karp method to handle the problem of looking for a given \(m\times m\) pattern in an \(n\times n\) array of characters. (The pattern may be shifted vertically and horizontally, but not rotated.)

Compute a hash for every horizontal window of each row, just as in one-dimensional Rabin–Karp; then treat each vertical column of row hashes as a string of characters and hash it again.

Horizontal hashes.
For every row \(i\;(0\le i<m)\) of the pattern compute \[ r_i=\sum_{j=0}^{m-1}P[i,j]\;b^{\,m-1-j}\pmod q, \] where \(b\) is a base and \(q\) a modulus.
Vertical hash.
Treat \((r_0,r_1,\dots,r_{m-1})\) as characters and hash once more with a second base \(a\): \[ H_P=\sum_{i=0}^{m-1}r_i\,a^{\,m-1-i}\pmod q. \]

  Row

powB ← b^{m-1} mod q
for i = 0..n-1:
    RowHash[i][0] ← Σ_{j=0}^{m-1}  T[i,j] · b^{m-1-j}  (mod q)
    for j = 1..n-m:
        RowHash[i][j] ← ( RowHash[i][j-1]           –
                           T[i,j-1] · powB ) · b  +
                           T[i,j+m-1]              (mod q)

Time: \(O(n^2)\)

  Vertical

matches ← ∅
powA ← a^{m-1} mod q
for col = 0 .. n-m:
    H ← Σ_{i=0}^{m-1} RowHash[i][col] · a^{m-1-i}  (mod q)
    for row = 0 .. n-m:
        if H = H_P:
            verify window (row..row+m-1, col..col+m-1)
        if row < n-m:
            H ← ( H                          –
                  RowHash[row][col] · powA ) · a  +
                  RowHash[row+m][col]         (mod q)

Each update is \(O(1)\).

  Complexity

The expected total complexity is \(O(n^2)\); each hash match costs an extra \(O(m^2)\) verification, so the worst case is \(O(n^2 m^2)\).
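The two-level hashing above can be sketched as runnable Python (the bases, modulus, and names are my choices; hash hits are confirmed by direct comparison, so collisions cannot cause false matches):

```python
B, A, Q = 256, 1_000_003, 2_147_483_647   # horizontal base, vertical base, modulus

def window_hashes(row, m):
    """Rolling hashes of every length-m window of the string `row`."""
    powB = pow(B, m - 1, Q)
    h = 0
    for j in range(m):
        h = (h * B + ord(row[j])) % Q
    out = [h]
    for j in range(1, len(row) - m + 1):
        h = ((h - ord(row[j - 1]) * powB) * B + ord(row[j + m - 1])) % Q
        out.append(h)
    return out

def find_2d(T, P):
    """All (row, col) positions where the m×m pattern P occurs in the n×n text T."""
    n, m = len(T), len(P)
    powA = pow(A, m - 1, Q)
    # Pattern: one horizontal hash per row, then hash that column of hashes.
    HP = 0
    for i in range(m):
        HP = (HP * A + window_hashes(P[i], m)[0]) % Q
    rh = [window_hashes(T[i], m) for i in range(n)]   # rh[i][col]
    matches = []
    for col in range(n - m + 1):
        H = 0
        for i in range(m):
            H = (H * A + rh[i][col]) % Q
        for row in range(n - m + 1):
            if H == HP and all(T[row + i][col:col + m] == P[i] for i in range(m)):
                matches.append((row, col))            # verified hit
            if row < n - m:                           # roll the window down
                H = ((H - rh[row][col] * powA) * A + rh[row + m][col]) % Q
    return matches

T = ["abab", "cdcd", "abab", "cdcd"]
P = ["ab", "cd"]
print(find_2d(T, P))  # [(0, 0), (2, 0), (0, 2), (2, 2)]
```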