Shortest Path: Unweighted Graph
- Formally describe the shortest path problem.
- Recognize that (unweighted) shortest path algorithm is a modified BFS.
Let us formally define the shortest path problem:
Shortest Path Problem
Input: Graph $G=(V, E)$, and a starting vertex $s \in V$.
Goal: Identify a shortest path from $s$ to every vertex in $G$.
Note we are considering the length (i.e., the number of edges) of a path here. In other words, "shortest path" means "shortest length." (We will soon contrast this with the case where shortest path means shortest cost.)
Lemma: Let $G$ be a directed or undirected graph. At the conclusion of the modified BFS, for every vertex $v \in V$, the value distance[v]
equals the length $\text{dist}(s, v)$ of a shortest path from $s$ to $v$ in $G$ (or $+ \infty$ if no such path exists).
This is not the case for DFS!
Why does BFS find the shortest path?
Okay! Let's think about this problem for a moment. We have a source vertex $s$ and a target vertex $t$. We want a shortest path from $s$ to $t$ (if exists). What can we do? Well, we can look at all the vertices that are one edge away from $s$; if we find $t$, we are done. If not, we can look at all vertices that are two edges away from $s$, then those that are three edges away from $s$, and so on until we find $t$. This strategy is what BFS is doing! It explores all (reachable) vertices that are $k$ edges away from the source vertex $s$ before visiting any vertex $k+1$ edges away. So, if there are multiple paths to $t$, and one returned by BFS has $d$ edges, that path is the shortest. If there was a shorter path with, e.g., length of $d-1$, then $t$ would have been explored before exploring vertices that are $d$ edges away from the source.
For a formal proof, refer to CLRS, Third Edition, Lemma 22.2 on page 598.
Resources
- Wikipedia's entry on Shortest Path Problem.