Kruskal's Algorithm: Analysis
- Compare various approaches to checking for cycles and the resulting time/space tradeoffs between them for Kruskal's Algorithm.
Exercise Based on your understanding of Kruskal's algorithm, how can we efficiently implement the step which involves finding the next min-weight edge in $G$?
Solution
- Keep a sorted array of edges. Keep an index to the next position (edge).
- Keep edges in a (min-heap) priority queue.
With an optimal sorting algorithm (to sort edges of the input graph by increasing weight), both approaches are $\Omicron(M \lg M)$ runtime.
We would spend $\Omicron(M \lg M)$ to sort the edges and then get the next edge in $\Omicron(1)$ time. Whereas, we can build the PriorityQueue in $\Omicron(M)$ time and remove the next "best" edge in $\Omicron(\lg M)$. We would have to do the "remove" $\Omicron(M)$ times because some edges may have to be disregarded (they cause cycle).
Exercise Once the next min-weight edge $(v, w)$ is found, how can we efficiently check if adding it to the MST would create a cycle?
Solution
We cannot check for a cycle by simply checking if the endpoints are already in $T$ (why?). We can run BFS/DFS on $T$, start at $v$ and check if $w$ is reachable.
Exercise Based on your answers to the previous questions, analyze the asymptotic complexity of Kruskal's algorithm.
Solution
Operation | Frequency | Cost per operation |
---|---|---|
build PQ | $1$ | $\Omicron(M)$ |
extract min | $\Omicron(M)$ | $\Omicron(\lg M)$ |
run BFS/DFS | $\Omicron(M)$ | $\Omicron(N+M)$ |
From the table, it can be seen that Kruskal's algorithm is quadratic. However, we can improve the performance by using another data structure called Union-Find for efficiently checking/preventing cycles. We will explore Union-Find in the next chapter!