Binary Decision Tree: Analysis

  • Justify $\Omega(\lg n)$ is the lower bound for comparison-based searching algorithms.

Here is the schematic representation of a binary decision tree.

Notice the followings:

  • The decision tree is a binary tree (the answer to the comparison operation is either "yes" or "no").

  • Internal nodes are binary decisions (corresponding to comparisons).

  • External nodes (leaves) are resulting outputs (answers).

  • An execution of the algorithm corresponds to a path from the root to a leaf.

  • The length of such an execution path is the running time of the algorithm for the case of the input that led to that execution.

  • The height of the tree is the worst case running time of the algorithm (i.e. the longest execution path).

Since the decision tree is a binary tree, its height is at least $\lg N$ where $N$ is the number of nodes (which is not necessarily the same as e.g. the length of the input sequence). Therefore, the lower bound on any comparison-based algorithm modeled by this decision tree is $\Omega(\lg N)$.

But what is $N$?

For searching, $N$ is $\Omicron(n)$ where $n$ is the number of elements in the search space (size of the collection).

Why?

For lower bound analysis, we consider what is the largest number of nodes we can pack in a binary tree of the minimum height. This is the case in a perfect binary tree. Recall: in such a binary tree with $m$ nodes, there are $\left \lceil m/2 \right \rceil$ leaves and $\left \lfloor m/2 \right \rfloor$ non-leaf nodes (internal nodes and a root).

In the decision tree corresponding to the comparison-based searching algorithm, the number of leaves is greater or equal to the number of possible answers which is greater or equal to $n$, the number of elements.

To check this claim, see the previous exercise, the number of possible outcomes is $4$, one per $a_i$ plus null.

So, if there are at least $n$ leaves, there are at least $2n$ nodes. The height of the tree, therefore, is at least $\lg (2n)$.

Therefore, the lower bound on any comparison-based searching algorithm is $\Omega(\lg n)$ where $n$ is the size of the search space. In other words, binary search is optimal!