9.4.5 A Balancing Act

The "simple" algorithms for growing binary search trees and inserting and deleting records within them are relatively simple and have excellent time and storage requirements?/FONT>on the average?/FONT>when randomly done. These randomly grown trees can be fine for compiler symbol tables or for small data base applications such as dictionaries, but there are situations where good average behavior is not enough. Suppose your rocket ship's computer must search a binary search tree for retro-rocket firing instructions and must complete the search within a specific interval. This is not the time to find that the tree is out of shape and that a much longer than average time will ensue.

One answer is the AVL tree, which is discussed in this section. An AVL tree is a balanced binary tree that is also a binary search tree. (Balanced binary trees were defined in Chapter 7 in the discussion of Fibonacci trees.)

AVL trees have excellent worst-case time and storage requirements and allow access to the ith record to be performed quickly. The algorithms for the growth of such trees and insertion and deletion within them are considerably more complex than for "simple" binary search trees. Balanced trees are very useful whenever all four operations?/FONT>searching, inserting, deleting, and accessing the ith record?/FONT>are needed and worst-case rather than average time is important.

Because we want to use AVL trees to store records, it is desirable for an AVL tree to have as small a depth as possible and as many nodes (in which to store records) as possible for that depth. The "best" AVL tree of depth d is thus complete and has 2d - 1 nodes. A balanced binary search tree (AVL tree) of depth d, which has the least number of nodes among all AVL trees of depth d, represents the "worst" AVL tree of depth d. Such a tree will be a Fibonacci tree that is also a binary search tree. These trees represent the worst-case AVL trees. A Fibonacci AVL tree of depth d will store the minimum possible number of records.

Suppose nd represents the number of nodes of such a tree. Then any AVL tree with depth d must have a number of nodes n that is at least nd. It is possible to prove that nd is related to the (d + 2)th Fibonacci number, Fd+2. In fact, nd = Fd+2 - 1. A great deal is known about the Fibonacci numbers. In particular, , where . It follows from this that . Also, from Chapter 7, n ?/FONT> 2d - 1. After some manipulations we may conclude that

d < 1.44 1g (n + 2) - 0.328

This means that the depth of any AVL tree with n nodes is never greater than 1.44 1g (n + 2). In other words, an AVL tree storing n records has a worst-case depth O(lg n). Since the best possible depth of a binary tree storing n records is ?/FONT>lg (n + 1)?/FONT>, the worst depth of an AVL tree storing n records is at most 44 percent more than the best achievable. Contrast this with the "simple" binary search tree for which the average depth is at least 39 percent more than the best achievable; it is clearly very desirable to be able to create and use AVL trees.