# CHAPTER 10: BINARY TREES

Linked-lists are examples of linear non-contiguous data structures, where movement takes place in a single dimension (forward and backward). In this chapter, we'll discuss a form of non-linear, non-contiguous data structures known as trees, which add a second up/down dimension. In particular, we'll explain special types of trees known as binary trees, which are useful in search operations.

## WHAT ARE TREES?

There is a directionality to the links of a tree. In the strictest definition of a tree, the links point down the tree, but never up. In addition, a tree cannot contain cycles. That is, no path originating from a node can return to that node. Figure 10.1(a) shows a set of nodes that have the properties of trees. Figure 10.1(b) shows a set of nodes that do not constitute a tree. That's because node d has two parents, and two cycles are present (a-b-d and a-c-d).

#### (b) Not a tree

Every tree has a height, which is defined as the maximum number of nodes that must be visited on any path from the root to a leaf. For instance, the tree in Figure 10.1(a) has a height of 3. In turn, the depth of a node is the number of nodes along the path from the root to that node. For instance, node c in Figure 10.1(a) has a depth of 1.

Trees provide a natural way to represent the hierarchical relationship of data. For example, Figure 10.2 shows a tree representing the hierarchical organization of a company. Figure 10.3 shows an example of a tree representing the expression a*b+c. Here, the operators * and + have an order of precedence, where + binds looser than *.

Note In many of the figures in this chapter--including Figures 10.2 and 10.3--the direction of the arrows is implied. Unless otherwise shown, the direction is always from the root downward.

## Binary Trees

In general, tree nodes can have any number of children. In a binary tree, the nodes have no more than two children. Figures 10.1(a) and 10.3 are examples of binary trees. Figure 10.2 isn't a binary tree because several of its nodes have three children. A node in a binary tree isn't required to have two children. For example, node b in Figure 10.1(a) has only one child. Of course, the leaves of a binary tree have no children.

Binary trees are easy to implement because they have a small, fixed number of child links. Because of this characteristic, binary trees are the most common type of trees and form the basis of many important data structures.

## Binary Tree Node Classes

As was true for the linked-list nodes described in Chapter 8, it's useful to define an abstract binary tree node class that doesn't store data, and then derive, using templates, a family of node classes holding different types of data. The following Bnodeb and Bnode classes use this technique:

`struct Bnodeb  {  // An abstract binary tree node`

`Bnodeb *left;  // Pointer to left child`

`Bnodeb *right; // Pointer to right child`

`};`

`template<class TYPE>`

`struct Bnode : public Bnodeb { // Binary tree node with data`

`TYPE info;`

`Bnode();`

`Bnode(const TYPE &x, Bnode<TYPE> *l=0, Bnode<TYPE> *r=0);`

`Bnode<TYPE> *&Left();`

`Bnode<TYPE> *&Right();`

`} ;`

`template<class TYPE>`

`Bnode<TYPE>::Bnode()`

`{`

`left = 0; right = 0;`

`}`

`template<class TYPE>`

`Bnode<TYPE>::Bnode(const TYPE &x, Bnode<TYPE> *l, Bnode<TYPE> *r)`

`: info(x)`

`{`

`left = l; right = r;`

`}`

`template<class TYPE>`

`Bnode<TYPE> *&Bnode<TYPE>::Left()`

`{`

`return (Bnode<TYPE> *)left;`

`}`

`template<class TYPE>`

`Bnode<TYPE> *&Bnode<TYPE>::Right()`

`{`

`return (Bnode<TYPE> *)right;`

`}`

The Bnodeb and Bnode class definitions are given on disk in the files bnodeb.h and bnode.h.

Many of the binary tree operations are data independent. That's the reason for the Bnodeb class. We can write functions that take Bnodeb pointers as parameters and share these functions across all binary trees, regardless of the type of data actually stored in the nodes. This tames the source code explosion that might otherwise result from the use of templates.

We do pay a price for allowing data independence. When we need to use nodes with data, typecasting problems occur. For instance, the following code fragment will not compile without the typecast shown:

`Bnode<int> *t;`

`...`

`t = (Bnode<int> *)(t->left); // Base to derived typecast`

Presumably, all nodes in the tree t store integer data. However, left is typed as a pointer to Bnodeb object, so we need to typecast the expression t->left to be a Bnode<int> pointer. You've seen a similar problem with the linked list nodes described in Chapter 8. The solution is the same: provide convenient functions that hide the typecasting. That's the purpose of Left( ) and Right( ). We can rewrite our code fragment as:

`Bnode<int> *t;`

`...`

`t = t->Left();`

Note carefully that Left( ) and Right( ) actually return references to pointers. This approach allows us to use these functions on the left side of assignments. For instance:

`Bnode<int> *t, *p;`

`...`

`t->Left() = p;`

Actually, we don't need to use the Left( ) function in this case, due to the one-way assignment compatibility between derived and base types. For instance, we could write

`Bnode<int> *t, *p;`

`...`

`t->left = p; // Implicit derived to base typecast`

In the code that follows, we use the pointers left and right directly whenever we can, since the syntax is simpler. We use Left( ) and Right( ) only when necessary. All this typecasting is a bother, but we can't entirely eliminate it and still allow for the use of data-independent code.

The Bnode<TYPE> class has two constructors that are convenient for building trees. The constructors set the left and right pointers to null, unless you specify otherwise. In the following example, we'll build the tree shown in Figure 10.3 :

`Bnode<char> a('a'), b('b'), c('c');`

`Bnode<char> times('*', &a, &b);`

`Bnode<char> plus('+', &times, &c);`

You'll notice that Bnodeb and Bnode are declared to have all public members. With the linked list nodes described in Chapter 8, we used encapsulation to the fullest, being careful to make all members private when possible. For binary tree nodes, the logistics involved in using fully encapsulated objects get very tedious and messy. For instance, we'll write many standalone functions that operate on Bnodeb pointers--to provide maximum code sharing. While we could make all of these functions friends, and thus make the members of Bnodeb private, it's arguable that little is gained by doing this.

## TREE TRAVERSALS

A tree traversal is a method of visiting every node in the tree. By "visit," we mean that some type of operation is performed. For example, you may wish to print the contents of the nodes. Four basic traversals are commonly used:

Figure 10.4 shows the expression tree presented in Figure 10.3, and the results of traversing the tree using the four types of traversals (where the "visit" function prints node contents). In the context of expression trees, preorder traversal yields what is known as the polish notation for the expression, inorder traversal yields the infix notation (with parenthesis removed, though), and postorder traversal yields the reverse polish notation.

The traversals all have symmetrical cases. For instance, we've defined the traversals to go from left to right. They could alternatively be defined to go right to left, although that is less common in practice.

We'll now explore the first three traversal algorithms, since they can be coded similarly. Level-order traversal will be discussed later.

## Recursive Traversals

#### Figure 10.4 The four tree traversals.

`typedef void (*VisitFunc) (Bnodeb *n);`

`void PreOrder(Bnodeb *t, VisitFunc Visit)`

`{`

`if (t) {`

`Visit(t);`

`PreOrder(t->left, Visit);`

`PreOrder(t->right, Visit);`

`}`

`}`

`void InOrder(Bnodeb *t, VisitFunc Visit)`

`{`

`if (t) {`

`InOrder(t->left, Visit);`

`Visit(t);`

`InOrder(t->right, Visit);`

`}`

`}`

`void PostOrder(Bnodeb *t, VisitFunc Visit)`

`{`

`if (t) {`

`PostOrder(t->left, Visit);`

`PostOrder(t->right, Visit);`

`Visit(t);`

`}`

`}`

`void PrtChNode(Bnodeb *b)`

`{`

`cout << ((Bnode<char> *)b)->info << ' ';`

`}`

`Bnode<char> a('a'), b('b'), c('c');`

`Bnode<char> times('*', &a, &b);`

`Bnode<char> plus('+', &times, &c);`

`PreOrder(&plus, PrtChNode);  cout << '\n';`

`InOrder(&plus, PrtChNode);   cout << '\n';`

`PostOrder(&plus, PrtChNode); cout << '\n';`

## Eliminating Tail Recursion

`void PreOrder(Bnodeb *t, VisitFunc Visit)`

`// Tail recursion eliminated`

`{`

`while(t) {`

`Visit(t);`

`PreOrder(t->left, Visit);`

`t = t->right;`

`}`

`}`

Instead of a call to PreOrder(t->right), t is simply set to t->right and the while loop cycles until a null pointer is found. Note that the recursive call for t->left cannot be eliminated since this call is not tail recursive. Also, no call in PostOrder( ) is tail recursive, so we can't optimize that function any further.

## Flattening out Recursion

`void PreOrderFlat(Bnodeb *t, VisitFunc Visit)`

`{`

`ListStk<Bnodeb *> path;`

`while(1) {`

`if (t) {`

`Visit(t);`

`// Simulate call PreOrderFlat(t->left, Visit);`

`path.Push(t);`

`t = t->left;`

`}`

`else {`

`// Simulate function return`

`if (path.Pop(t) == 0) break;`

`// Optimize the tail recursion`

`t = t->right;`

`}`

`}`

`}`

Here, we've used the ListStk class from Chapter 9, holding pointers to Bnodeb nodes. This stack represents the current path of parents through the tree. Note how the call to PreOrder(t->left) is replaced by a Push( ), followed by the assignment t = t->left. In the original recursive function, the function returns when t is null. This is simulated by doing a Pop( ) from the stack. An InOrderFlat( ) function can be similarly defined. Simply move the Visit( ) call so that it immediately precedes the statement t = t->right.

Figure 10.5 shows the three different movements that take place during a post order traversal:

1. Down to the left

2. Up from the left and down to the right

3. Up from the right and then on up the tree.

`void PostOrderFlat(Bnodeb *t, VisitFunc Visit)`

`{`

`ListStk<Bnodeb *> path;`

`int state = 0;`

`while(1) {`

`if (state == 0) {`

`// Ready to go down to the left`

`if (t) {`

`path.Push(t);`

`t = t->left;`

`}`

`else state = 1;`

`}`

`else { // State 1: Ready to come up from either direction`

`Bnodeb *c = t;`

`if (path.IsEmpty ()) break;`

`t = *path.Top();`

`if (c == t->left && t->right) {`

`// Coming back up the tree from the left, and`

`// since there is a right child, go right.`

`// Note that t is still on top of stack.`

`t = t->right;`

`state = 0;`

`}`

`else {`

`// Coming back up the tree from the right,`

`// or there was no right child, so visit`

`// the node and continue on up. (State`

`// stays at 1.)`

`Visit(t);`

`path.Pop( );`

`}`

`}`

`}`

`}`

#### Figure 10.5 The three movements in a traversal.

This function has some further optimizations in it. Notice in state 1 that we inspect the top of the stack (which holds the parent), rather than popping it. With this approach, we don't have to turn around and push the parent back on the stack, in case we need to go down to the right.

The binary tree traversal functions can be found on disk in the file bnodeb.cpp.

## To Recurse or Not to Recurse

The flat traversal functions are useful when the node processing you need to do doesn't depend on where you are in the traversal. Printing the nodes in sequence is one example of this. However, for some tree operations, you must manage more variables than just the nodes themselves. The states of these variables must also be saved on some type of stack. For these operations, using an explicit stack can become more cumbersome than using straightforward recursion. For example, here is a postorder recursive function to find the height of a tree:

`int Height(Bnodeb *t)`

`{`

`int h = 0;`

`if (t != 0) {`

`int lh = Height(t- >left);`

`int rh = Height(t- >right);`

`h = ((lh > rh) ? lh : rh) + 1;`

`}`

`return h;`

`}`

The height of a tree is the maximum of the height of its two subtrees. In this case, it's easier to write the function recursively. The flat version is awkward and error prone.

## Level-Order Traversal

`void LvlOrder(Bnodeb *t, VisitFunc Visit)`

`}`

`ListQ<Bnodeb *> nodes_left;`

`nodes_left.Insert(t);`

`while(1) {`

`Bnodeb *p;`

`if (nodes_left.Extract(p) == 0) break;`

`Visit(p);`

`if (p->left != 0) nodes_left.Insert(p->left);`

`if (p->right != 0) nodes_left.Insert(p->right);`

`}`

`}`

## TREE WALKING ITERATORS

`enum WalkOrder { pre_order, post_order, in_order, lvl_order };`

`class TreeWalkerb { // Tree-walking base class`

`protected:`

`ListSQ<Bnodeb *> path;`

`Bnodeb *root, *curr;`

`WalkOrder worder;`

`int state;`

`Bnodeb *NextPreOrder();`

`Bnodeb *NextInOrder();`

`Bnodeb *NextPostOrder();`

`Bnodeb *NextLvlOrder();`

`Bnodeb *(TreeWalkerb::* NextFptr)();`

`public:`

`TreeWalkerb(Bnodeb *r, WalkOrder w);`

`virtual ~TreeWalkerb();`

`void Reset();`

`void Reset(Bnodeb *r, WalkOrder w);`

`Bnodeb *Next ();`

`};`

Complete code for the TreeWalkerb class is given on disk in the files twalk.h and twalk.cpp. In addition, a derived template class, TreeWalker, is provided to work with specific Bnode<TYPE> trees.

`TreeWalkerb: :TreeWalkerb(Bnodeb *r, WalkOrder w)`

`{`

`Reset( r, w);`

`}`

`void TreeWalkerb::Reset(Bnodeb *r, WalkOrder w)`

`{`

`root = r;`

`worder = w;`

`if (worder == pre_order) NextFptr = &TreeWalkerb: :NextPreOrder;`

`else if (worder == in_order) NextFptr = &TreeWalkerb: :NextInOrder;`

`else if (worder == post_order) NextFptr = &TreeWalkerb: :NextPostOrder;`

`else NextFptr = &TreeWalkerb: :NextLvlOrder;`

`state = 0; curr = root;`

`path.Clear();`

`if (root && worder == lvl_order) path.Insert(root);`

`}`

For an iterator object to work, it must keep the state of the iteration between calls to Next( ). Since our traversals our recursive, a stack is needed. For postorder traversals, we also need the binary state variable that indicates whether we're going up or down the tree. In addition, we need a queue for level-order traversals. Rather than use both stack and queue objects, we use a combination staque object instead, which we call path.

The Next( ) function is used to return a pointer to the next node in sequence. Next( ) makes a call to the traversal function pointed to by NextFptr. (Notice the pointer-to-member function call, using the ->* operator.)

`Bnodeb *TreeWalkerb: :Next()`

`{`

`return (this->*NextFptr)();`

`}`

`Bnodeb *TreeWalkerb::NextPreOrder()`

`{`

`while(1) {`

`if (curr) {`

`path.Push(curr);`

`Bnodeb *p = curr;`

`curr = curr->left;  // Set up for next visit`

`return p;           // So the node can be "visited"`

`}`

`else  {`

`if (path.Pop(curr) == 0) {`

`return curr = 0;  // Signal no more nodes`

`}`

`else curr = curr->right;`

`}`

`}`

`}`

`void PrintNode(Bnodeb *t)`

`// Assumes node is actually a Bnode<char> node`

`{`

`cout << ((Bnode<char> *)t)->info << '\n';`

`}`

`void PrintTreeOrderings(Bnode<char> *root)`

`{`

`Bnodeb *nxt;`

`if (root == 0) { cout << "Tree is empty\n"; return; }`

`cout << "PreOrder:    ";`

`TreeWalkerb tw(root, pre_order);`

`while((nxt = tw.Next()) != 0) PrintNode(nxt);`

`cout << '\n';`

`cout << "InOrder:     ";`

`tw.Reset(root, in_order);`

`while((nxt = tw.Next()) != 0) PrintNode(nxt);`

`cout << '\n';`

`cout << "PostOrder:   ";`

`tw.Reset(root, post_order);`

`while((nxt = tw.Next()) != 0) PrintNode(nxt);`

`cout << '\n';`

`cout << "Level Order:     ";`

`tw.Reset(root, lvl_order);`

`while((nxt = tw.Next()) != 0) PrintNode(nxt);`

`cout << '\n';`

`}`

`int NumNodes(Bnodeb *t)`

`// Determine number of nodes in the tree t`

`{`

`Bnodeb *nxt;`

`int cnt = 0;`

`TreeWalkerb tw(t, in_order); // Any ordering will do`

`while((nxt = tw.Next()) != 0) cnt++;`

`return cnt;`

`}`

## Printing Sideways

#### Figure 10.6 A tree printed sideways.

`typedef void (*PrtFunc)(Bnodeb *n);`

`void PrintSideWays(Bnodeb *t, PrtFunc PrtNode, int inc, int space)`

`{`

`while(t) {`

`PrintSideWays(t->right, PrtNode, inc, space+inc);`

`for(int i = 0; i<space; i++) cout << ' ';`

`PrtNode(t); cout << '\n';`

`t = t->left;`

`space += inc;`

`}`

`}`

`void PrtChNode(Bnodeb *t) { cout << ((Bnode<char> *)t)->info; }`

`Bnode<char> *t;`

`... // Set up t to be the tree in Figure 10.6`

`PrintSideWays(t, PrtChNode, 3, 0);`

## Printing Upright

The key to easy upright printing is to make two passes. The first pass determines the position of each node in the tree. Once the positions are determined, a second pass uses a level-order traversal of the tree to print the nodes line by line.

To keep things simple, we'll assume that each node in the tree is given its own column and that nodes at the same level (that is, having the same depth) are printed on the same line. Thus, the line coordinate (which we'll call y) is equal to the depth of the node. The column, or x coordinate, can be found by the following observation: If you examine the tree shown in Figure 10.7, you'll notice that the node having an x position of one will be the first node visited in an inorder traversal, the node having an x position of two will be the second node visited, and so on.

`typedef int (*NodeWidthFunc)(Bnodeb *n);`

`BnodeXY *GetNodePosn(Bnodeb *t, int &x, int y, NodeWidthFunc NodeWidth)`

`// Using an inorder traversal, figure out x and y of each node`

`// in the tree t, assuming that the top left corner of the`

`// box bounding t is at (x, y). Builds a parallel shadow tree`

`// of nodes having x and y in them.`

`{`

`if (t == 0) return 0;`

`BnodeXY *l = GetNodePosn(t->left, x, y+1, NodeWidth);`

`BnodeXY *xyt = new BnodeXY;`

`xyt->left = l;`

`xyt->x = x;`

`x += NodeWidth(t);`

`xyt->y = y;`

`y++;`

`xyt->right = GetNodePosn(t->right, x, y, NodeWidth);`

`return xyt;`

`}`

#### Figure 10.7. A row- and column-aligned tree.

`struct BnodeXY : public Bnodeb {`

`int x, y;`

`BnodeXY *Left( )  { return (BnodeXY *) left; }`

`BnodeXY *Right( ) { return (BnodeXY *) right; }`

`} ;`

`void PrintUp(Bnodeb *t, PrtFunc PrtNode, NodeWidthFunc NodeWidth)`

`{`

`int spacing, last_sp, x, oldy;`

`Bnodeb *nxt;`

`BnodeXY *xynxt;`

`if (t == 0) { cout << "Empty tree\n"; return; }`

`x = 1;`

`BnodeXY *xyt = GetNodePosn(t, x, 0, NodeWidth);`

`TreeWalkerb tw(t, lvl_order);`

`TreeWalker<BnodeXY> xytw(xyt, lvl_order);`

`oldy = 0;`

`last_sp = 0;`

`while(1) {`

`xynxt = xytw.Next();`

`if (xynxt == 0) break;`

`if (xynxt->y != oldy) {`

`oldy = xynxt->y;`

`cout << '\n';`

`last_sp = 0;`

`}`

`spacing = xynxt->x - last_sp - 1;`

`if (spacing < 0) spacing = 0;`

`while(spacing-) cout << ' ';`

`nxt = tw.Next();`

`PrtNode(nxt);`

`last_sp = xynxt->x + NodeWidth(nxt) - 1;`

`}`

`cout << "\n\n";`

`// Need to delete shadow tree. Easy to do with postorder traversal`

`xytw.Reset(xyt, post_order);`

`while(1) {`

`xynxt = xytw.Next();`

`if (xynxt == 0) break;`

`delete xynxt;`

`}`

`}`

Note that two level-order iterators are used in PrintUp( ) (the latter uses the TreeWalker template class) to traverse both the main tree and the shadow tree in parallel. You may wonder why we need to store the y position of each node. During a level-order traversal using an iterator, there's no way to tell when we've gone to a new level. Remember that using an iterator loses the shape information of the tree. (In fact, none of the iterative traversals, by themselves, provide enough shape information to reconstruct the tree. See [Lings 86] for more discussion of tree reconstruction.) We detect when a new level has been reached by comparing the y coordinate of the current node with that of the previous node.

Figure 10.8 shows the printed result for the tree shown in Figure 10.7. We used the following code to print the tree:

`void PrtChNode(Bnodeb *t)`

`{`

`cout << ((Bnode<char> *)t)->info;`

`}`

`int ChNodeWidth(Bnodeb *t)`

`{`

`return 1; // This is the printing width`

`}`

`Bnode<char> *t;`

`// Build tree in Figure 10.7`

`...`

`PrintUp(t, PrtChNode, ChNodeWidth);`

Complete code for printing a tree sideways and upright is given in the files treeprt.h, treeprt.mth, and treeprt.cpp.

## BINARY SEARCH TREES

Next, we look at a special class of trees known as binary search trees. A binary search tree has binary nodes and the following additional property: Given a node t, each node to the left is "smaller" than t, and each node to the right is "larger." This definition applies recursively down the left and right subtrees. Figure 10.9 shows a binary search tree where characters are stored in the nodes.

Figure 10.9 also shows what happens if you do an inorder traversal of a binary search tree: you'll get a list of the node contents in sorted order. In fact, that's probably how the name inorder originated.

You may wonder what we mean by "smaller" and "larger" in our definition. This depends on the type of data being stored at the nodes. If, for instance, the data is integers, then the meaning of smaller and larger is obvious. More typically, though, the node data is some kind of structure, with one of the fields (called the key) used to make comparisons. Next, we'll explain how to search a binary tree and how to set up the comparisons for any type of node.

## Binary Tree Searching

#### Figure 10.9 A binary search tree.

`template<class TYPE>`

`Bnode<TYPE> *Search(Bnode<TYPE> *t, const TYPE &x)`

`// Returns pointer to a node matching the data x if`

`// such a node is in the tree t; otherwise, a 0 is returned`

`{`

`while (t) {`

`if (x == t->info) break;`

`t = (x < t->info) ? t->Left() : t->Right();`

`}`

`return t;`

`}`

Figure 10.10 shows the path taken when searching for a node in a sample tree. The Search( ) function assumes that TYPE has the comparison operators '==' and '<' defined. In the case where TYPE is a built--in type--such as integer or character--this is easy because the operators are already defined. But suppose TYPE is a structure that holds a customer name and a unique integer id. We could use either field as the search key. Here's how you might define the customer structure using the name as the key:

`struct Customer {`

`char name[80];`

`int id;`

`Customer(char *n, int i = 0; // Notice default argument`

`};`

`Customer::Customer(char *n, int i)`

`{`

`strncpy(name, n, 80);`

`name[79] = 0; // Ensure null termination`

`Id = i;`

`}`

`int operator==(Customer &a, Customer &b)`

`{`

`return !strcmp(a.name, b.name);`

`}`

`int operator<(Customer &a, Customer &b)`

`{`

`return strcmp(a.name, b.name) < 0;`

`}`

`... // Other operators similarly defined`

`Bnode<Customer> *custdb;`

`...//  Build the tree, then search it`

`if (Search(custdb, "Ford Prefect")) cout << "Customer found\n";`

#### Figure 10.10 Searching a binary tree.

Note that the search string gets converted by the constructor into a Customer object (the Id is given a default value). Obviously, more direct and efficient arrangements are possible. Sometimes, only the key is stored in the nodes, along with a pointer to the actual data. That is also easy to set up. The details are highly application dependent.

## Why Use Binary Search Trees?

The inefficiency is due to the one-dimensionality of a linked list. We would like to have a way to jump into the middle of the list, in order to speed up the search process. In essence, that's what a binary search tree does. It adds a second dimension that allows us to manuever quickly through the nodes. In fact, the longest path we'll ever have to search is equal to the height of the tree. The efficiency of a binary search tree thus depends on the height of the tree. We would like the height to be as small as possible. For a tree holding n nodes, the smallest possible height is logn, so that's how many comparisons are needed on average to search the tree.

Unfortunately, trees can become so unbalanced that they're no better for searching than linked lists. Such trees are called degenerate trees. Figure 10.12 shows an example. For a degenerate tree, an average of n/2 comparisons are needed, with a worst case of n comparisons--the same cs for a linked list.

When nodes are being added and deleted in a binary search tree, it's difficult to maintain the balance of the tree. We'll investigate methods of balancing trees in the next chapter. In this chapter, we'll use simple tree-construction methods. These methods may result in very unbalanced trees, depending on what order we insert nodes in the tree.

## Inserting Nodes into a Binary Search Tree

`template<class TYPE>`

`Bnode<TYPE> *Add(Bnode<TYPE> *&root, const TYPE &x, int &exists)`

`// Returns matching node, new or old. May modify the root pointer.`

`{`

`Bnode<TYPE> *p;`

`int side;`

`// Look for a place to insert the node`

`Bnode<TYPE> *t = SearchP(root, x, p, side);`

`if (t == 0) { // No matching node found`

`t = new Bnode<TYPE>(x); // left and right should be set to 0`

`if (p) {`

`if (side) p->right = t; else p->left = t;`

`}`

`else root = t; // No parent, so we have a new root`

`existed = 0;`

`}`

`else existed = 1;`

`return t;`

`}`

`template<class TYPE>`

`Bnode<TYPE> *`

`SearchP(Bnode<TYPE> *t, const TYPE &x, Bnode<TYPE> *&p, int &side)`

`// Returns matching node in tree t, along with parent. If`

`// matching node is t itself, a 0 is passed back for the parent.`

`// Returns 0 if no match found.`

`{`

`p = 0;`

`while (t) {`

`if (x == t->info) break;`

`p = t;`

`if (x < t->info) {`

`side = 0;`

`t = t->Left();`

`}`

`else {`

`side = 1;`

`t = t->Right();`

`}`

`}`

`return t;`

`}`

The SearchP( ) function keeps track of which side of the parent the matching node occurs on--in the variable called side. This isn't strictly necessary because we could do an extra compare in Add( ) to determine the side. The following code fragment, taken from Add( ) and modified, shows how:

`if (p) {`

`// Used to be: if (side) p->right = t; else p->left = t;`

`if (p->info < x) p->right = t; else p->left = t;`

`}`

If the comparison is an expensive operation, it's best to use the side variable instead. Figure 10.13 shows what happens when we use the Add( ) function to add some nodes to a tree.

## Deleting Nodes from a Binary Search Tree

1. The node is a leaf.

2. The node has only one child.

3. The node has two children.

Case 1 is easy to handle because the node can simply be deleted and the corresponding child pointer of its parent set to null. Figure 10.14 shows an example. Case 2 is almost as easy to manage. Here, the single child can be promoted up the tree to take the place of the deleted node, as shown in Figure 10.15.

Case 3, where the node to be deleted has two children, is more difficult. We must find some node to take the place of the one deleted and still maintain the binary search tree property. There are two obvious candidates: the inorder predecessor and the successor of the node. We can detach one of these nodes from the tree and insert it where the deleted node used to be. A moment's reflection should convince you that using either of these nodes as the replacement will still maintain the binary search tree property.

#### Figure 10.15 Deleting a node that has one child.

The predecessor of a node can be found by going down to the left once, and then all the way to the right as far as possible. To find the successor, an opposite traversal is used: first to the right, and then down to the left as far as possible. Figure 10.16 shows the path taken to find both the predecessor and successor of a node in a sample tree.

Both the predecessor and successor nodes are guaranteed to have no more than one child, and they may have none. (Think about how the traversals take place: they always end when the corresponding left or right child is null.) Because of this, detaching the predecessor or successor reduces to either Case 1 or 2, both of which are easy to handle. In Figure 10.17, we delete a node from the tree and use its successor as the replacement.

`Bnodeb *DetachNode(Bnodeb *&root, Bnodeb *t, Bnodeb *p, int side)`

`// Detaches node t with parent p from the tree. Node t is`

`// the left child if side = 0; otherwise, it's the right child.`

`// If p is 0, then it is the root, and that is handled`

`// accordingly. Redundantly returns the pointer t. May`

`// have to update root pointer.`

`{`

`Bnodeb *psucc, *replacement;`

`if (t) {`

`if (t->left == 0 || t->right == 0) {`

`// At least one child is null, so use the other`

`// as the replacement. (It may be null, too.)`

`replacement = (t->left) ? t->left : t->right;`

`}`

`else { // Neither child is null`

`psucc = ParentOfSuccessor(t); // Guaranteed not null`

`if (psucc == t) { // Immediate successor`

`replacement = psucc->right;`

`}`

`else {`

`// Detach replacement from its current location`

`// and relocate it to where t used to be`

`replacement = psucc->left;`

`psucc->left =  psucc->left->right;`

`replacement->right = t->right;`

`}`

`// Finish relocating replacement to go where t used to be`

`replacement->left = t->left;`

`}`

`if (p) { // Fix parent of t to point to replacement`

`if (side)`

`p->right = replacement; else p->left = replacement;`

`}`

`else root = replacement; // No parent, so t was the root`

`}`

`return t;`

`}`

#### (b) After deleting 'e'

`Bnodeb *ParentOfSuccessor(Bnodeb *t)`

`// If no successor, a 0 is returned`

`{`

`Bnodeb *p = 0;`

`// Go right, then all the way left`

`Bnodeb *q = t->right;`

`if (q) {`

`p = t;`

`while(q->left) {`

`p = q;`

`q = q->left;`

`}`

`}`

`return p;`

`}`

Both DetachNode( ) and ParentOfSuccessor( ) can be written generically with Bnodeb pointers. We don't need to do any key comparisons since the structure of the tree lets us find the nodes we're looking for.

`template<class TYPE>`

`Bnode<TYPE> *Detach(Bnode<TYPE> *&root, const TYPE &x)`

`// Detaches node matching x from the tree.`

`// May have to update the root pointer.`

`{`

`int side;`

`Bnode<TYPE> *p, *t;`

`t = SearchP(root, x, p, side);`

`return (Bnode<TYPE> *)DetachNode((Bnodeb *)root, t, p, side);`

`}`

`template<class TYPE>`

`int Delete(Bnode<TYPE> *&root, const TYPE &x)`

`// Returns 1 if node actually found; otherwise, returns 0`

`// May have to update the root pointer`

`{`

`Bnode<TYPE> *n = Detach(root, x);`

`delete n;`

`return n != 0;`

`}`

## Clearing and Copying Binary Trees

`template<class TYPE>`

`void Clear(Bnode<TYPE> *&root)`

`{`

`DelTree(root);`

`root = 0;`

`}`

`template<class TYPE>`

`void DelTree(Bnode<TYPE> *t)`

`{`

`if (t == 0) return;`

`DelTree(t->Left());`

`DelTree(t->Right());`

`delete t;`

`}`

By using postorder traversal, the nodes are deleted from the bottom up. Note that it would be difficult to use any other type of traversal here because we would have to delete nodes before we're through with them. The DelTree( ) function can alternatively be written non-recursively, using a postorder iterator object:

`template<class TYPE>`

`void Bstree<TYPE>::DelTree(Bnode<TYPE> *t)`

`{`

`TreeWalker< Bnode<TYPE> > tw(t, postorder);`

`Bnode<TYPE> *nxt;`

`while((nxt =  tw.Next()) != 0) delete nxt;`

`}`

`template<class TYPE>`

`int Copy(Bnode<TYPE> *&root, const Bstree<TYPE> &t)`

`{`

`Clear(root);`

`Bnode<TYPE> *r = DupTree(t.root);`

`if (r) {`

`root = r;`

`return 1;   // Success`

`}`

`else return 0; // Failure`

`}`

`template<class TYPE>`

`Bnode<TYPE> *DupTree(Bnode<TYPE> *t)`

`{`

`if (t == 0) return 0;`

`Bnode<TYPE> *l = DupTree(t->Left());`

`Bnode<TYPE> *r = DupTree(t->Right());`

`return new Bnode<TYPE>(t->info, 1, r);`

`}`

`template<class TYPE>`

`int Copy(Bnode<TYPE> *&root, const Bstree<TYPE> &t, WalkOrder w)`

`{`

`Clear(root);`

`TreeWalker< Bnode<TYPE> > tw(t.root, w);`

`Bnode<TYPE> *nxt;`

`int dmy;`

`while((nxt = tw.Next()) != 0) Add(root, nxt->info, dmy);`

`return 1;`

`}`

Figure 10.18 shows the results when we use the four different types of traversals on the source tree during copying. Notice that both preorder and level-order traversals produce a destination tree that has the same structure as the source tree. A postorder traversal yields a different structure, but it's still a valid binary search tree.

As you can see, the order that you insert nodes into a binary search tree does make a difference. In fact, if you insert nodes into a tree in sorted order, as exemplified by the inorder traversal copying in Figure 10.18, the result is a degenerate tree. This fact suggests you should avoid inserting nodes into a binary search tree in sorted order. In the next chapter, we'll look at ways of circumventing this problem.

#### Figure 10.18 Copying via source tree traversals.

You'll notice that operations like Add( ) and Delete( ) are awkward due to the unidirectional capability of the links. We must write the operations in terms of finding the parent of a desired node, rather than find the node directly. This problem is similar to those we had with singly-linked lists in Chapter 8. The clumsiness can be alleviated by adding an extra pointer in each node, which points to the parent of the node. The result is known as a threaded binary tree. Figure 10.19 shows an example. A threaded tree is to a normal tree what a doubly-linked list is to a singly-linked list--it allows easy traversal in both directions. Note that, in a strict sense, a threaded binary tree isn't a true tree because it has cycles.

By using a threaded binary tree, it's possible to write the tree-traversal routines without recursion and without a stack. Since a stack isn't needed, tree operations can be faster, due to reduced overhead. However, before you try to reimplement all the algorithms you've seen in this chapter, keep in mind that the price you pay is space overhead in the tree nodes. (This is the same speed vs. space tradeoff you've seen repeatedly.) You pay for the overhead of parent pointers with every node, regardless of whether you need those pointers. When you use a stack, you pay for the overhead of the stack only when needed, and that overhead grows with the height of the tree, not with the number of nodes in the tree.

#### Figure 10.19 A threaded binary tree.

We're not recommending that you avoid using parent pointers, only that you be sure to consider the tradeoffs. Code without parent pointers is only slightly more complicated than code with them. After the operations are coded, the fact that they may be a little clumsly becomes a moot point. For maximum speed, however, you may want to use parent pointers.

## BINARY TREES AS CONCRETE DATA TYPES

`template<class TYPE>`

`class Bstree { // A binary tree of nodes holding TYPE data`

`protected:`

`Bnode<TYPE> *root;`

`Bnode<TYPE> *SearchP(const TYPE &x, Bnode<TYPE> *&p, int &side);`

`virtual Bnode<TYPE> *AllocNode`

`(const TYPE &x, Bnode<TYPE> *1=0, Bnode<TYPE> *r=0);`

`Bnode<TYPE> *DupTree(Bnode<TYPE> *t);`

`virtual void FreeNode(Bnode<TYPE> *n);`

`void DelTree(Bnode<TYPE> *t);`

`public:`

`Bstree();`

`Bstree(const Bstree<TYPE> &t);`

`virtual ~Bstree();`

`void operator=(const Bstree<TYPE> &t);`

`int Copy(const Bstree<TYPE> &t);`

`int Copy(const Bstree<TYPE> &t, WalkOrder w);`

`void Clear();`

`Bnode<TYPE> *Member(const TYPE &x);`

`const Bnode<TYPE> *Member(const TYPE &x) const;`

`Bnode<TYPE> *Add(const TYPE &x, int &existed);`

`Bnode<TYPE> *Add(const TYPE &x);`

`Bnode<TYPE> *Detach(const TYPE &x);`

`Bnode<TYPE> *DetachMin();`

`Bnode<TYPE> *DetachMax();`

`int Delete(const TYPE &x);`

`void DeleteMin();`

`void DeleteMax();`

`Bnode<TYPE> *Root();`

`Bnode<TYPE> *Min();`

`Bnode<TYPE> *Max();`

`int IsEmpty() const;`

`};`

Complete code for the Bstree class is given on disk in the files bstree.h, bstree.mth, and bstreeb.cpp. A test program is provided in tstbs.cpr. This program uses the tree-walking and tree printing functions given earlier in the chapter.

`template<class TYPE>`

`Bnode<TYPE> *Bstree<TYPE>::Add(const TYPE &x, int &exists)`

`{`

`Bnode<TYPE> *p;`

`int side;`

`Bnode<TYPE> *t = SearchP(x, p, side);`

`if (t == 0) {`

`t = AllocNode(x);`

`if (p) {`

`if (side) p->right = t; else p->left = t;`

`}`

`else root = t;`

`existed = 0;`

`}`

`else existed = 1;`

`return t;`

`}`

## SUMMARY

The Bstree class is set up like most the concrete data type classes in this book--with ample constructors, destructors, overloaded assignment operators, and so on. Whether or not you need all these functions is, of course, up to you. We've added some functions that retrieve and delete the minimum and maximum nodes from the tree. The minimum node can be found by going left as far as possible. The maximum node can be found by going right as far as possible.

Similar to the linked-list classes defined in Chapter 8, we've provided the virtual functions AllocNode( ) and FreeNode( ), which you can override to change how the tree nodes are allocated. For example, a binary tree can be stored contiguously using an array of nodes, as well as on the heap with non-contiguous nodes.

A few hints for building trees with arrays: You'll want to maintain a singly-linked free list of deleted tree nodes. This can be done by reusing the left (or, alternatively, right) child pointers to serve as the free list's next pointers. Also, nothing prevents you from dispensing with pointers altogether and instead using two parallel arrays of indexes that correspond to the left and right child links.

If you don't plan on overriding AllocNode( ) and FreeNode( ), you can declare them as static, along with the recursive functions DupTree( ) and DelTree( ). This will result in more efficient code, since in the recursive function calls, the implicit this pointer won't be passed.