In this chapter, we'll explore some of the techniques you can use to keep trees balanced. In particular, we'll look at *2-3-4 trees* and *red-black trees*.

Very unbalanced trees are referred to as *degenerate trees*, so called because their searching performance degenerates to that of linked lists. Figure 11.1 shows two examples of degenerate trees. The first tree was built by inserting the keys *a* through *i* in sorted order into a binary search tree. The second tree was built using the insertion sequence *a-i-b-h-c-g-d-f-e*. This pathological sequence is often used to test the balancing capability of a tree. Figure 11.2 shows what the tree in Figure 11.1(b) would look like if it were balanced.

At this point, we need to define two terms: *internal node* and *external node*. An internal node is a tree node having at least one key, and possibly some children. So far, all we've shown are internal nodes. It's sometimes convenient to have another type of node, called an external node, and pretend that all null child links point to such a node. An external node doesn't exist, but serves as a conceptual placeholder for nodes to be inserted.

In this chapter, we'll draw internal nodes using circles, with letters as labels. External nodes are denoted by drawing links leading nowhere, and perhaps labelled using numbers, or they may not be drawn at all. Figure 11.3 shows a sample tree illustrating both internal and external nodes.

Sometimes we'll draw only a portion of a tree, with the top node of this portion having an implied parent (or no parent at all if it's the root). Subtrees are indicated using the same notation as external nodes, by drawing links leading nowhere and having numbers for labels. In fact, you can consider an external node to be an empty subtree.

There are two basic ways used to keep trees balanced:

2. Allow nodes to have more than two children.

Certain types of tree-restructurings, known as *rotations*, can aid in balancing trees. Figure 11.4 illustrates the two types of *single rotations*.

The following sample code for single rotations uses generic *Bnodeb* pointers (as defined in Chapter 10), and assumes that all relevant pointers aren't null:

Bnodeb *RotateRight(Bnodeb *p, Bnodeb *t)

// Rotate right around p, assuming t is p's left child

// Returns t, which will be the new child of p's parent

{

p->left = t->right;

t->right = p;

return t;

}

Bnodeb *RotateLeft(Bnodeb *p, Bnodeb *t)

// Symmetrical case

{

p->right = t->left;

t->left = p;

return t;

}

Another type of rotation is known as a *double rotation*. Figure 11.5 shows the two symmetrical cases. Here's the sample code:

Bnodeb *RotateLeftRight(Bnodeb *g, Bnodeb *p, Bnodeb *t)

// Rotate first around p, then around g. Assumes p is left

// child of g, and t is the right child of p. Returns new

// child of g's parent.

{

p->right = t->left;

t->left = p;

g->left = t->right;

t->right = g;

return t;

}

Bnodeb *RotateRightLeft(Bnodeb *g, Bnodeb *p, Bnodeb *t)

// Symmetrical case

{

p->left = t->right;

t->right = p;

g->right = t->left;

t->left = g;

return t;

}

Rotations don't guarantee a balanced tree, however. For example, Figure 11.6 shows a right rotation that makes a tree more unbalanced. The trick lies in determining when to do rotations and what kind of rotations to do. The red-black trees discussed later have certain rules used to determine when and how to rotate.

We can also help keep trees balanced by allowing nodes to have more than two children. Instead of using binary trees, we can have multi-way trees. A multiway tree node with *n* children is known as an *n*-node. An *n*-node has *n-1* keys that are used for comparisons. For example, Figure 11.7 shows a 4-node with three keys. Nodes with keys less than *a* are inserted down subtree 1. Nodes greater than *a* but less than *b* are inserted down subtree 2. Nodes greater than *b* but less than *c* are inserted down subtree 3. And finally, nodes with keys greater than *c* are inserted down subtree 4.

One special type of multi-way tree is known as a *B-tree*, which we'll study in depth in Chapter 14. A B-tree of order *m* has nodes that can have up to *m* children. In this chapter, we'll look at B-trees of order four. Such trees are often called *2-3-4 trees*. A 2-3-4 tree has three types of nodes: 2-nodes (that is, binary nodes), 3-nodes, and 4-nodes. Figure 11.8 shows a sample 2-3-4 tree.

Insertion into a 2-3-4 tree works as follows: First, search the tree looking for the node to insert the new key into. If the candidate is a 2-node, then insert the key, making the node a 3-node. If the candidate is a 3-node, then insert the key, making the node a 4-node. If the candidate is a 4-node, it is full and must be split. The node is split into two 2-nodes, with the left key going to the left 2-node, the middle key moving up to the parent node, and the right key going to the right 2-node. The key to be inserted is then added to the appropriate 2-node.

Figure 11.9 shows how to split a 4-node, where there is no parent. The middle key is pushed up to a new root. If there is a parent, the parent itself may have to be split. The splitting process then propagates up the tree. Thus, insertion consists of a top-down searching pass, followed by a bottom-up splitting pass. Figure 11.10 shows an example of building a 2-3-4 tree using the same pathological sequence *a-i-b-h-c-g-d-f-e* that we've used before.

One consequence of 4-node splitting is that the tree grows upward, rather than downward as in a binary tree. As a result, the external nodes of a 2-3-4 tree are always on the same level, and the tree is quite balanced. (These same properties hold in general for all B-trees.)

A *red-black tree* is a binary representation of a 2-3-4 tree. (Red-black trees can be used to represent a wide variety of trees. See [GuibSedg 78].) In a red-black tree, the links are given the colors red and black. These colors are used in mapping the 2-node, 3-node, and 4-nodes of a 2-3-4 tree into binary nodes. Figure 11.11 shows the mappings. In this figure and others to follow, red links are shown with thick lines, whereas black links are shown with lines of normal thickness. A node having a black parent link is referred to as a *black node*. A node with a red parent link is a *red node*. In the diagrams, we chose to color the links using thin/thick lines (rather than actually coloring the nodes).

The red-black mappings work as follows: a 2-node is represented using a single binary node with two black children. A 3-node is represented as a pair of binary nodes--one red, one black. Hence there is a red link between the nodes. Note that 3-nodes can be represented with two different orientations. The orientation used depends on the balancing required, as you'll see later. A 4-node is represented as a set of three binary nodes, with the parent node black and the child nodes red. Hence, two red links are used.

Three basic properties emerge from the way the red-black mappings are defined:

Figure 11.12 shows the red-black tree equivalent of the tree given in Figure 11.10. Note that the red-black tree is structurally the same as the binary tree given earlier in Figure 11.2. The only difference is the coloring of the links. The link colors serve as guideposts in determining what kind of rotations to do when balancing the tree. Note that the external nodes all have a black-depth of 2. This corresponds to the height of the equivalent 2-3-4 tree.

The tree shown in Figure 11.12 is fairly balanced, but isn't as balanced as its 2-3-4 tree counterpart, and is twice as high. Why the difference? The 3-nodes and 4-nodes of the 2-3-4 tree must be simulated using multiple binary nodes, each set consuming two levels of the tree. However, it can be shown that no path in a red-black tree is longer than twice as long as the shortest path in the tree. Thus, although the red-black trees aren't perfectly balanced, their worst case isn't all that bad.

Figure 11.13 illustrates an illegal red-black tree, with two consecutive red links in a row (*d*-*h *and *h-i*). Also, the external nodes have different black-depths as indicated in the figure. The resulting tree is still a valid binary search tree, but because it doesn't have the strict red-black tree properties, updates to the tree (assuming they could even be done) would tend to produce an unbalanced tree.

We'll represent red-black tree nodes with the following *RBnodeb* and *RBnode* classes:

#define RED 0

#define BLACK 1

struct RBnodeb : Bnodeb {

char color;

RBnodeb *&Left( );

RBnodeb *&Right( );

};

template<class TYPE>

struct RBnode : Rbnodeb {

TYPE info;

RBnode( );

RBnode(const TYPE &x, char c, RBnode<TYPE> *1=0, RBnode<TYPE> *r=0);

RBnode<TYPE> *&Left( );

RBnode<TYPE> *&Right( );

};

As with all designs, it's worthwhile to strive for data independence whenever possible. That's the purpose of the *RBnodeb* class. This class is derived from the *Bnodeb* class of Chapter 10, and adds a *color *field, but no other data. The color field represents the color of the link coming into the node--0 for red, 1 for black. We've represented color as a character, which represents additional overhead in every red-black node. Actually, only one bit is required. In your own classes, you may be able to arrange this bit to be stored in some unused location in the node's data field. An alternative representation would have colors associated with the child links, rather than the nodes themselves. However, this requires two bits per node, rather than one.

Since *RBnodeb* is derived from *Bnodeb*, it can be used in many generic binary tree functions--such as the tree-traversal functions or the tree-walking iterators. The *RBnode* class is a template derived from *RBnodeb* that adds data to the node. Unfortunately, an *RBnode* object cannot be substituted for a *Bnode* object (a binary node with data), since the *info* field of these two types of nodes is at a different offset in the node structure. As usual, convenience functions *Left( )* and *Right( )* are supplied to help alleviate type casting problems. The details are given on disk.

The *RBtree* class implements red-black trees as concrete data types, with the appropriate constructors, destructors, assignment functions, and other supporting functions. This class is very similar in design to the *Bstree* class developed in Chapter 10. We'll look only at the main operations of searching, inserting, and deleting. The remaining details are given on disk.

template<class TYPE>

class RBtree {

protected:

RBnode<TYPE> *root;

// Other protected functions--such as AllocNode( ), DupTree( ),

// FreeNode( ), and DelTree( )...

public:

RBnode<TYPE> *Member(const TYPE &x);

RBnode<TYPE> *Add(const TYPE &x, int &existed);

RBnode<TYPE> *Detach(const TYPE &x);

int Delete(const TYPE &x);

// Constructor functions, plus other functions for copying,

// clearing, and deleting trees, and functions for handling

// min/max elements, and so forth...

};

Complete code for red-black trees is given on disk in the files *RBnode.h, RBtree.h,* *RBtreeb.cpp,* and *RBtree.mth*. A test program is given in the files *tstrb.cpp* and *tsttree.cpp*. Many of the files from Chapter 10 are used as support files.

A red-black tree can be searched in the same way as an ordinary binary search tree, as the following *Member( )* function shows. Note that the colors of the nodes are not used in the search. The fact that the tree is balanced in no way impinges on the searching code (other than to make the searches faster due to shorter trees, of course).

template<class TYPE>

RBnode<TYPE> *RBtree<TYPE>::Member(const TYPE &x)

// Returns pointer to node containing data x if

// such a node is in the tree t; otherwise, a 0 is returned

// ASSUMES TYPE has '<' and '==' defined.

{

RBnode<TYPE> *t = root;

while (t) {

if (x == t->info) break;

t = (x < t->info) ? t->Left( ) : t->Right( );

}

return t;

}

We described an insertion process earlier for 2-3-4 trees that involved two passes: a top-down pass to determine where the new key should go, and a bottom-up pass to handle any full 4-nodes that need to be split. It's possible to do the insertion in a single top-down pass, and that's the method we'll use for red-black trees.

Single-pass top-down insertions can be done by ensuring that the new key won't be added to a full leaf node. We can do this by splitting any 4-node we see on the path to the leaf into two 2-nodes. Splitting a 4-node, as represented with red-black nodes, is particularly easy. As Figure 11.14 illustrates, the process involves three color flips, and that's all. Note how the two child links are turned to black, but the parent link is turned to red. This is done to preserve the black-depths of the external nodes 1, 2, 3, and 4 (which may be subtrees). In all the red-black tree transformations to follow, we always ensure that the relative black-depths of the external nodes or subtrees are preserved. That way, we can guarantee that the new tree will still be a valid red-black tree. A side effect is that the tree will be kept as balanced as possible.

Since the top node of the split 4-node becomes red during a split, a problem occurs if its parent is also red, because we then end up with two reds in a row. This must be fixed by using appropriate rotations. We'll look at all the possible cases now.

Figure 11.15 shows the two orientations of a 4-node connected to a 2-node. In both orientations, we end up with a 3-node having two 2-nodes as children when the 4-node is split. The results are still legal red-black trees, so nothing else has to be done.

Figure 11.16 illustrates the three cases that occur when a 4-node is connected to a 3-node. Here, the 4-node can be the left child, middle child, or right child of the 3-node. Recall that, in a red-black representation, a 3-node has two orientations. Thus, there are six cases to consider in the red-black tree representations. In Figure 11.17, we show three of these cases. Figures 11.17(a) and 11.17(b) show the two cases that arise when the 4-node is the right child of the 3-node, with the two possible 3-node orientations. Not shown are the two symmetrical cases that occur when the 4-node is the left child. Figure 11.17(c) shows one of the cases that occurs when the 4-node is the middle child. Again, there is a symmetrical case not shown.

In Figure 11.17(a), splitting the 4-node leads to a valid red-black tree, so nothing more needs to be done other than the color flips. However, in Figures 11.17(b) and 11.17(c), splitting the 4-node leads to trees having two reds in a row. Thus, after the color flips, rotations must be performed to restore the redblack property. For the case in Figure 11.17(b), a single rotation is needed. For the case in Figure 11.17(c), a double rotation is required. Note that the colors "follow" the rotations. In all three figures, you should verify for yourself that the black-depths of the external nodes are preserved.

The insertion code is given in the routines *Add( )* and *InsBalance( )*. The *Add( )* function walks down the tree looking for a place to insert the new key, splitting any 4-nodes along the way. After a split, there may be two red nodes in a row. If so, a call is made to *InsBalance( )*, which performs the proper rotations. *Add( )* keeps track of the current node *t*, its parent *p*, grandparent *g*, and great-grandparent *gg*, all of which are needed to do the rotations. To make the call to *InsBalance( )* efficient, the pointers are packaged up into the structure *PtrPak, *and a single parameter is then passed for these pointers. A pointer to the root node is also passed, since the root may change due to a rotation.

struct PtrPack {

RBnodeb *t, *p, *g, *gg, *s, *rs, *ls, *m, *pm;

} ;

template<class TYPE>

RBnode<TYPE> *RBtree<TYPE>::Add(const TYPE &x, int &exists)

// Add the key x to the tree if it doesn't already exist.

// Returns a pointer to the node containing the key, and

// sets the exists flag accordingly.

{

PtrPack pp;

int side;

exists = 1;

pp.t = root; pp.p = 0; pp.g = 0; pp.gg = 0;

while(pp.t && x ! = ((RBnode<TYPE> *)(pp.t))->info) {

// If on a 4-node, we must split it into two 2-nodes

if ((pp.t->left && pp.t->Left( )->color == RED) &&

(pp.t->right && pp.t->Right( )->color == RED)) {

pp.t->color = RED;

pp.t->Left( )->color = BLACK;

pp.t->Right( )->color = BLACK;

// Check for two reds in a row, and adjust if so

if (pp.p && pp.p->color == RED)

root = (RBnode<TYPE> *)InsBalance(root, pp);

// else Figure 11.15 or Figure 11.17(a)

}

pp.gg = pp.g; pp.g = pp.p; pp.p = pp.t;

if (x < ((RBnode<TYPE> *)(pp.t))->info) {

pp.t = pp.t->Left( ); side = 0;

}

else {

pp.t = pp.t->Right( ); side = 1;

}

}

// Reached the bottom, with no matching node

if (pp.t == 0) {

exists = 0;

pp.t = AllocNode(x, RED);

if (pp.t == 0) return 0; // Couldn't add

if (pp.p) {

if (side) pp.p->right = pp.t; else pp.p->left = pp.t;

}

else root = (RBnode<TYPE> *)(pp.t);

// Check for two reds in a row, and adjust if so

if (pp.p && pp.p->color == RED)

root = (RBnode<TYPE> *)InsBalance(root, pp);

}

root->color = BLACK; // Root always made black

return (RBnode<TYPE> *)(pp.t);

}

// Define some macros for convenience

#define T (pp.t)

#define P (pp.p)

#define G (pp.g)

#define GG (pp.gg)

#define S (pp.s)

#define RS (pp.rs)

#define LS (pp.ls)

#define M (pp.m)

#define PM (pp.pm)

RBnodeb *InsBalance(RBnodeb *root, PtrPack &pp)

// Balance adjusting for top-down insertions. Eliminates

// both p and t from being red by doing rotations and

// color changes. g, p, t are assumed not null coming in.

// gg may be null. At the end of this routine, only t

// and p will be valid with respect to each other. g and gg will

// not reflect the proper ordering.

RBnodeb *cofgg; // New child of great-grandparent

int side;

side = GG && GG->right == G;

if (G->left == P) {

if (P->right == T) { // Do double rotate. (Figure 11.17(c))

G->left = T->right;

T->right = G;

P->right = T->left;

T->left = P;

P = GG;

cofgg = T;

}

else { // Do single rotate right. (Figure 11.17(b) reversed)

G->left = P->right;

P->right = G;

cofgg = P;

}

}

else { // G->right == p

if (P->left == T) { // Do double rotate. (Figure 11.17(c) reversed)

G->right = T->left;

T->left = G;

P->left = T->right;

T->right = P;

P = GG:

cofgg = T;

}

else { // Do single rotate left. (Figure 11.17(b))

G->right = P->left;

P->left = G:

cofgg = P;

}

}

cofgg->color = BLACK;

G->color = RED;

if (GG) {

if (side) GG->right = cofgg; else GG->left = cofgg;

}

else root = cofgg;

return root;

}

Figure 11.18 shows an example of building a red-black tree using the pathological insertion sequence *a-i-b-h-c-g-d-f-e*. The resulting tree is equivalent to the 2-3-4 tree in Figure 11.10, except that the 4-node root has been split into two 2-nodes. Note that the tree is quite balanced.

Recall from Chapter 10 how deletions work in a regular binary search tree: The basic idea is to swap the key of the node to be deleted with the key of the node's successor. (We used pointer surgery rather than actually swapping keys. Also, the predecessor could be used instead of the successor.) Then, the successor is deleted, which reduces to the case of removing a node with at least one null child. As we showed in Chapter 10, this case is easy to handle.

1. Merging two 2-nodes into a single 4-node.

2. Flipping the red-black orientations of 3-nodes.

Figure 11.19 shows examples of the first and third cases. The second case is not shown since it doesn't structurally change the equivalent 2-3-4 tree; it just changes the orientation of the red-black interpretation. Please note that these cases have other orientations not shown in the figure. It's interesting that the act of merging 2-nodes into 4-nodes is exactly the opposite of what we do during insertions, where 4-nodes are split in two. Thus, deletions are the opposite of insertions, with the added step of pushing an extra red down the tree.

Figures 11.20, 11.21, and 11.22 show the orientations and transformations for the three balancing operations, in terms of red-black trees. In all three figures, the node labelled *t* is the current node being traversed, *p* is its parent,* g* is its grandparent, *s* is its sibling, and *ls* and *rs* are the left and right children of *s*. Vertical lines shown for grandparents mean that the parent node can either be on the left or right. Dashed lines indicate links that can be either red or black. You should verify for yourself that the transformations shown preserve the relative black-depths of all external nodes.

The algorithm for deleting a node from a red-black tree is split into four procedures: *Delete( )*, *Detach( )*, *DelBalance( )*, and *DoReplacement( )*. The *Delete( )* function (not given) calls *Detach( )* to detach the node from the tree, and then frees the node. *Detach( )* works by walking the tree from the top down, searching for the node to delete, recording its location, and then continuing to walk the tree pretending that the node wasn't found. By doing this, when a null node is reached, the current parent will be the successor node. All the while, rotations and color flips are done by calling *DelBalance( )*. At the end, the *DoReplacement( )* function is called to "swap" the key of the node to delete with the key of the successor. (Actually, the replacement is done by pointer surgery, rather than actually swapping the keys.)

template<class TYPE>

RBnode<TYPE> *RBtree<TYPE>::Detach(const TYPE &x)

// Detaches (but does not free) the node having key x from the tree

{

struct PtrPack pp;

if (root == 0) return 0;

pp.t = root; pp.p = 0; pp.g = 0; pp.m = 0; pp.pm = 0;

while (pp.t) {

// Go down the tree one level, searching for node to delete

if ( ((RBnode<TYPE> *)(pp.t))->info == x ) {

pp.m = pp.t; // Record matching node

pp.pm = pp.p; // And its parent

}

// Update ancestor pointers

pp.g = pp.p; pp.p = pp.t;

if (x < ((RBnode<TYPE> *)(pp.t))->info) {

pp.t = pp.p->Left( ); pp.s = pp.p->Right( );

}

else {

pp.t = pp.p->Right( ); pp.s = pp.p->Left( );

}

if (pp.s) {

pp.ls = pp.s->Left( ); pp.rs = pp.s->Right( );

}

else {

pp.ls = 0; pp.rs = 0;

}

root = (RBnode<TYPE> *)DelBalance(root, pp);

} // end of while loop

root = (RBnode<TYPE> *)DoReplacement(root, pp);

if (root) root->color = BLACK;

return (RBnode<TYPE> *)pp.m; // Node to delete

}

RBnodeb *DelBalance(RBnodeb *root, PtrPack &pp)

// Balancing code during top-down deletion

{

if ((T == 0 | | T->color == BLACK) && S && S->color == RED) {

// Case 2: t black, s red. t might be null,

// but s and p must not be. Flip the orientation

// of the 3-node that s is a part of.

// Fix g child to be s. Also s may become p of m

if (G) {

if (G->left == P) G->left = S; else G->right - S;

}

else root = S;

if (P == M) PM = S;

// Finish rotating

if (P->left == T) { // Figure 11.21(a)

// RotateLeftt(p);

P->right = LS;

S->left = P;

G = S;

S = LS;

}

else { // Figure 11.21(b)

// RotateRight(p);

P->left = RS;

S->right = P;

G = S;

S = RS;

}

// Fix children of s

if (S) { LS = S->Left( ); RS = S->Right( ); }

// Fixup colors

P->color = RED; G->color = BLACK;

}

if (T && T->color == BLACK &&

(T->left == 0 || T->Left( )->color == BLACK) &&

(T->right == 0 || T->Right( )->color == BLACK)) {

if (S && S->color == BLACK &&

(S->left == 0 || S->Left( )->color == BLACK) &&

(S->right == 0 || S->Right( )->color == BLACK)) {

// Case 1: t and s are black 2-nodes. Merge them

// into a 4-node. (Figure 11.20.)

P->color = BLACK; T->color = RED; S->color = RED;

}

else if (LS && LS->color == RED) {

// Case 3: t is a black 2-node, s is 3- or 4-node,

// LS is red (RS might be). Rotate 3- or 4-node to

// the side that t is on.

if (P- >left == T) { // Figure 11.22(b)

// Fix colors

LS->color = P->color; P->color = BLACK; T->color = RED;

// Fix g child to be LS. Also LS may become p of m.

if (G) {

if (G->left == P) G->left = LS; else G->right = LS;

}

else root = LS;

if (P == M) PM = LS;

// Finish: DoubleRotateLeft(s, p)

P->right = LS->left;

LS->left = P;

S->left = LS->right;

LS->right = S;

G = LS;

// We won't fix s and children since they get re-assigned

// at the top of next loop

}

else { // Figure 11.22(b)

// Fix colors

S->color = P->color; LS->color = BLACK;

P->color = BLACK; T->color = RED;

// Fix g child to be s. Also s may become p of m.

if (G) {

if (G->left == P) G->left = S; else G->right = S;

}

else root = S;

if (P == M) PM = S;

// Finish: RotateRight(p);

P->left = RS;

S->right = P;

G = S;

// We won't fix s and children since they get re-assigned

// at the top of next loop

}

}

else if (RS && RS->color == RED) {

// Case 3: t is a 2-node, s is a 3-node, LS black, RS red

// Rotate 3-node to the side of t

if (P->left == T) { // Figure 11.22(b)

// Fix colors

RS->color = BLACK; S->color = P->color;

P->color = BLACK; T->color = RED;

// Fix g child to be s. Also, s may become p of m.

if (G) {

if (G->left == P) G->left = S; else G->right = S;

}

else root = S;

if (P == M) PM = S;

// Finish: RotateLeft(p);

P->right = LS;

S->left = P;

G = S;

// We won't fix s and children, since they get reassigned

// at the top of next loop

}

else { // Figure 11.22(d)

// Fix colors

RS->color = P->color; P->color = BLACK; T->color = RED;

// Fix g child to become RS. Also, RS may become p of m.

if (G) {

if (G->left == P) G->left = RS; else G->right = RS;

}

else root = RS;

if (P == M) PM = RS;

// Finish: DoubleRotateRight(s, p);

P->left = RS->right;

RS->right = P;

S->right = RS->left;

RS->left = S;

G = RS;

// We won't fix s and children, since they get reassigned

// at the top of next loop

}

}

}

return root;

}

RBnodeb *DoReplacement(RBnodeb *root, PtrPack &pp)

{

// At this point, m is the node to delete, pm is its parent,

// p is the replacement node, g is its parent. If m has no

// successor, p will = m, and replacement is the non-null

// child of m.

if (M) { // Matching node was found

if (P == M || M->left == O || M->right == O) {

// No successor, and/or at least one child null

// Get non-null child, if any; else p will be null

P = (M->left) ? M->Left( ) : M->Right( );

}

else { // m has a successor to use as a replacement

if (G != M) {

// Successor isn't immediate child of m, so detach

// from where it is, attach to where m is

G->left = P->right;

P->right = M->right;

}

// Finish attaching where m is

P->left = M->left;

}

// p should have same color as m since it's going where m was

if (P) P->color = M->color;

}

// Fix pm child link to be p

if (M) {

if (PM) {

if (PM->left == M) PM->left = P; else PM->right = P;

}

else root = P; // New root, possibly null

}

return root;

}

Figure 11.23 shows the tree given in Figure 11.18 as it is destroyed by removing nodes in the sequence *e-f-d-g-c-h-b-i-a*.

The insertion and deletion routines given in this chapter are complicated by the need to test for boundary conditions, such as non-existent parents, grandparents, children, or siblings. You'll notice explicit tests for null pointers in the code to handle these conditions. The routines can be simplified somewhat by creating a special sentinel node to be used for all otherwise non-existent nodes. This node can be colored black, and all null child links can be modified to point to this sentinel. For more details on using artificial nodes like this, see [GuibSedg 78].

If you need to ensure that a binary search tree never degenerates to worst-case scenarios, then a 2-3-4 tree, as implemented using a binary red-black tree, is the data structure of choice. Recall that 2-3-4 trees are B-trees of order 4. It's possible to generalize the technique behind red-black trees to work for B-trees of any order, by having links with more than two color choices. However, it's not really beneficial to do this; it's better instead to implement B-trees directly with multi-way nodes. This is due to the I/O paging characteristics of most disk devices. For example, there will be some page size that corresponds to a B-tree node of a certain order that will be optimal for any given disk. We'll explore B- trees on disk in Chapter 14. Just remember that red-black trees are efficient implementations of B-trees of order 4 that happen to perform well in memory.