# CHAPTER 14: B-TREES

In this, the final chapter of the book, we'll discuss B-trees, which are balanced multi-way trees. B-trees have become commonplace in database applications because they offer fast disk-based searching. B-trees were invented by Bayer and McCreight (see [BayeMc 72]) and are a natural evolution of earlier database designs. The basic premise behind these designs is that, to access a disk efficiently, it's best to load chunks of data, called pages, rather than small individual items.

## PROPERTIES OF B-TREES

B-trees are classified by their order, which refers to the maximum number of branches in a node. For example, in a B-tree of order 4, each node can have up to 4 branches. Corresponding to these branches are 3 keys that help determine which branch to take during a se arch. In general, a B-tree of order m has nodes with up to m branches and m-1 keys. Note that the order specifies the maximum number of branches. A node may have fewer branches than the maximum. Figure 14.1 shows a B-tree of order 4, with all nodes full. You've seen B-trees of order 4 before. They are the 2-3-4 trees we introduced in Chapter 11. (We've drawn them differently in this chapter, however.)

Except for the root node, all nodes must have at least (m-1)/2 keys and (m-1)/2 + 1 branches. This means all nodes except the root are roughly half full. For example, in an order 5 B-tree, all non-root nodes must have at least two keys. For an order 4 B-tree, the nodes must have at least one key.

The leaves of the tree are always on the same level.

Note Some texts define the order of a B-tree differently than we have here. In the alternative definition, a B-tree of order d has nodes with a maximum of 2d keys and 2d+1 branches. Implied in this definition is that the maximum number of keys is always even, and the minimum number of keys in a non-root node is d. In our definition, the maximum number of keys can be odd.

As was the case with 2-3-4 trees, the key to efficient B-trees lies in maintaining the two properties just mentioned. Doing so will yield balanced trees that are reasonably compact.

## MULTI-WAY NODES

At the heart of B-trees are the multi-way nodes that make up the trees. Multi-way nodes are essentially generalizations of binary nodes. Rather than show generic template-based nodes in this chapter, we'll show a direct design that's typical of what might be used in practice. (You're probably tired of seeing templates, anyway.)

For keys, we'll use a character string of some maximum size. Stored with each key is a long integer that represents the associated data. While in some applications you might be able to fit all of your data in the long integer, the intent is for this variable to be the file address of the data--as stored elsewhere. For example, you might store the B-tree in one file that serves as an index to another file holding the actual data.

## THE B-TREE CLASS

// Function return status codes

const int NODE_OVERFLOW =  2;

const int SUCCESS       =  1;

const int FAIL          =  0;

const int DUPLKEY       = -1;

const int ALLOCERR      = -2;

struct BtreeHeader  {  // Stored with every tree

unsigned order;

unsigned num_entries;

unsigned num_nodes;

int height;

};

class Btree  {         // Btree file class

protected:

FmgrPtr f;          // File the Btree is connected to

Cache<Mnode> cache; // Node cache

Coptr<Mnode> root;  // Pointer to the root node

void WriteHdr( );

int Insert(Entry &e, Coptr<Mnode> t);

int Delete(Entry &e, Coptr<Mnode> t);

void RestoreBalance(Coptr<Mnode> p, int posn);

void RotateRight(Coptr<Mnode> p, int posn);

void RotateLeft(Coptr<Mnode> p, int posn);

void Merge(Coptr<Mnode> p, int posn);

static void PrintNode(Coptr<Mnode> &n);

static void PrintTree(Coptr<Mnode> t, int sp);

public:

Btree(int cs = 8);

~Btree( );

// File-management routines

int Connect(FmgrPtr &fptr, int create,

void Disconnect( );

int Open(char *fname, Fmgr::AccessMode mode,

void Flush(int clear=0);

void Close( );

int Search(Entry &e);

int FullSearch(const Entry &e):

int Remove(char *k, long d = 0);

int IsEmpty( ) const;

int IsOpen( ) const;

int OK( ) const;

int operator!( ) const;

operator int( ) const;

void ClearErr( );

void Statistics(int full);

void PrintTree( );

};

Complete code for the B-tree class is given on disk in the files btree.h and btree.cpp test program is provided in the file btreet1.cpp.

Each B-tree has its own cache of buckets that store Mnodes. Also, a Coptr to the root of the tree is kept for easy access. The constructor is used to initialize these members, as well as to set the file manager pointer to null at the outset:

Btree::Btree(int cs)

// Create a Btree with a null file manager, and with a

// cache size of cs

: f(0), cache(cs), root(cache)

{

}

## SEARCHING A B-TREE

int Btree::Search(Entry &e)

// Search the tree for the first node having a matching

// entry (i.e., keys must match). If found, the data

// field of e is filled in. Returns SUCCESS or FAIL.

{

Coptr<Mnode> t(root);

int rv, posn;

while(t)  {

rv = t->Search(e, posn);  // Search node t, get back posn

if (rv == 0) {            // Found a match

e = t - >entry[posn]; // Causes the data to be filled in

return SUCCESS;

}

t = t -> Branch(posn);    // No match, keep walking

}

return FAIL;

}

Entry e("look for me");

if (mytree.Search(e) == SUCCESS)  {

cout << "Entry found, with data: " << e.data << '\n';

}

## INSERTION INTO A B-TREE

In Chapter 11, we described how to insert nodes into a 2-3-4 tree. The method involves inserting a new entry into a leaf and, if the appropriate leaf is full, it is split in two, with node splits propagating up the tree as necessary. Not surprisingly, a similar method is used for general B-trees. The only drawback is that it isn't possible to define an efficient, single-pass, top-down algorithm for general B-trees. (At least, nobody has come up with such a method.) Instead, we'll use a two-pass recursive algorithm.

First, we walk down the tree until we either find a matching key and data pair (which means we have a duplicate entry--not allowed) or we hit a leaf node. At this point, we attempt to insert the new entry into the leaf. If the leaf isn't already full, we can simply do the insertion. Otherwise, the node is split in half. For this technique to work, we must send one of the entries up the tree to be inserted into the parent. We always use the median entry (as defined in sorted order) for this purpose. Note that the median may actually be the entry we tried to insert in the first place.

If the parent is full and can't handle the median entry, the parent must be split as well, with the parent's median entry (which may be different from the original median entry) sent further up the tree. Eventually, the root node might have to be split in half, and a new root created to contain the median entry. Due to the bottom-up node splitting, B-trees always grow at the root. This keeps the leaves all at the same level, and results in a well-balanced tree.

Figure 14.3 shows some insertions when the corresponding leaf nodes aren't full. Figure 14.4 shows a node being split to accommodate a new key. Figure 14.5 shows an example of two splits occurring, one which involves growing a new root.

#### (d) Median inserted into parent

When Insert( ) returns a NODE_OVERFLOW code, this means the node below was split, and the median entry is passed back for further processing.

#### (b) After two splits, including the root

Eventually, the top of the tree is reached, and the last call to Insert( ) is unwound. At this point, Add( ) checks to see if the node that was split was the root. If so, a new root is created, with the median entry passed back as the root's sole entry.

// Creates a new entry with key k and data d, and attempts to

// add the entry to the tree. Returns SUCCESS, DUPLKEY, or

// ALLOCERR.

{

Entry e(k, d);

int rv = Insert(e, root);

if (rv == NODE_OVERFLOW) { // Need to grow a new root

Coptr<Mnode> old_root(root);

new(root) Mnode;

if (root == 0) return ALLOCERR;

bh.num_nodes++: bh.height++;

root->left = old_root;

root->InsEntry(e, 0);

rv = SUCCESS;

}

if (rv == SUCCESS) bh.num_entries++;

return rv;

}

int Btree::Insert(Entry &e, Coptr<Mnode> t)

// Recursive function that tries to insert entry e into subtree t.

// Returns SUCCESS, DUPLKEY, ALLOCERR, or NODE_OVERFLOW.

// If NODE_OVERFLOW, then e becomes the median_entry to pass

// back to t's parent.

{

int rv, posn;

if (t == 0) {

// We went beyond the leaves. Pretend that we've overflowed.

e.right = 0;

return NODE_OVERFLOW;

}

// Search the node t. If entry found, send back duplicate key error.

rv = t->FullSearch(e, posn); // Use both key and data

if (rv == 0) return DUPLKEY;

// Entry not found; go down to the appropriate leaf to store entry

Coptr<Mnode> child(t);

child = t->Branch(posn);

t.Release( );            // Prevents cache from filling up during recursion

rv = Insert(e, child);   // Recursive call

// Back from recursion;  see if we have an overflow

child.Release( );        // Prevents cache from filling up

if (rv != NODE_OVERFLOW) return rv;

// We have overflowed the node below us. See if we can

// insert the entry e (now the median) into node t.

posn++;  // Move to the appropriate entry for insertion

if (!t->IsFull( )) {     // Simply add entry to node t

t->InsEntry(e, posn);

t->SetDirty( );

return SUCCESS;

}

else {  // Need to split node t.

// Create new node to be the right node. t will be the left node.

Coptr<Mnode> r(t);

new(r) Mnode;

if (r == 0) return ALLOCERR;

bh.num_nodes++;

// Move roughly half the nodes to the right node. If inserting

// entry into left node, we need to move the split position back

// one, so that when the median is taken, the split node will have

// enough entries.

int split_posn = NBR/2;

if (posn < split_posn) split_posn-;

t->Split(*r, split_posn);

t->SetDirty( );

// Determine where entry e should go

if (posn > split_posn) {  // e goes to the right node

// First entry of right node is the median

Entry median_entry = r->entry[0];

r->DelEntry(0);

// Insert e into its correct spot

r->InsEntry(e, posn-split_posn-1);

e = median_entry;     // For passing up the tree

}

else if (posn < split_posn) { // e goes to the left node

t->InsEntry(e, posn);  t->SetDirty( );

e = r->entry[0]; // Get median from the right

r->DelEntry(0);

}

// At this point, e is median entry to add to parent.

// Record what e's right pointer is going to be.

r->left = e.right;

e.right = r;

return NODE_OVERFLOW;

}

}

As you examine the Add( ) and Insert( ) routines, note how Coptrs allow us to write the routines almost as though we were working with trees in memory. The process isn't quite automatic, as we've explained in Chapter 13. For example, before a recursive call, we must release any Coptrs that might be bound--to prevent the cache from becoming full.

We must also remember to set the dirty bits for any nodes that might be modified. We've made it a convention to set dirty bits each time we modify a node, even though the dirty bits might already have been set due to prior modifications. With careful analysis, it's possible to optimize the setting of the dirty bits. But when you first write the code, the redundancy is well worth it. If you forget to set a dirty bit, the code will eventually fail, but the problem may remain undetected for a long time and may be difficult to locate.

## B-tree Rotations

This borrowing process is interesting because, in effect, we end up performing tree rotations that are the generalized versions of what we used for binary trees. Figure 14.6 illustrates both left and right rotations. When doing a left rotation, we always borrow the leftmost entry of the right sibling. In a right rotation, the rightmost entry of the left sibling is used.

When both the left and right siblings have entries to spare, it's arbitrary which sibling you borrow from. In the code we'll present later, we always choose the left sibling. In some cases, neither sibling has any extra entries. When this happens, we can combine the given node with one of the siblings. (Again, the choice is arbitrary.) This merging process involves an entry from the parent node as well.

## Merging Nodes

Figure 14.7 illustrates two examples that merge a node with its left and right siblings. In this figure, we've assumed the nodes shown are in the middle of a B-tree. As we did with the red-black trees defined in Chapter 11, we've represented external nodes (which are subtrees that may be null) using links that go nowhere and that are labeled with numbers. These numbers help you to easily see how the links are re-arranged. Note that, in the case of a left merge, the original parent entry has its right link set to the left link of the right node. Thus, we don't lose track of any nodes below ones involved in the merge. Right-hand merges have a similar linking process.

## The Deletion Functions

#### (c) Final tree

The recursion bottoms out when an entry is to be deleted from a leaf node. At this point, the entry is removed from the node. Then, RestoreBalance( ) is called to do whatever rotation or merging is necessary to ensure that the node with the deleted entry has at least the minimum number of entries. As the recursion in Delete( ) unwinds, the merging is propagated up the tree as needed. Eventually, the program execution returns to the Remove( ) function. A test is made to see if the root node has become empty due to a merge that took place below it. If so, the root is removed from the tree and the node to the root's left becomes the new root. If the old empty root is the only node in the tree, it is left empty. Thus, we also have at least one node in the tree, even if it's empty.

Here are the six functions used for deletions:

int Btree::Remove(char *k, long d)

// Deletes entry having key k and data d from the tree.

// Returns SUCCESS, or FAIL if we couldn't find the entry.

{

Entry e(k, d);

int rv = Delete(e, root);

if (rv == SUCCESS && root->IsEmpty( ) && bh.height > 1) {

// We need to shrink the tree

Coptr<Mnode> p = root;

root = root->Branch(-1);

bh.num_nodes-; bh.height-;

p.Delete( );

}

return rv;

}

int Btree::Delete(Entry &e, Coptr<Mnode> p)

// Recursive function that deletes entry e from the subtree

// with root p. Returns SUCCESS, or FAIL if we couldn't find

// the entry.

{

Coptr<Mnode> t(p), s(p);

int rv, posn, sr;

sr = p->FullSearch(e, posn); // Search node p

t = p->Branch(posn);         // Get ready to walk down the tree

if (sr == 0) {

// p has entry to delete. Replace the entry with its

// successor, if there is one.

if (t) {

s = t;

while(s->Branch(-1)) s = s->Branch(-1);

p->entry[posn] = s->entry[0];

p->Branch(posn) = t;      // Remember to restore the branch

p->SetDirty( );      // Don't forget

// Now, using recursion, delete the successor entry.

// Release the coptrs in use, to keep cache from.

// filling up during the recursion, and to prevent

// dangling pointer exceptions

p.Release( );

Entry &se = s->entry[0];

s.Release( );

if (Delete(se, t) != SUCCESS)

excp->THROW(ASSERTERR); // We should have found the entry

}

else { // At a leaf, so the deletion is easy

p->DelEntry(posn);

p->SetDirty( ); // Don't forget

bh.num_entries-;

}

rv = SUCCESS;

}

else {

if (t) {

// Walk down the tree, looking for entry

p.Release( );   // Keep cache from filling up on recursive call

rv = Delete(e, t);

}

else return FAIL;  // Couldn't find said entry

}

if (rv == SUCCESS && t && t->IsPoor( )) {

// The node below p doesn't have enough entries. We need to

// move some entries to it and thereby restore the balance.

t.Release( ); // To prevent dangling ptrs

RestoreBalance(p, posn);

}

return rv;

}

void Btree::RestoreBalance(Coptr<Mnode> p, int posn)

// Node down branch at position posn in node p has one too

// few entries. Give it an entry from either its left or right

// sibling, or perhaps just merge the node with a sibling.

{

Coptr<Mnode> t(p);

if (posn == -1) {

// We have no left sibling, so try the right sibling

t = p->Branch(0);

if (t->IsPlentiful( ))  { // Can we borrow from the right?

RotateLeft(p, 0);     // Do so

}

else {          // Right sibling has no spare entries

t.Release( ); // To prevent dangling ptr errors

Merge(p, 0);  // Merge with right node and parent entry

}

}

else if (posn == p->LastPosn( )) {

// We have no right sibling, so try the left sibling

t = p->Branch(posn-1);

if (t->IsPlentiful( )) {  // Can we borrow from the left?

RotateRight(p, posn); // Do so

}

else { // Left sibling has no spare entries.

t.Release( );   // To prevent dangling ptr errors

Merge(p, posn); // Merge with left node and parent entry

}

}

else {

// We have both left and right siblings

t = p->Branch(posn-1);

if (t->IsPlentiful( )) {  // Can we borrow from the left?

RotateRight(p, posn); // Do so

}

else {

t = p->Branch(posn+1);

if (t->IsPlentiful( ))  {  // Can we borrow from the right?

RotateLeft(p, posn+1); // Do so

}

else {

// Neither the left or right sibling has spare entries.

// Merge arbitrarily with the left node.

t.Release( );   // To prevent dangling ptr errors

Merge(p, posn); // Merging with the left

}

}

}

}

void Btree::RotateRight(Coptr<Mnode> p, int posn)

// Performs a "right rotation" using the entry at node p,

// position posn as the pivot point. Assumes p is not

// a leaf and that a left and right child exist.

// Also assumes right child isn't full, and that p and

// left child aren't empty.

{

Coptr<Mnode> l(p), r(p);

r = p->Branch(posn);   // Right child of p

l = p->Branch(posn-1); // Left child of p

// Move entry from parent p into rightmost entry of right node.

// This entry will have the old left branch of right node. (The

// left branch will be updated later.)

p->Branch(posn) = r->left;       p->SetDirty( );

r->InsEntry(p->entry[posn], 0);  r->SetDirty( );

// Now, move rightmost entry of the left node into p. This

// entry's branch becomes the right node's left pointer.

// Be sure to point entry to right node.

int last_posn = l->LastPosn( );

r->left = l->Branch(last_posn);        r->SetDirty( );

l->Branch(last_posn) = r;              l->SetDirty( );

p->entry[posn] = l->entry[last_posn];  p->SetDirty( );

l->DelEntry(last_posn);                l->SetDirty( );

}

void Btree::RotateLeft(Coptr<Mnode> p, int posn)

// Does a "left rotation" using the entry at node p,

// position posn as the pivot point. Assumes p is not

// a leaf and that a left and right child exist.

// Also assumes left child isn't full, and that p and

// right child aren't empty

{

Coptr<Mnode> l(p) , r(p) ;

r = p->Branch(posn) ;    // Right child of p

l = p->Branch(posn-1) ;  // Left child of p

// Move entry from parent p into leftmost entry of left node.

// This entry gets the left pointer of the right node.

p->Branch(posn) = r->left ;       p->SetDirty( ) ;

l->Concatenate(p->entry[posn]) ;  l->SetDirty( ) ;

// Now, move rightmost entry of the right node into p. Make

// sure this entry points to the right node. Also, we have

// a new left branch of the right node.

r->left = r->Branch(0) ;

r->Branch(0) = r ;              r->SetDirty( ) ;

p->entry[posn] = r->entry[0] ;  p->SetDirty( ) ;

r->DelEntry(0) ;                r->SetDirty( ) ;

}

void Btree: :Merge(Coptr<Mnode> p, int posn)

// Merges the node on the branch left of the entry at position

// posn of node p with the entry of p, and the node on the

// branch to the right of the entry of p. Assumes posn in range.

{

Coptr<Mnode> l(p) , r(p) ; // Ptrs to left and right children

r = p->Branch(posn) ;     // Right child of p

l = p->Branch(posn-1) ;   // Left child of p

// (1) Fix entry of p to point to left branch of r

// (2) Insert this into leftmost entry of l

// (3) Add all of r's entries

// (4) Delete entry from p, then delete r

p->Branch(posn) = r->left ;       p->SetDirty( ) ;

l->Concatenate(p->entry[posn]) ;  l->SetDirty( ) ;

l->Concatenate(*r) ;              l->SetDirty( ) ;

p->DelEntry(posn) ;               p->SetDirty( ) ;

r.Delete( ) ;

bh.num_nodes- ;

}

## BEYOND BASIC B-TREES

Searching and inserting into a variable-order B-tree is about as simple as it is for trees with fixed orders. However, deletion is more complicated (as if it weren't complicated enough). With fixed-length keys, the rotation and merge operations are fairly straightforward. We can always guarantee that one key can take the place of another, because they all have the same size. With variable-length keys, such replacements can't be guaranteed. For example, if we attempt to do a rotation when the sibling key is larger than the parent key, theres no guarantee that the sibling key will fit in the parent node.

We designed the insertion and deletion routines using recursive functions. It's possible to use an explicit stack instead, and thus convert the recursion into iteration. Each stack element must have both a node pointer and an index representing which entry in the node we are on. If you use Coptrs for the node pointers, they must be released before putting them on the stack. Because the bucket pointers for released Coptrs will still be intact, there's a good chance that when you pop the Coptrs off the stack, you can quickly rebind them to their respective buckets.

There's an important advantage that results from using an explicit stack. You can have a built-in iterator, which allows you to move back and forth through the entries in the tree in sorted order. This feature is particularly handy in such applications as a customer contact list. For example, you can search for a customer record using the customer name, and then quickly scan back and forth through other customer records in sorted order.

For information about obtaining source code for a variable-order B-tree class with a built-in iterator, see the README file on disk.

## SUMMARY

B-trees, or variations of them, have become the data structures of choice for database applications. B-trees allow fast database searching, due to the ability to optimally size the nodes to the paging requirements of a file system, and due to the relative flatness that results by using multi-way nodes. By using a cache in conjunction with the B-tree nodes themselves, you can speed up the searching even more.

While B-trees were designed with files in mind, it's possible to use memory-based B-trees. However, you're probably better off using red-black trees for trees that totally reside in memory. That's because the larger nodes of a B-tree don't provide an advantage for memory-based designs. You must search sequentially through the entries of a node, looking for the appropriate branch to take. For nodes with many entries, this can become time-consuming. (With file-based trees, searching within a node is often masked by the relatively slower file access needed to retrieve the node.) Of course, you could use a binary search to find the appropriate location in the array of entries. However, you're not much better off than if you had simply used a binary tree.