*In September 1996, you saw how a simple tree structure can
improve the performance of a resizable-array class. The idea was not
to make every operation fast, but to keep the average cost low by
making expensive operations infrequent. That same idea appears again
this month, as John shows us how a roughly balanced tree of
circularly linked lists can be used to implement a very fast heap.
*

*--Tim Kientzle*

Heaps are simple, useful containers. A heap allows you to insert elements and extract the least element. Many heap implementations also allow you to decrease an element already in the heap, and there are a variety of less-common operations. Table 1 shows the operations supported by the Fibonacci heap implementation I'm going to discuss in this article.

One of the first uses for a heap was to implement a priority queue.
In fact, before researchers began realizing the wealth of algorithms
to which a heap is well suited, the heap was actually called a
"priority queue." Priority queues are used to keep a dynamic list of
tasks of differing priorities. The *Insert()* operation adds a
new job to the queue. The *ExtractMin()* operation extracts the
highest-priority task. If a job suddenly required a higher priority,
the *DecreaseKey()* operation would be used.

A heap can be used to implement a worst-case O(n log n) sorting
algorithm, which is the best possible bound for a sort that uses
only key comparisons. You simply insert all elements into the heap,
then *ExtractMin()* until the heap is empty. The items will be
extracted in ascending order. Heaps can also be used to construct
optimal binary trees for Huffman compression, as shown in Mark
Nelson's article "Priority Queues and the STL" (*Dr. Dobb's
Journal*, January 1996).
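As a sketch of that insert-then-extract recipe, `std::priority_queue` configured as a min-heap can stand in for *Insert()* and *ExtractMin()*; this is not the article's *FibHeap* class, just the same pattern in standard C++:

```cpp
#include <functional>
#include <queue>
#include <vector>

// Heap sort sketch: insert every element, then extract the minimum
// until the heap is empty. Items come out in ascending order.
std::vector<int> HeapSort(const std::vector<int> &items)
{
    std::priority_queue<int, std::vector<int>, std::greater<int> > heap;
    for (size_t i = 0; i < items.size(); i++)
        heap.push(items[i]);            // Insert()
    std::vector<int> sorted;
    while (!heap.empty()) {
        sorted.push_back(heap.top());   // ExtractMin()
        heap.pop();
    }
    return sorted;
}
```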

Function | Description
---|---
Insert() | Inserts a new node into the heap; the new node must contain a key value.
ExtractMin() | Removes the node of minimum key value from the heap and returns it.
DecreaseKey() | Assigns a new, smaller key value to a node; the node may need to be repositioned in the heap so that it is extracted before any node with a greater key value. A pointer to the node must be given because heaps don't support an efficient search operation.
Union() | Creates a new heap by joining two heaps given as input.
Minimum() | Returns a reference to the node containing the minimum key value, or some other indication of what that value is.
Delete() | Deletes any node from the heap. A pointer to the node must be given.

Table 1: Heap operations.

One important algorithm that uses a heap is Dijkstra's shortest-path algorithm. This algorithm finds the least expensive path through a weighted graph (also called a "network").

Dijkstra's algorithm sets the cost of each vertex (except the starting vertex) to infinity and puts all the vertices onto a heap. You then extract the cheapest vertex from the heap -- call it M -- and examine each vertex A adjacent to M. If the cost of M plus the cost of the edge joining M to A is cheaper than the current cost of A (that is, if there's a cheap path to A through M), you create a link from A to M and decrease A's key to represent the new cost. You continue extracting successive nodes until you reach T, the target vertex. The value of T is the cost of the shortest path. The links from T back to the starting vertex indicate the shortest path.

```
Input:   Graph G, vertices S (start), T (terminate)
Declare: H (initially empty heap)

 1: For all vertices v
 2:     if v == S then v.cost := 0 else v.cost := infinity
 3:     Insert v into H
 4: Repeat
 5:     M := ExtractMin(H)
 6:     For each vertex A attached to M
 7:         w := cost of edge from M to A
 8:         if (M.cost + w < A.cost)
 9:             DecreaseKey(A, M.cost + w)
10:             A.backlink := M
11: Until M = T
12: Output T.cost
13: Output vertices on chain of backlinks from T to S
```

Figure 1: Dijkstra's shortest-path algorithm.

As you can see in Figure 1, the *DecreaseKey()* on line 9 is the
most time-consuming operation of the inner loop. Since Dijkstra's
algorithm is important in network routing and other applications, it
would be nice to find a heap implementation that makes this
operation as fast as possible. This is the primary motivation for
the Fibonacci heap.

Heaps are usually implemented using binary trees, with the property
that for every subtree, the root is the minimum item. In 1984,
Michael Fredman and Robert Tarjan described a new way to implement
heaps. In particular, the Fibonacci heap, or F-heap, is
exceptionally fast at the *DecreaseKey()* and *Insert()*
operations. For a typical binary heap, these operations require
logarithmic time, while a Fibonacci heap requires only constant
time. Table 2 shows how other operations compare.

There are some caveats, however. First, a Fibonacci heap uses more
memory than a binary heap, since it requires a number of additional
data elements to keep track of the items. Second, the time
efficiency of the different operations is based on "amortized
analysis." With the binary heap, the time reflects the total time
for that operation. For an F-heap, a particular call to
*ExtractMin()*, for example, might take a very long time
because it's doing leftover work from other operations. With
amortized analysis, this additional work is accounted for in the
operations that cause the work.

The Fibonacci heap stores all elements in a collection of circular,
doubly linked lists. In addition to the *Left* and *Right*
pointers used to implement the lists, each node has a *Child*
pointer that allows it to be the parent of another circular list and
a corresponding *Parent* pointer. There is also, of course, the
application-specific key value and three additional fields called
*Degree*, *Mark*, and *Negative Infinity*.

At any given time, there is a *MinRoot* pointer that points to
the smallest item on the top list. In addition, if any item has a
child list, that item is smaller than anything on the child list.
Insertion is simple: You add the new item beside the current
*MinRoot* and possibly adjust *MinRoot* if the new item is
smaller. This always takes constant time.
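A minimal sketch of that constant-time splice, assuming only the *Key*, *Left*, and *Right* fields (the real *FibHeapNode* carries more state, and the article's *Insert()* does extra bookkeeping):

```cpp
#include <cstddef>

// Minimal node carrying only the fields of the circular, doubly
// linked root list (names follow the article; other fields omitted).
struct Node {
    int  Key;
    Node *Left, *Right;
    explicit Node(int k) : Key(k), Left(this), Right(this) {}
};

// Insert: splice NewNode in beside MinRoot, then adjust MinRoot if
// the new key is smaller. Constant time -- no traversal, no rebalance.
Node *Insert(Node *MinRoot, Node *NewNode)
{
    if (MinRoot == NULL)
        return NewNode;                  // first node: one-element ring
    NewNode->Right = MinRoot->Right;     // link into the ring
    NewNode->Left  = MinRoot;
    MinRoot->Right->Left = NewNode;
    MinRoot->Right = NewNode;
    return (NewNode->Key < MinRoot->Key) ? NewNode : MinRoot;
}
```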

The *Union()* operation is used by other heap operations. It
combines two heaps by linking the top-level lists into a single
circular list, much the way that two soap bubbles join at a point
then expand to form one larger bubble. Again, this linkage can
always be done in constant time.
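The soap-bubble splice can be sketched directly: cut each ring open at one point and cross-link the ends. The `Node` type here is the same simplified stand-in, not the article's classes:

```cpp
#include <cstddef>

// Simplified node of a circular, doubly linked root list.
struct Node {
    int  Key;
    Node *Left, *Right;
    explicit Node(int k) : Key(k), Left(this), Right(this) {}
};

// Union: join two circular root lists into one by exchanging a pair
// of Right pointers (and fixing the matching Left links). Constant
// time regardless of how many nodes each list holds.
Node *Union(Node *MinA, Node *MinB)
{
    if (MinA == NULL) return MinB;
    if (MinB == NULL) return MinA;
    Node *ARight = MinA->Right;
    Node *BRight = MinB->Right;        // cut each ring open...
    MinA->Right  = BRight;             // ...and cross-link the ends
    BRight->Left = MinA;
    MinB->Right  = ARight;
    ARight->Left = MinB;
    return (MinB->Key < MinA->Key) ? MinB : MinA;
}
```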

The *ExtractMin()* operation is the real workhorse of the
Fibonacci heap. After extracting the current *MinRoot* (and
unioning its child list into the top level), the entire top-level
list is traversed, both to find the new *MinRoot* and to
rearrange the entire heap into a more efficient structure. Each node
has a *Degree* variable that indicates the number of direct
children of that node. The heap is rearranged so that no two nodes
on the top list have the same degree. One interesting property is
that the total number of descendants of any node will be about
2^*Degree*. For example, suppose you insert 12 nodes into a
Fibonacci heap. The heap then looks like a single circular list.
After you extract the minimum element, there will be 11 nodes left.
Three of these will be on the root list, with degrees of 0, 1, and
3. The node of degree three would be the root of a subtree with a
total of eight nodes.

The detailed operation of *ExtractMin()* involves traversing the
top-level loop and keeping an array with one element for each
possible degree value; see Listing One.
As you traverse the loop, you put a pointer to that element into the
array. If there's already an element of that degree, you add the
larger element to the child list of the smaller, which increases the
degree of the smaller. This smaller element is then put back into
the array, which may require repeating the process. Figure 2
shows how this works.
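The degree-array pairing step can be sketched as follows. `Tree` is a simplified stand-in for *FibHeapNode* (a plain child vector instead of a circular child list), and the linking behaves like binary addition with carries; the 11-node example above falls out of it:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Simplified tree: the larger-keyed of two equal-degree roots becomes
// a child of the smaller, whose degree then goes up by one.
struct Tree {
    int Key;
    int Degree;
    std::vector<Tree*> Children;   // stand-in for the circular child list
};

// Consolidate sketch: a degree-indexed array holds at most one tree
// per slot. A collision links two trees and "carries" the result into
// the next slot, exactly like binary addition.
void Consolidate(std::vector<Tree*> &roots)
{
    std::vector<Tree*> byDegree(32, (Tree*)NULL);
    for (size_t i = 0; i < roots.size(); i++) {
        Tree *t = roots[i];
        while (byDegree[t->Degree] != NULL) {   // collision: link trees
            Tree *other = byDegree[t->Degree];
            byDegree[t->Degree] = NULL;
            if (other->Key < t->Key)
                std::swap(t, other);            // keep the smaller on top
            t->Children.push_back(other);
            t->Degree++;                        // carry to the next slot
        }
        byDegree[t->Degree] = t;
    }
    roots.clear();                              // rebuild the root list
    for (size_t d = 0; d < byDegree.size(); d++)
        if (byDegree[d] != NULL)
            roots.push_back(byDegree[d]);
}
```

Eleven single-node trees (binary 1011) consolidate into roots of degrees 0, 1, and 3, matching the example in the text.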

*DecreaseKey()* specifies a node in the heap and a new, smaller key
value. If the new key is smaller than the parent's key, the heap
property is violated and the node must be moved up. The Fibonacci
heap simply removes the
node from its current parent (reducing the parent's degree), and
inserts it into the root list, where it could become the new
*MinRoot*. The Fibonacci heap then does some additional work to
try to maintain the overall balance of the heap; see Listing
Two.

The Fibonacci heap relies on maintaining a certain rough balance to
its structure, and moving many items from deep in the heap to the
top can upset that balance. This is the purpose of the *Mark*
field. Whenever a child is moved out of a parent during a
*DecreaseKey()* operation, the parent's *Mark* field is
set. If the *Mark* field is already set, it indicates a second
child being lost, so the parent is also moved to the top. This
"cascading cut" must then consider the next node up, which is now
losing a child. This process stops when it reaches a node that is
not *Mark*ed, or when it reaches a node on the root list (a
node that has no parent).
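A sketch of the cascading-cut walk; `Node` keeps only the fields the walk touches, and `CutToRoot()` is a hypothetical helper standing in for the real list surgery:

```cpp
#include <cstddef>

// Only the fields the cascading cut inspects.
struct Node {
    Node *Parent;
    bool  Mark;
    bool  OnRootList;
};

// Hypothetical helper: detach a node and place it on the root list.
// Roots are never marked, so the mark is cleared here.
static void CutToRoot(Node *n)
{
    n->Parent = NULL;
    n->Mark = false;
    n->OnRootList = true;
}

// After a child has been cut away from 'parent', walk upward. An
// unmarked parent is marked and the walk stops; a marked parent has
// now lost a second child, so it is cut to the root list too and the
// walk continues. The walk also stops at any root-list node.
void CascadingCut(Node *parent)
{
    while (parent != NULL && parent->Parent != NULL) {
        if (!parent->Mark) {
            parent->Mark = true;      // first loss: mark and stop
            return;
        }
        Node *next = parent->Parent;  // second loss: cut, keep climbing
        CutToRoot(parent);
        parent = next;
    }
}
```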

To see why the *DecreaseKey()* operation manages to have
constant amortized time, even with this cascading cut, consider a
Fibonacci heap with N elements, and suppose you perform N
*DecreaseKey()* operations. Although any particular operation
could move a large number of subtrees, I want to show that the total
number of subtrees moved is proportional to N. How many times can
each node be visited during a cascading cut? The first time it's
visited, it will be marked. The second time, it will be moved to the
root list. So, each node can be visited twice before it becomes a
root node. Since each cascading cut stops the first time it hits a
root node, the total number of root-node visits can't exceed the
number of cascading cuts, which is at most N (one per
*DecreaseKey()*). Thus, at most 3N node visits are required for N
*DecreaseKey()* operations.

Although the full analysis requires more complex methods to account
for mixing the various operations, this gives you some idea of how
*DecreaseKey()* can achieve constant amortized time.

The *Delete()* operation is actually remarkably simple. Each
node has a special Negative Infinity flag that forces it to have a
value smaller than any other node. To delete a node, you effectively
decrease the key to negative infinity, then extract it. Since
*ExtractMin()* is O(log n) amortized, so is *Delete()*.
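The same trick can be demonstrated on an ordinary array-based binary heap, where a sentinel key plays the role of the Negative Infinity flag. This is a sketch: the index-based *DecreaseKey()* differs from the article's pointer-based interface, and `MinHeap` is a hypothetical name:

```cpp
#include <algorithm>
#include <climits>
#include <cstddef>
#include <vector>

// Minimal array binary heap, keys only. Delete() lowers the doomed
// entry to a "negative infinity" sentinel so that it floats to the
// top, then ExtractMin() removes exactly that entry.
struct MinHeap {
    std::vector<int> a;

    void SiftUp(size_t i) {
        while (i > 0 && a[i] < a[(i - 1) / 2]) {
            std::swap(a[i], a[(i - 1) / 2]);
            i = (i - 1) / 2;
        }
    }
    void SiftDown(size_t i) {
        for (;;) {
            size_t s = i, l = 2 * i + 1, r = 2 * i + 2;
            if (l < a.size() && a[l] < a[s]) s = l;
            if (r < a.size() && a[r] < a[s]) s = r;
            if (s == i) return;
            std::swap(a[i], a[s]);
            i = s;
        }
    }
    void Insert(int k) { a.push_back(k); SiftUp(a.size() - 1); }
    int  ExtractMin() {
        int m = a[0];
        a[0] = a.back();
        a.pop_back();
        if (!a.empty()) SiftDown(0);
        return m;
    }
    void DecreaseKey(size_t i, int k) { a[i] = k; SiftUp(i); }
    void Delete(size_t i) { DecreaseKey(i, INT_MIN); ExtractMin(); }
};
```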

I've used the same basic design to build a dynamic binary heap. This
is an unusual thing to do. Usually, binary heaps are declared to be
of a certain size and don't grow or shrink. An expandable binary
heap must be constructed with great care to ensure that the
*Insert()* and *ExtractMin()* operations don't degrade to
O(n) behavior. Specifically, you must grow the heap only about log
n times, doubling the heap array size each time an expansion is
necessary. This ensures that the nodes will be copied to a new
array a total of at most 1+2+4+...+*n* = 2*n*-1 times, so the cost
of expansion is constant per insertion. Likewise, the heap array
should be reduced to half its size when it is only one-quarter full.
If it is reduced to half size at precisely the point when it is half
full, then a sequence of *Insert()* and *ExtractMin()*
operations could alternate right at that moment, causing expansion
and shrinking at every operation.
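A sketch of just that resizing policy (the sift operations of a real heap are omitted, and `DynArray` is a hypothetical name). The gap between the doubling threshold and the quarter-full shrinking threshold is what prevents an alternating *Insert()*/*ExtractMin()* sequence from reallocating on every call:

```cpp
#include <cstddef>
#include <vector>

// Dynamic array with the grow/shrink policy described in the text:
// double when full, but only halve when one-quarter full.
class DynArray {
public:
    DynArray() : data(4), count(0) {}
    void Push(int x) {
        if (count == data.size())
            data.resize(data.size() * 2);      // grow: double
        data[count++] = x;
    }
    int Pop() {
        int x = data[--count];
        if (count > 0 && count <= data.size() / 4)
            data.resize(data.size() / 2);      // shrink: halve at 1/4 full
        return x;
    }
    size_t Capacity() const { return data.size(); }
private:
    std::vector<int> data;
    size_t count;
};
```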

The source code for this Fibonacci heap implementation (available
electronically; see "Availability," page 3) defines two classes:
*FibHeapNode* and *FibHeap*. The *FibHeap* class
provides the heap operations such as *Insert()* and
*ExtractMin()*. It is intended to be used as is; no subclassing
is required. The *FibHeapNode* class should be subclassed to
store your particular data and to redefine the virtual functions
used by *FibHeap*.

The file FIBTEST.CPP shows how this subclassing should be done. In
particular, when overriding the assignment, equality, and less-than
functions, you must call the corresponding protected base-class
function first. For example, *FHN_Cmp()* handles the Negative
Infinity test, so you should not do your own test if the base class
function indicates a meaningful value.

The F-heap provides a great enhancement to Dijkstra's algorithm and
to other algorithms that can use *DecreaseKey()* effectively.
The big surprise, though, is that the F-heap is even competitive on
regular heap algorithms like heap sorting. The test programs
generate random test data and compare the output against
*qsort()* to make sure the heap is operating correctly. To
determine the overall time complexity, you can compare the time for
2048-element and 1024-element data sets. If the process is O(n log
n), the ratio should be 2.2 to 1. The test program averages a ratio
of 2.17 for a binary heap and 2.07 for a Fibonacci heap. The
decreased ratio is due to the fact that more of the F-heap
operations execute in constant amortized time, including the
*Insert()* and *DecreaseKey()* operations. In addition, the Fibonacci heap
test runs about 15 percent faster than the binary heap test. Even if
you remove the *DecreaseKey()* operation from the test, the
Fibonacci heap is still over 10 percent faster. For algorithms like
Dijkstra's, the difference can be two to ten times faster on a
1024-vertex graph.

Cormen, T., C. Leiserson, and R. Rivest. *Introduction to
Algorithms*. Cambridge, MA: MIT Press, 1990.

Fredman, M. and Tarjan, R. "Fibonacci Heaps and Their Uses in
Improved Network Optimization Algorithms.*" Journal of the
Association for Computing Machinery*, vol. 34, no. 3, July
1987.

**DDJ**

```cpp
//=====================================================================
// ExtractMin() - O(n) worst-case actual; O(lg n) amortized
//=====================================================================
FibHeapNode *FibHeap::ExtractMin()
{
    FibHeapNode *Result;
    FibHeap *ChildHeap = NULL;

    // Remove minimum node and set MinRoot to next node
    if ((Result = Minimum()) == NULL)
        return NULL;
    MinRoot = Result->Right;
    Result->Right->Left = Result->Left;
    Result->Left->Right = Result->Right;
    Result->Left = Result->Right = NULL;
    NumNodes--;
    if (Result->Mark) {
        NumMarkedNodes--;
        Result->Mark = 0;
    }
    Result->Degree = 0;

    // Attach child list of Minimum node to the root list of the heap
    // If there is no child list, then do no work
    if (Result->Child == NULL) {
        if (MinRoot == Result)
            MinRoot = NULL;
    }
    // If MinRoot==Result then there was only one root tree, so the
    // root list is the child list of that node (NULL if this is the
    // last node in the list)
    else if (MinRoot == Result)
        MinRoot = Result->Child;
    // If MinRoot is different, then the child list is pushed into a
    // new temporary heap, which is then merged by Union() onto the
    // root list of this heap.
    else {
        ChildHeap = new FibHeap();
        ChildHeap->MinRoot = Result->Child;
    }

    // Complete the disassociation of the Result node from the heap
    if (Result->Child != NULL)
        Result->Child->Parent = NULL;
    Result->Child = Result->Parent = NULL;

    // If there was a child list, then we now merge it with the
    // rest of the root list
    if (ChildHeap)
        Union(ChildHeap);

    // Consolidate heap to find new minimum and do reorganize work
    if (MinRoot != NULL)
        _Consolidate();

    // Return the minimum node, which is now disassociated with the
    // heap. It has Left, Right, Parent, Child, Mark and Degree cleared.
    return Result;
}

//====================================================================
// Consolidate(). Internal function that reorganizes heap
```