12.5.4 Alphabetic Codes

Consider the twenty-seven characters with relative frequencies given in Section 12.3 as weights. Order them as D < A < B < C < ?/FONT> ?/FONT> ?/FONT> < Z. Suppose we want to construct an extended binary search tree with these characters associated with the terminal nodes. We take the weights given there as the a's of this problem and take the b's to be zero. The extended binary tree will then generate what is called an alphabetic code. The optimal binary search tree will have weighted path length 5,200, and the greedy binary search tree will have weighted path length 5,307. Recall that the Huffman tree had weighted path length 5,124. Requiring use of a binary search tree rather than a binary tree results in an increase relative to 5,124 of only (5,200 - 5,124)/5,124 or 1.5 percent. The greedy construction yields an increase relative to 5,200 of (5,307 - 5,200)/5,200 or 2 percent.