Suggested Assignments

1. a. Write programs to find and output optimal greedy trees and their weighted path lengths.

b. Explain why the greedy tree implementation with the list structure is linear in time. (Hint: When the next locally minimal triple is to be found, only list-structure records not previously accessed will be considered, as well as two records that were previously accessed.)

2. a. Write a function to read a file composed of the twenty-six letters of the alphabet, plus the blank character, the comma, and the period; produce an output file that is the encoded version of the input file. The Huffman code is to be used, with weights based on the number of times each character appears in the input text.

b. This is the same as (a) except that the "words" of the input text are taken as the objects to be encoded, and their number of occurrences is the basis for the code. Use any reasonable definition of a word. For example, take commas, periods, blanks, and any sequence of other characters to be words.

Note: For both parts, use a function getnexttoken. In part (a) it must return the next character in the input or EOF, while in part (b) it must return the next word or EOF. Also, there must be a table in which the tokens (characters or words) are stored. This table must be searched for a token, inserted into, and its entries updated. Your function must treat the table with these operations as a data abstraction, so the solutions to both parts should look the same except that the definitions of certain functions should change from one solution to the other.

c. Run your program for part (b) with the table implemented as

i. a binary search tree

ii. a hash table using double hashing

iii. a table composed of two parts梐n optimal binary search tree storing the 25 most frequently used English words (take the a's to be zero) and a binary search storing any other words that occur in the input. The optimal binary search tree is to be built in advance of the input.

d. Write a function to read a text file consisting of the twenty-six letters of the alphabet, plus the blank and period characters, and produce an output file consisting of the text in Huffman code. Use the heap implementation in generating the Huffman tree.