9.3.5 Searching in Ordered Hash Tables

If keys are entered into a hash table in sorted order, open-addressing and chaining techniques will always result in searches along a probe sequence or along a list, in which the keys appear in the same or reverse order, respectively. In both cases, the search algorithm may be easily modified so that whenever a key is encountered that precedes the search key in the relevant ordering, the search is immediately terminated unsuccessfully. This can significantly decrease search times for unsuccessful searches but will not affect successful search times. You can generate the new hash tables (a) through (g) of Figures 9.4 to 9.6 to see the effect on the tables of the keys appearing in sorted order.

Of course, one cannot normally control the order in which keys to be inserted appear. It is not difficult, however, to modify the insertion algorithms so that the resultant hash table is exactly the same as the hash table resulting when keys are given in sorted order. The idea of the algorithm is clever and simple. For open addressing, simply insert a key as usual unless, in following a probe sequence, a key is encountered that precedes the new key in the ordering. Then the search key is inserted in that position, and the insertion proceeds as though the replaced key were being inserted using the displaced key instead of the new key. For separate chaining, it is necessary to traverse the list at which a collision occurs, and insert the new key in its proper position of the list.

This technique does not work directly for coalesced chaining, because the insertion of a new key in a list may start from the interior of the list when a collision occurs. In order to keep coalesced lists in order, it is necessary to find the first list record or be able to traverse the list in either direction. To achieve this, the lists might be implemented as circular lists or two-way lists. Circular lists take more traversal time but less storage than two-way lists.

Keeping ordered hash tables yields significant improvements in the average number of probes required for unsuccessful searches with open addressing. These improvements are especially striking for high usage ratios.