10.8: Summary

Files are stored in secondary storage and provide powerful facilities for the temporary and permanent storage of large amounts of data. Magnetic tapes and magnetic disks are two important devices on which to store files. These are typical of sequential and direct access devices, respectively.

Because of the structure and resultant access capabilities of these devices, internal memory techniques must be modified in order to deal efficiently and conveniently with files stored on them. It is possible to sort efficiently, even with sequential files, by using external sorting techniques. Primarily, these are merge sorts, such as straight, natural, and polyphase sorts. Other merge sorts may also decrease sorting time. In general, sequential files are suitable for tasks that can be accomplished by traversing the records of the file. These tasks include sorting and updating master files.

It is possible to achieve efficient random accessing of records with direct access devices when access is by a specific key. Hash tables are useful for this purpose. To achieve a combined capability of random plus sequential access in sorted order by a specific key requires considerable file organization. This organization involves the creation of a file directory, which may consist of a number of levels. The directory guides the search efficiently to the desired record with specific key value. A number of directory implementations are possible; directories based on B-trees are becoming increasingly popular. Such directories may be searched relatively quickly, allowing efficient insertions and deletions while utilizing storage well.

At this point, you should have filed away the substance of this book. It is hoped that you have done so to allow quick retrieval, while maintaining room for insertions. Deletions may be necessary but should be minimal.