Data Structures Using C: 10.7.1 Sequential Access

10.7.1 Sequential Access

There is a natural way to store records sequentially on magnetic disk to minimize total seek time for a traversal through the records. This storage technique is based on the cylinder concept. It starts by filling blocks of the top surface track of the outermost cylinder with records. The records are in sequence within the blocks. When the track is full, the same process is repeated with the track of the next surface, and the next, and the next, until the lowest surface track is filled. At that time the outermost cylinder has been filled. Continuing in the same way, records are stored on each of the inner cylinders, starting with the next adjacent cylinder. This is similar to storing records sequentially on magnetic tape. In fact, we can picture the surfaces and cylinders laid out linearly as in Figure 10.9. This arrangement has all the inherent drawbacks of magnetic tape, whenever random accessing of records is required.

We can adapt disk storage to random accessing by taking advantage of the tab feature of the disk, which allows the head to go directly to any cylinder and surface. The problem is how to determine exactly where to go. If one knew that the record to be retrieved were the ith record, then it would be easy to calculate on which cylinder and surface it would be located. This is not often the case, since the search is conducted by means of key value, not by order in the sequence.

Figure 10.9 Sequential Access on a Magnetic Disk Pack

A linear search of the file could be done but might take O(n) time, when n records are stored in the file. This could take minutes. If the records are stored in sorted order by key value, a binary search of the file might be done. The difficulty here is that 1g n accesses are required. A million-record file would require twenty accesses, and each access might require a seek. While the binary search is better, it still might take seconds. This is reasonable for isolated requests, but during this time there may be many requests for desired retrievals. Waiting requests may build up interminally, with unlimited waiting time before all are processed. The need to insert or delete records further compounds the problem.

Just as in the case of internal memory, keeping records in blocks with pointers to succeeding blocks is a possibility, based on the concept of lists. This may avoid insertion and deletion difficulties, but it does not alleviate the problem of searching. We encountered this same problem with lists in internal memory. What about binary search trees? Even if balanced, a million records may need as much as 20 ?c seeks, where c is a small constant.