10.4: Sequential Access in External Memory

A range of devices is now available for use as secondary storage, the most common being magnetic tapes and disks. Tape cassettes like those used with stereo equipment are prototypical of the magnetic tape used for secondary storage. If a song is being played on such a tape, the time required to access another song on the tape depends on how far away from the current position that song is. This accessibility is the same for magnetic tapes used as secondary storage. Such a tape can be pictured as containing a sequence of fixed-length blocks (Figure 10.4). Each block contains a fixed number of records of a file (say, n per block). One or more input buffer areas (reserved storage areas) are set aside in internal memory, each large enough to hold one block of records. These buffers are required in order to allow for efficient reading and writing of files.

Figure 10.4 Sequential Access on Magnetic Tape

Suppose one buffer is set aside. To read a block of records into the buffer, the tape must be moved at sufficient speed past a read head. This speed determines the rate at which data are transferred to the buffer. If a sequential file is stored on magnetic tape, executing the open statement causes the buffer storage to be allocated, the tape to be rewound so that the first block may be read, and the first block to be read into the buffer. After these operations have occurred, the buffer and the file pointer can be visualized as in Figure 10.5.

If a read is executed, the first record is copied from the buffer into its memory location, and the file pointer is advanced to the next buffer position. Consecutive executions of a read for this file cause similar actions. Finally, when n such reads have occurred, the next block of records from the tape is placed into the buffer, the file pointer is automatically repositioned at the first record of the buffer, and the process is repeated. Had two buffers been set aside, as the current buffer is being read the other would be filled with the next block of file records. A similar procedure takes place for the writes with respect to an output buffer for the file. In this way the buffer is being filled, when reading from a file (or emptied, when writing to a file), at the same time that the computer is carrying out other commands of a program. Record blocking is done because n records can be read or written as a block much faster than individually, since it reduces tape starts and stops. Today's computers, whether mainframe or personal, can execute on the order of 100,000 commands in the time it takes for a block to be read into or written from the buffers. A program may actually process n records of the file while the next block is being transferred to the buffer. For example, the records may be sorted or individually processed during this transfer time. In this case, the time required to process the entire file is determined by the time required to scan the file in this fashion. Thus the number of accesses to secondary storage determines the execution time of the program. This may take minutes, which is a very long time in computer terms.

Figure 10.5 Input Buffer for the File

It should now be clear why sequential files should not be used when records must be accessed in an order other than the order in which they appear on the file. Attempting to access the records in random order may mean spending great amounts of computer time, since minutes may be required for each access. Instead, the typical use for sequential files is when records are to be processed in turn, starting from the first record. In other words, when the processing to be done can be accomplished by a traversal through the records of the file, then sequential files are appropriate. This is typical of external sorting, copying, updating, and printing (such as end-of-the-month billings).

As a practical example, suppose you are given the master file, bookinventory, of Example 10.3, and another transaction file, booktransactions. Booktransactions consists of records representing the arrival of new books for inventory and the sales of books in the last week. If both files are sorted on the same key (say, title), it is possible to traverse each of the files once and update each master file record to reflect new arrivals and sales. A new file must be created for this updated master file. This step would actually be advantageous for security purposes, since the old master file and booktransactions file could be saved. If anything were to happen to the new master file, the procedure could be repeated and the file reconstructed.