3.1: Why Are Lists Needed?

Imagine you are a financial wizard and want to store information about your stock portfolio as records. Table 3.1 lists the information. How should these records be stored? The book so far has shown how to store them in

Individual storage elements with individual names

An array, so they can be referenced by position or from a pointer array

Dynamic memory, where they can be referenced by individual pointer variables or from an array of pointer variables

Having more than a few records rules out using individual names, a method that would also be inadequate because it does not allow the records to be treated as a group. The same is true for individual pointer variables.

The other three methods, two kinds of arrays and dynamic memory, allow not only grouping but also selection of records, as well as traversal through all the records. Figure 3.1 represents storage of the stock portfolio in each of these three ways. The three methods are called (a) sequential arrays, (b) sequential pointer arrays, and (c) sequential pointer variable arrays. Collectively they are known as sequential array methods for record storage.

The sequential array methods are convenient for recording the daily transactions common in the buying and selling of stocks, but only when records are kept in arbitrary order. A new record can then be inserted directly as the new last record. This means that in the method of sequential arrays (Figure 3.1(a)), a new record is inserted after Sears. In the method of sequential pointer arrays (Figure 3.1(b)), it can be placed in any unused slot, but the corresponding pointer goes in the seventh position of p. For the method of sequential pointer variable arrays (Figure 3.1(c)), the function malloc provides dynamic storage that is available to store the record, but again, its associated pointer variable value also goes in position 7 of p.

Table 3.1 A Stock Portfolio

Name     # Shares    Value     Date Bought

Apple      300     $ 7,512.46    3/15/84

CBS        100       8,000.55    6/16/83

Digital    200      18,400.00    9/20/83

IBM        100      11,213.25    1/20/83

IBM        200      21,400.00    4/17/84

Sears      100       3,100.00    2/10/82

Sears      100       3,200.00     4/6/84

Usually order does matter. In reality the stocks are kept in alphabetical order. In this case the sequential array methods provide neither easy nor rapid insertion or deletion capability for the records. Suppose Chrysler stock is bought, and its new record must be inserted into the portfolio; Chrysler must be inserted in its proper place as the new third record. In sequential arrays this means that all five records below its new position must be moved down. In the two pointer array methods, this means that the five pointers below its new location (position 2) must be moved down. The pointer array methods afford flexibility of record storage position, but they still require shifting, although it is pointers that are moved rather than records. Shifting takes time proportional to the number of records moved. If n records are to be inserted, then the total time can be proportional, not to n, but to n2. For example, suppose the n records each require insertion at the top. Even if there are no records stored initially, insertion of n records requires 0 + 1 + 2 + . . . + (n - 1) moves. The time adds up to n(n - 1)/2, and so the time is O(n2). Even if each insertion requires only half the information to be shifted, the time is reduced merely by a factor of 2; it is still O(n2). When n is large, this can take significant time. Of course, for one trader, n will be small, but for larger brokerage houses n might be hundreds of thousands. If they were not designed to be efficient, these insertions might take hours or even days ((300,000)2 operations ?/FONT> 10-6 seconds per operation = 25 hours).

(a) Sequential Array

(b) Sequential Pointer Array

(c) Sequential Pointer Variable Array

Figure 3.1 Sequential Methods of Storing Records in Arrays

Selling a stock, say Digital, requires that its record be deleted in the sequential array (a) or that its pointer entries be set to - 1 or null in the pointer arrays (b) and (c). Either course would leave gaps of unused storage, which is undesirable. Leaving gaps soon uses up all the slots in an array, and unless wasting storage is tolerable and affordable, this is impractical. Instead, the gap would be filled by moving up all the records below a deleted record. But again, this could take O(n2) time for the deletion of n records.

What is needed is a way to store records so that their number can grow or shrink in response to an arbitrary number of insertions and deletions, in such a way that storage is not wasted and the time for insertion and deletion is reduced. The list data structure introduced in this chapter provides one solution to this problem. A list is a collection of records linked by pointers. Even with lists, it is still necessary, as with the sequential array methods, for the program to spend time deciding where to make an insertion or deletion in the first place. To make these operations efficient when large numbers of records are involved requires more advanced data structures (introduced later). For a moderate number of records, the time taken to locate the point of insertion or deletion is tolerable. Lists called stacks and queues, where insertion and deletion occur only at the beginning and end of the lists, are especially useful even when there are many records. These will be considered in the next chapter.

Chapter 2 showed that the key advantage of the sequential array methods is selection in constant time. As just shown in this chapter, the disadvantage is that inserting or deleting entries from the interior of an array is costly. Another drawback is that arrays have fixed lengths that must be declared to the compiler. Thus, the maximum number of items to be stored in the array must be known in advance of execution. No such a priori limit need be declared for the number of items on a list. Lists provide a way to insert new items or to delete existing ones at a known point of the list, in constant time. However, selection of an arbitrary ith list item will take time proportional to i.