13.1: The Need for Storage Management

In languages such as C, arrays are static data structures. They do not change size or shape during the execution of a program. Chains, however, do change size, though they don't change shape. They are dynamic. List-structures, more complex lists, and binary and general trees can change both size and shape and thus are also dynamic data structures. The basic distinction between static and dynamic data structures relates to whether a fixed amount of memory is allocated in advance of program execution. Static structures can have a fixed amount of memory allocated for their sole use prior to program execution. For dynamic structures, the amount of memory required is not known in advance and thus cannot be allocated in this way.

In FORTRAN and COBOL, the total amount of storage needed by the variables of a program is known to the compiler and can be set aside before program execution. This means that a fixed amount of storage is allocated to variables for the entire execution of the program. C is more versatile: it uses both static and dynamic storage allocation. Some storage is allocated for the entire execution of the program, while other storage is allocated when needed and reused when no longer needed. This requires storage management during program execution. Dynamic structures such as lists and trees need even more complex storage management during program execution.

Consider a program that maintains a database of stock portfolios for the clients of a brokerage house. Suppose each client's stocks are kept in a binary search tree ordered by stock name, with each node of the tree represented as a record containing information about the stock. It is not possible to predict in advance how much storage is needed, since how many stocks each client will buy is not known. Experience may show that all clients' stocks will never number more than 2,000, but the distribution of these stocks among the clients varies throughout the year. Thus it will be necessary to have storage available for 2,000 records. At a given moment this storage will be distributed among the clients' binary search trees. Whenever a stock is sold for one client, its record must be deleted from that client's tree; the storage may be needed later for the record of a stock just bought for another client. This allocation and reclamation of record storage during program execution requires dynamic storage management. Some languages, such as C, provide the needed storage management. Others do not, so the programmer must provide it.
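In C, such a node can be declared as a record whose storage is requested when a stock is bought and returned when it is sold. The following is a minimal sketch; the type and function names, and the fixed-length name field, are illustrative choices, not part of any particular brokerage program.

```c
#include <stdlib.h>
#include <string.h>

/* One node of a client's binary search tree, ordered by stock name.
   The field names are illustrative. */
struct stocknode {
    char name[16];           /* stock name (the search key) */
    int  shares;             /* shares held                 */
    struct stocknode *left;
    struct stocknode *right;
};

/* Allocate a node when a stock is bought; storage comes from the
   system's dynamic pool rather than a fixed, preallocated array. */
struct stocknode *buystock(const char *name, int shares)
{
    struct stocknode *p = malloc(sizeof *p);
    if (p != NULL) {
        strncpy(p->name, name, sizeof p->name - 1);
        p->name[sizeof p->name - 1] = '\0';
        p->shares = shares;
        p->left = p->right = NULL;
    }
    return p;
}

/* Return a node's storage when the stock is sold, so it can be
   reused later for a record bought for any client. */
void sellstock(struct stocknode *p)
{
    free(p);
}
```

Because `malloc` and `free` draw on a single pool shared by the whole program, the 2,000 records need not be committed to any one client's tree in advance.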

This chapter discusses the fundamentals of storage management techniques. It offers insight into how programming languages manage storage and provides a better appreciation of the pitfalls that lie below the surface features of a language. Such insight is especially important when the programmer employs complex data structures that share storage.

You have already seen in Chapter 3 how the simple case of fixed-length records with no sharing might be handled. Problems arise when records have variable lengths and when data structures share memory. The remedies for these problems involve the creation and processing of data structures tailored to these needs. Sophisticated traversal algorithms have been developed to conserve storage. They are introduced here for three reasons. First, they are essential for the implementation of those data structures. Second, they are interesting in themselves. Finally, they serve as a good example of the use of data abstraction.

Although storage management is viewed here from the perspective of a single program's needs, these same techniques are employed by operating systems to deal with the allocation and reclamation of storage for a group of active programs.

In some problems, an upper bound on the problem size is not known in advance. One way to deal with this situation is to pick a reasonably large size and declare static data structures of that size. If the problem's memory requirement exceeds this size, an error message may be printed. When the number of available memory elements is fixed, there will generally be some problem size for which the program fails. However, the program may use several data structures. During execution, when one data structure is large the others may be small, so that the total amount of storage required by all the structures may actually be available at all times. With static data structures, where memory is allocated and remains dedicated to a particular structure, storage must be allocated to accommodate the worst case: the sum of the maximum storage needed by each data structure of the program. If, instead, the available memory can be used for whichever data structure currently requires it, and reclaimed for later use when that structure no longer needs it, then larger problems can be solved. This point is illustrated in the following examples.
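The "pick a reasonably large size" approach can be sketched as follows for a single stack; `MAXSIZE` and the function name are illustrative. The fixed array is dedicated to this one structure for the entire run, and the only recourse when it fills is to report an error.

```c
#include <stdio.h>

#define MAXSIZE 100              /* the "reasonably large" guess        */

static int data[MAXSIZE];        /* static structure: dedicated storage */
static int top = -1;

/* Returns 1 on success, 0 when the fixed allocation is exhausted. */
int push(int value)
{
    if (top == MAXSIZE - 1) {
        fprintf(stderr, "stack overflow: problem exceeds MAXSIZE\n");
        return 0;
    }
    data[++top] = value;
    return 1;
}
```

If the program declared several such arrays, each would have to be sized for its own worst case, even though their peak demands may never coincide.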

Example 13.1

The lists array used to store successor records for the topological sort of Chapter 11 allows all successor lists to share storage. No storage is dedicated to any individual list. As long as the total number of successors on all lists does not exceed the length of lists, the program will execute correctly. This contrasts with the alternative collection of n arrays discussed in Section 11.6. With the latter approach, storage is dedicated to each list; although twice as many total successors may be accommodated, no list can contain more than its dedicated amount.
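The shared lists array can be sketched in C as a single pool of cursor-linked entries; the sizes and identifiers below are illustrative rather than taken from the Chapter 11 program. Any list may claim any unused entry, so only the total number of successors is bounded.

```c
#define MAXSUCC 1000   /* total successors on all lists, shared */

struct listentry {
    int succ;   /* successor vertex                            */
    int link;   /* index of next entry on this list, -1 at end */
};

struct listentry lists[MAXSUCC];
int nextfree = 0;       /* entries 0..nextfree-1 are in use */

/* Add successor s to the front of vertex v's list; head[v] holds
   the index of the list's first entry, -1 for an empty list.
   Returns 0 only when the shared pool itself is exhausted. */
int addsucc(int head[], int v, int s)
{
    if (nextfree == MAXSUCC)
        return 0;
    lists[nextfree].succ = s;
    lists[nextfree].link = head[v];
    head[v] = nextfree++;
    return 1;
}
```

One list may consume nearly all of the pool while the others stay small, which is exactly the flexibility the dedicated n-array representation lacks.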

Example 13.2

Suppose five stacks are each implemented, as in Chapter 4, using static array structures, each with a maximum size of 5,000 elements. This requires 25,000 total memory locations. However, if the number of entries in all the stacks at any one time never exceeds 10,000, proper sharing of storage would allow a reduction of up to 15,000 in the amount of storage needed.

Example 13.3

A program can store, in linked representation, a number of binary trees that must be traversed using stacks. If the trees must be stored and traversed simultaneously, it is possible that available storage for the records of the tree nodes and for the entries of the stacks will not be sufficient for the worst case. However, if the available storage can be shared, so that it is used for tree records or for stack entries when needed, then the program might execute correctly.

In Example 13.1, the records sharing storage all have the same size. In Example 13.2, the five stacks can all store records of the same size, or they can store records of different lengths. Example 13.3 needs shared storage both for records, which may or may not be of the same length for each binary tree, and for stack entries. Normally the record lengths would be significantly greater than those of the stack entries. One fundamental distinction among storage management techniques involves whether or not the storage allocated will all be of the same size. When dealing with the fixed-size case, it is convenient to use the techniques for storage allocation and reclamation discussed in Chapter 3, which involve a list of available space. When dealing with more general data structures or with variable-size records, additional complications arise.
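For the fixed-size case, the list-of-available-space technique of Chapter 3 can be sketched as follows; the pool size, the one-field record, and the function names are illustrative. Allocation takes the first record off the available list, and reclamation returns a record to its front, so both operations take constant time.

```c
#include <stddef.h>

#define POOLSIZE 500

struct record {
    int info;              /* the record's data (one field here) */
    struct record *link;   /* next record on a list              */
};

static struct record pool[POOLSIZE];
static struct record *avail = NULL;   /* head of the available-space list */

/* Link every record of the pool onto the list of available space. */
void initpool(void)
{
    int i;
    for (i = 0; i < POOLSIZE - 1; i++)
        pool[i].link = &pool[i + 1];
    pool[POOLSIZE - 1].link = NULL;
    avail = &pool[0];
}

/* Allocation: take the first record off the available list,
   or return NULL when the pool is exhausted. */
struct record *avail_alloc(void)
{
    struct record *p = avail;
    if (p != NULL)
        avail = p->link;
    return p;
}

/* Reclamation: return a record to the front of the available list. */
void avail_free(struct record *p)
{
    p->link = avail;
    avail = p;
}
```

Because every record has the same size, any reclaimed record can satisfy any later request; it is the variable-size case, taken up next, that introduces the additional complications.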