1.1: Structure: Data and Program

Two programs doing exactly the same task may look entirely different. One may be highly readable, concise, and easily adjusted to carry out related tasks. The other may be impenetrable, lengthy, and difficult to modify. The same two may differ so much in execution time and storage needs that one program executes while the other aborts, having exceeded time or space restrictions. Experience has shown such differences to be traceable to choice of program structure and data structure.

Programs are written to solve real problems. Program structure breaks the problem and its solution down into simpler, more easily understood parts. The information to be processed is stored in data structures (arrays, records, lists, stacks, trees, and files) provided by the programming language. A data structure groups data. The right data structure for an operation can make it simple and efficient; the wrong one can make the operation cumbersome and inefficient.

Data structures contain information and are operated on during the execution of a program. Programs are said to process information when, in fact, they process data structures. Thus it is not surprising that data and program structure are important and need to be properly related to achieve the goal of successful programming. How this is done is the subject of this book. The story is fascinating and sometimes involved, but it has a happy ending: you master practical programming techniques. These techniques provide the programmer with the means to keep a tight rein on programs so that they remain understandable, maintainable, and efficient, even as they grow in size.

There is structure, then, in both the data and the program itself, and both program and data structure must be appropriate to their tasks. This book presents data structures, not as an isolated theoretical subject, but as an essential object of the problem-solving process leading to the creation of a good program.Three main themes are stressed.

1. Data structures and their impact on programs

2. Programming methodology and its impact on programs

3. Consolidation and extension of knowledge of a high-level language

Data structures are important because the way the programmer chooses to represent data significantly affects the clarity, conciseness, speed of execution, and storage requirements of the program. The text shows how to use particular data structures to create correct and efficient programs. You will see that the operations to be performed on the data determine the best data structures to use.

Developing programs is difficult, especially when done without good guidelines. Creating programs entails solving problems. Top-down programming, another name for structured programming, makes program development easier and leads to well-structured programs. It is the controlled breaking down of a complex task into simpler ones whose individual solutions easily combine to carry out the original task. This process gives structure to the development of programs (hence its name). The top-down, or structured, approach, combined with good programming style, is emphasized in this text. Elements of good programming style include the use of data abstractions and functional modularization, which significantly enhance the readability and changeability of programs.

Data abstraction treats a collection of data by abstracting its important aspects while ignoring as much detail as possible. A data abstraction reduces data to a collection, and to those processing operations directly allowed on that collection. The effect is as though the collection were in an impenetrable black box, the only access to the contents of the box being through the invocation of one or more of the allowed operations. How the collection is actually stored within the box, and how the operations are actually performed, become irrelevant details. Such details determine the efficiency of a program but do not affect its logic.

Functional modularization is the use of a special program unit to carry out a task. In FORTRAN such a unit would be a function or subroutine, in Pascal a function or procedure, in COBOL a paragraph, and in C a function. These program units are used to carry out meaningful processing tasks. C programmers use the word modularization in two ways. One means functional modularization, and the other has to do with packaging a data abstraction using one or more related files. When the first meaning is intended, this book uses functional modularization. The second interpretation appears only in Chapter 5 and will be addressed there.

Practice may not make perfect but seems to be the best way to make problem solvers and programmers out of computer science students. As a student you must repeatedly practice creating programs, because repetition is what facilitates learning. As in learning to dance, draw, or drive, no matter how closely you watch others perform, you can develop skills only by practice.

Finally, whatever high-level programming language you use, studying and understanding the material in this text will enhance your knowledge of that language and your ability to use it effectively. It is assumed that you are familiar with some high-level programming language. Such languages share many features, but each is unique. Although this text occasionally indulges in comparative investigations, all programs herein are written in VAX C, version 2.2, and run on a VAX 11/780 under VMS.

Horowitz [1983] is a collection of reprints of the landmark papers in programming languages. For background in C, see the introductory texts by Kelley and Pohl [1986], Miller and Quilici [1986], and the classic book by Kernighan and Ritchie [1978]. A good reference is Harbison and Steele [1987]. Advanced topics can be found in Berry [1986] and Lapin [1987]. For interesting reading in structured programming, the reader should study the texts by Dahl, Dijkstra, and Hoare [1972] and Wirth [1973].