PREFACE

This book is about designing practical data structures. The practical orientation of this book is due, in no small part, to the fact that I was first trained as a mechanical engineer, and so my tendency is to design things from an engineering point of view, always balancing the ideals of theory with the reality of practice. The engineer in me has always been eager to understand the trade-offs involved in using one approach over another.

This book is the first part of a planned two-volume set. A subsequent volume, Practical Algorithms in C++, will pick up where this book leaves off, using the data structures designed in this book as tools to be used in many different algorithms. This second volume will be written from the same practical perspective.

In writing these two books, I hope to impart to you some of the interest and knowledge I have gained over the years on the subjects of data structures and algorithms, and to show you how elegantly and efficiently these constructs can be implemented in C++.

HOW THIS BOOK IS DIFFERENT

A quick trip down to your local bookstore will show you that surprisingly few books cover the topic of data structures, let alone show you how to code them in C++. This is surprising because data structures are such a fundamental part of the programming process. In the books that do exist on data structures, the subject matter is covered more from a theoretical viewpoint, with sparse and incomplete coding examples. Many of these books leave a lot of the more interesting and complicated techniques as exercises for the reader.

This book, in contrast, was designed to cover data structures from a practical point of view, with as many complete examples as possible. There are no exercises left for the reader. You won't have to spend days trying to figure out how to accomplish something described in the book. If you're like me, you want to put the techniques to use immediately, to see if they are promising for your own applications. Only then do you want to dissect the code to see what makes it tick.

The data structures shown in this book are the most efficient, elegant, and robust designs that I know of. I didn't just code up the data structures using the first approach that came to mind. Instead, many different designs were implemented and studied to find those that seemed to work the best.

The data structures were crafted specifically for C++. This isn't just a rewrite of a C book with a few programs changed to use classes. With each data structure, I spent much time (literally months) trying to determine how C++ could best be put to use.

Since it's not feasible to show all of the code in the manuscript itself (due to space limitations), only the salient features of each technique are given. The disk included with this book has everything typed in, ready to run. Many of the programs on disk are designed to thoroughly test the data structures presented in this book.

WHO SHOULD READ THIS BOOK

This book is written primarily for intermediate-level C++ programmers. However, the book has been designed to be useful to both those programmers with little C++ experience as well as those at an advanced level. This is accomplished by approaching each subject at a basic level, providing simple techniques, and then progressing to more exotic and advanced techniques. Thus, if you are at the starting point of the C++ learning curve, you can work with the early parts of each chapter. As you become more proficient, you can tackle the more advanced sections. If you're a more experienced programmer, you can use the first parts of each chapter as refresher material, and then quickly move on to the advanced material.

I assume that you know the basics of C++. The focus is on learning how to code data structures, not on learning C++. One exception is the use of templates, a feature relatively new to C++ that allows you to defne parameterized types. I devote the better part of a chapter to templates, since they are used quite heavily throughout the book. I use templates because they allow me to show data structures in a generic way, and they make it easy for you to "plug-in" the specific types of objects you wish to store in the data structures.

WHY C++?

In my early years of programming, I was somewhat frustrated by the lack of a programming language that was both expressive and efficient enough for day-to-day use. The languages at hand were either high level but unfortunately too slow, or fast and regrettably too low level. Then came C++, the language I had been looking for. C++ is both expressive enough to code data structures in an elegant way, yet low level enough to allow efficient programs to be constructed.

Traditionally, most books covering data structures have used Pascal or some other Algol-like language. Pascal has been heavily favored, due to its reputation as a good language for teaching. The problem is that many professional programmers do not use Pascal, but rather C and C++. That's because standard Pascal lacks many features needed for writing real-world applications. Many dialects of Pascal have been invented to make up for these deficiencies, but these dialects are not standard, and you are tied to specific compiler vendors for their implementations.

C, on the other hand, is powerful enough to be used for a wide range of applications, without needing wildly divergent dialects. But because of C's relative terseness and low-level characteristics, many authors have rejected using C as a language for teaching data structures. However, C++ overcomes many of these difficulties, due to its superior type checking and encapsulation facilities. It does this without losing the efficiency of C. Thus, I feel C++ makes an excellent choice for showcasing the most practical data structures in current use.

HOW THIS BOOK IS ORGANIZED

This book includes fourteen chapters, progressing from the basics to more advanced material. The chapters are:

1. Basic Concepts looks at the design of data structures from a very general perspective, showing what all data structures have in common. Also, you learn the ways that you can judge the effectiveness and practicality of one data structure over another.

2. Using C++ Effectively provides a review of the features of C++ as they apply to the design of data structures. The emphasis is on creating safe, robust, and efficient classes.

3. Data Structure Design in C++ discusses the four main approaches to data structures in C++. The difference between concrete data types and abstract data types is explained, along with a thorough discussion of templates, as well as the alternative approach of using single- rooted hierarchies.

4. Basic Array Design takes you step by step through the design of efficient, yet safe arrays. Covered are fixed-length, variable-length, resizable, statically allocated, and dynamically allocated arrays. Techniques for creating multidimensional arrays are also given.

5. Heterogenous Arrays discusses the issues involved in designing arrays to hold different types of data. Integral to this discussion is the use of "smart pointers" to reference-counted objects.

6. Strings combines the techniques explained in Chapters 4 and 5 in designing a safe and efficient string class.

7. Vectors and Matrices shows a relatively new way to design vector and matrix data structures using the notion of vector strides and shared representations of the underlying matrix data. You'll see objects invented by the author, called vector pointers, which allow unique and efficient implementations of sub-matrices and matrix transpositions.

8. Linked Lists takes you step by step through the design of linked lists as high-level objects in their"own right. Contiguous (array-based) and non-contiguous (pointer-based) implementations are given for both singly-linked and doubly-linked lists. You'll see ways to improve the efficiency of allocating and de-allocating nodes used for linked lists.

9. Stacks and Queues takes a detailed look at the design of stacks and queues, two related data structures, with efficiency and flexibility in mind. Both array-based and list-based designs are given.

10. Binary Trees introduces another fundamental data structure, trees, often used in database searching applications. This chapter focuses on binary trees and, in particular, binary search trees. You'll learn how binary trees can be searched and how to efficiently traverse and print trees.

11. Balanced Trees is a continuation of Chapter 10 and describes how to balance search trees to improve their performance. Covered are 2-3-4 trees and their binary analogues, red-black trees.

12. Splay Trees shows an intriguing type of self-organizing binary search tree. These trees have amortized costs that are equivalent to ordinary binary trees, yet splay trees can adapt to changing situations to provide very efficient searching. For instance, you'll see how splay trees can be used to implement priority queues.

13. File-Based Objects shows a unique and never before published technique of caching objects to disk. A new type of pointer, cached-object pointers (Coptrs), is implemented. With Coptrs, you can use cached objects almost like objects residing totally in memory. A file-based object heap allocator is also given, which works in conjunction with the caching.

14. B-trees is the culmination of the book and discusses perhaps the most widely used data structure for database applications, B-trees. An implementation is given for this important data structure that works from disk and uses the file-based caching techniques explained in Chapter 13.

WHAT YOU NEED

Every attempt was made to use generic C++ code in this book, with little that relies on any particular machine or compiler vendor. The code was written to be compliant with the AT&T C++ 3.0 specification. You'll need a compiler that's at least up to the AT&T C++ 2.1 specification and that also supports templates and the user-defined syntax of built-in types. If your compiler does not support templates, it's still possible to use the code in this book, though you'll have to make modifications to remove the template syntax. Chapter 3 discusses how to make such modifications.

The disk that's included with this book is formatted for MS-DOS, and project files are supplied so that you can easily compile the examples using Borland C++ 3.1, currently the only MS-DOS compiler that supports templates. (Many other compiler vendors will no doubt have templates implemented by the time you read this.) The code was extensively tested using Borland C++ 3.1, and was also tested (but not extensively) on Unix using the latest version of Cfront possible.

ACKNOWLEDGMENTS

I would like to thank Holly Mathews for her support and understanding in seeing me through the massive undertaking that was this book. Thanks go as well to my editor Paul Farrell at Wiley for the seemingly infinite amount of patience required to see this book come to fruition. I would also like to thank the production people at the Coriolis Group for doing a fine job on the layout. Special thanks go to technical reviewer Michael Mohle for providing a "sanity check" of the techniques presented, and for his efforts in ensuring that the code runs under Unix. Finally, my apologies to Dog Zilla, for all the walks he missed as I was writing this book.

CONTACTING THE AUTHOR

If you have questions about the contents of this book, you may contact me by mail at the address: Azarona Software, Practical Data Structures in C++, P.O. Box 768, Conifer, CO 80433. I can also be contacted on CompuServe[73057,3172], and on Bix [bflamig].

Bryan Flamig

Conifer, CO

February, 1993

Go to Chapter 1 Return to Table of Contents