Compound Files

This book discusses four options for file I/O. You can read and write whole sequential files (like the MFC archive files you saw first in Chapter 17). You can use a database management system (as described in Chapter 31 and Chapter 32). You can write your own code for random file access. Finally, you can use compound files.

Think of a compound file as a whole file system within a file. Figure 27-1 shows a traditional disk directory as supported by early MS-DOS systems and by Microsoft Windows. This directory is composed of files and subdirectories, with a root directory at the top. Now imagine the same structure inside a single disk file. The files are called streams, and the directories are called storages. Each is identified by a name of up to 32 wide characters, counting the terminating null (so 31 usable characters). A stream is a logically sequential array of bytes, and a storage is a collection of streams and substorages.

Figure 27-1. A disk directory with files and subdirectories.

A storage can contain other storages, just as a directory can contain subdirectories. In a disk file, the bytes aren't necessarily stored in contiguous clusters. Similarly, the bytes in a stream aren't necessarily contiguous within their compound file. They only appear contiguous to the program that reads them.

Storage and stream names cannot contain the characters /, \, :, or !. If the first character has an ASCII value of less than 32, the element is marked as managed by some agent other than the owner.
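The naming rules above can be sketched as a pair of small checks. This is an illustration in portable C++, not a call into the structured storage API. The helper names IsValidElementName and IsManagedByOtherAgent are hypothetical, and the limit of 31 usable characters assumes that the 32-character limit counts the terminating null.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumption: the 32-wide-character limit includes the terminating null,
// which leaves 31 usable characters for the name itself.
constexpr std::size_t kMaxNameChars = 31;

// Hypothetical helper: true if `name` is non-empty, short enough, and
// free of the four characters that storage and stream names cannot contain.
bool IsValidElementName(const std::wstring& name)
{
    if (name.empty() || name.size() > kMaxNameChars)
        return false;
    return name.find_first_of(L"/\\:!") == std::wstring::npos;
}

// Hypothetical helper: true if the first character has a value below 32,
// which marks the element as managed by some agent other than the owner.
bool IsManagedByOtherAgent(const std::wstring& name)
{
    assert(!name.empty());
    return static_cast<unsigned long>(name[0]) < 32;
}
```

A program could run a check like this before it tries to create an element, so that a bad name is rejected early with a clear message instead of surfacing later as a failed create call.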

You can probably think of many applications for a compound file. The classic example is a large document composed of chapters and paragraphs within chapters. The document is so large that you don't want to read the whole thing into memory when your program starts, and you want to be able to insert and delete portions of the document. You could design a compound file with a root storage that contains substorages for chapters. The chapter substorages would contain streams for the paragraphs. Other streams could be for index information.
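The document layout just described can be modeled as a tree of storages and streams. The sketch below is only an in-memory model in portable C++. It does not use the structured storage interfaces, and a real compound file would leave the elements on disk and open only the ones a program needs. The type names Storage and Stream, the helpers ToBytes and BuildDocument, and the element names such as "Chapter1", "Para1", and "Index" are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>

// A stream: a logically sequential array of bytes.
using Stream = std::vector<std::uint8_t>;

// A storage: a collection of named streams and named substorages.
struct Storage {
    std::map<std::wstring, Stream> streams;
    std::map<std::wstring, std::unique_ptr<Storage>> substorages;

    // Returns the named substorage, creating it if it does not exist yet.
    Storage& OpenOrCreateStorage(const std::wstring& name)
    {
        assert(!name.empty());
        std::unique_ptr<Storage>& slot = substorages[name];
        if (!slot)
            slot = std::make_unique<Storage>();
        return *slot;
    }
};

// Converts text to the raw bytes that a stream holds.
Stream ToBytes(const std::string& text)
{
    return Stream(text.begin(), text.end());
}

// Builds the example document: one substorage per chapter, one stream
// per paragraph, and an index stream directly in the root storage.
Storage BuildDocument()
{
    Storage root;

    Storage& chapter1 = root.OpenOrCreateStorage(L"Chapter1");
    chapter1.streams[L"Para1"] = ToBytes("First paragraph of chapter 1.");
    chapter1.streams[L"Para2"] = ToBytes("Second paragraph of chapter 1.");

    Storage& chapter2 = root.OpenOrCreateStorage(L"Chapter2");
    chapter2.streams[L"Para1"] = ToBytes("First paragraph of chapter 2.");

    root.streams[L"Index"] = ToBytes("Chapter1,Chapter2");
    return root;
}
```

Deleting a paragraph in this model is a single erase on the chapter's streams map, and deleting a chapter is an erase on the root's substorages map. That mirrors the point of the design: one part of the document can be inserted or removed without rewriting the rest.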

One useful feature of compound files is transactioning. When you start a transaction for a compound file, all changes are written to a temporary file. The changes are made to your file only when you commit the transaction.
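The commit behavior can be illustrated with a small model in which pending changes live in a scratch copy and reach the committed data only on commit. In the structured storage API, this corresponds to opening a storage with the STGM_TRANSACTED flag and calling IStorage::Commit or IStorage::Revert. The class below is a hypothetical stand-in, not that API: it keeps both copies in memory instead of using a temporary file, and the names TransactedStore and TransactionWalkthrough are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a transacted storage. Edits go to `pending_`;
// `committed_` plays the role of the file on disk.
class TransactedStore {
public:
    // Writes and removals touch only the pending copy.
    void Write(const std::wstring& name, const std::string& bytes) { pending_[name] = bytes; }
    void Remove(const std::wstring& name) { pending_.erase(name); }

    // Commit publishes every pending change at once.
    void Commit() { committed_ = pending_; }

    // Revert throws away every change made since the last commit.
    void Revert() { pending_ = committed_; }

    std::size_t CommittedCount() const { return committed_.size(); }
    bool IsCommitted(const std::wstring& name) const { return committed_.count(name) != 0; }

private:
    std::map<std::wstring, std::string> committed_;
    std::map<std::wstring, std::string> pending_;
};

// Walk-through: nothing is visible until Commit, and Revert discards edits.
void TransactionWalkthrough()
{
    TransactedStore doc;
    doc.Write(L"Para1", "first draft");
    assert(doc.CommittedCount() == 0);   // not committed yet

    doc.Commit();
    assert(doc.IsCommitted(L"Para1"));

    doc.Remove(L"Para1");                // pending edits the user abandons
    doc.Write(L"Para2", "unwanted");
    doc.Revert();
    doc.Commit();                        // nothing left to publish
    assert(doc.IsCommitted(L"Para1") && !doc.IsCommitted(L"Para2"));
}
```

This is the kind of behavior that makes a Save or Cancel choice easy to offer: the program edits freely, then either commits everything or reverts to the last committed state.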