Chapter 17. INPUT, OUTPUT, AND FILES

You will learn about the following in this chapter:

An Overview of C++ Input and Output
Output with cout
Input with cin
File Input and Output
Incore Formatting
What Now?
Summary
Review Questions
Programming Exercises

Discussing C++ input and output (I/O, for short) poses a problem. On the one hand, practically every program uses input and output, and learning how to use them is one of the first tasks facing someone learning a computer language. On the other hand, C++ uses many of its more advanced language features to implement input and output, including classes, derived classes, function overloading, virtual functions, templates, and multiple inheritance. Thus, to really understand C++ I/O, you must know a lot of C++. To get you started, the early chapters outlined the basic ways for using the istream class object cin and the ostream class object cout for input and output. Now we'll take a longer look at C++'s input and output classes, seeing how they are designed and learning how to control the output format. (If you've skipped a few chapters just to learn advanced formatting, you can skim the sections on that topic, noting the techniques and ignoring the explanations.)

The C++ facilities for file input and output are based on the same basic class definitions that cin and cout are based on, so this chapter uses the discussion of console I/O (keyboard and screen) as a springboard to investigating file I/O.

The ANSI/ISO C++ standards committee has worked to make C++ I/O more compatible with existing C I/O, and this has produced some changes from traditional C++ practices.

An Overview of C++ Input and Output

Most computer languages build input and output into the language itself. For example, if you look through the lists of keywords for languages like BASIC or Pascal, you'll see that PRINT statements, writeln statements, and the like, are part of the language vocabulary. But neither C nor C++ has built input and output into the language. If you look through the keywords for these languages, you find for and if, but nothing relating to I/O. C originally left I/O to compiler implementers. One reason for this was to give implementers the freedom to design I/O functions that best fit the hardware requirements of the target computer. In practice, most implementers based I/O on a set of library functions originally developed for the UNIX environment. ANSI C formalized recognition of this I/O package, called the Standard Input/Output package, by making it a mandatory component of the standard C library. C++ also recognizes this package, so if you're familiar with the family of C functions declared in the stdio.h file, you can use them in C++ programs. (Newer implementations use the cstdio header file to support these functions.)

C++, however, relies upon a C++ solution rather than a C solution to I/O, and that solution is a set of classes defined in the iostream (formerly iostream.h) and fstream (formerly fstream.h) header files. This class library is not part of the formal language definition (cin and istream are not keywords); after all, a computer language defines rules for how to do things, such as create classes, and doesn't define what you should create following those rules. But, just as C implementations come with a standard library of functions, C++ comes with a standard library of classes. At first, that standard class library was an informal standard consisting solely of the classes defined in the iostream and fstream header files. The ANSI/ISO C++ committee decided to formalize this library as a standard class library and to add a few more standard classes, such as those discussed in Chapter 16, "The string Class and the Standard Template Library." This chapter discusses standard C++ I/O. But first, let's examine the conceptual framework for C++ I/O.

Streams and Buffers

A C++ program views input or output as a stream of bytes. On input, a program extracts bytes from an input stream, and on output, a program inserts bytes into the output stream. For a text-oriented program, each byte can represent a character. More generally, the bytes can form a binary representation of character or numeric data. The bytes in an input stream can come from the keyboard, but they also can come from a storage device, such as a hard disk, or from another program. Similarly, the bytes in an output stream can flow to the screen, to a printer, to a storage device, or to another program. A stream acts as an intermediary between the program and the stream's source or destination. This approach enables a C++ program to treat input from a keyboard in the same manner it treats input from a file; the C++ program merely examines the stream of bytes without needing to know from where the bytes come. Similarly, by using streams, a C++ program can process output in a manner independent of where the bytes are going. Managing input, then, involves two stages:

Associating a stream with an input to a program
Connecting the stream to a file

In other words, an input stream needs two connections, one at each end. The file-end connection provides a source for the stream, and the program-end connection dumps the stream outflow into the program. (The file-end connection can be a file, but it also can be a device, such as a keyboard.) Similarly, managing output involves connecting an output stream to the program and associating some output destination with the stream. It's like plumbing with bytes instead of water (see Figure 17.1).

Figure 17.1. C++ input and output.

Usually, input and output can be handled more efficiently by using a buffer. A buffer is a block of memory used as an intermediate, temporary storage facility for the transfer of information from a device to a program or from a program to a device. Typically, devices like disk drives transfer information in blocks of 512 bytes or more, while programs often process information one byte at a time. The buffer helps match these two disparate rates of information transfer. For example, assume a program is supposed to count the number of dollar signs in a hard-disk file. The program could read one character from the file, process it, read the next character from the file, and so on. Reading a file a character at a time from a disk requires a lot of hardware activity and is slow. The buffered approach is to read a large chunk from the disk, store the chunk in the buffer, and read the buffer one character at a time. Because it is much quicker to read individual bytes of data from memory than from a hard disk, this approach is much faster as well as easier on the hardware. Of course, after the program reaches the end of the buffer, the program then should read another chunk of data from the disk. The principle is similar to that of a water reservoir that collects megagallons of runoff water during a big storm, then feeds water to your home at a more civilized rate of flow (see Figure 17.2). Similarly, on output a program can first fill the buffer, then transfer the entire block of data to a hard disk, clearing the buffer for the next batch of output. This is called flushing the buffer. Perhaps you can come up with your own plumbing-based analogy for that process.

Figure 17.2. A stream with a buffer.

Keyboard input provides one character at a time, so in that case a program doesn't need a buffer to help match different data transfer rates. However, buffered keyboard input allows the user to back up and correct input before transmitting it to a program. A C++ program normally flushes the input buffer when you press <Enter>. That's why the examples in this book don't begin processing input until you press <Enter>. For output to the screen, a C++ program normally flushes the output buffer when you transmit a newline character. Depending upon the implementation, a program may flush output on other occasions, too, such as when input is impending. That is, when a program reaches an input statement, it flushes any output currently in the output buffer. C++ implementations that are consistent with ANSI C should behave in that manner.

Streams, Buffers, and the `iostream` File

The business of managing streams and buffers can get a bit complicated, but including the iostream (formerly iostream.h) file brings in several classes designed to implement and manage streams and buffers for you. The newest version of C++ I/O actually defines class templates in order to support both char and wchar_t data. By using the typedef facility, C++ makes the char specializations of these templates mimic the traditional non-template I/O implementation. Here are some of those classes (see Figure 17.3):

The streambuf class provides memory for a buffer along with class methods for filling the buffer, accessing buffer contents, flushing the buffer, and managing the buffer memory.
The ios_base class represents general properties of a stream, such as whether it's open for reading and whether it's a binary or a text stream.
The ios class is based on ios_base, and it includes a pointer member to a streambuf object.
The ostream class derives from the ios class and provides output methods.
The istream class also derives from the ios class and provides input methods.
The iostream class is based on the istream and ostream classes and thus inherits both input and output methods.

Figure 17.3. Some I/O classes.

To use these facilities, you use objects of the appropriate classes. For example, use an ostream object such as cout to handle output. Creating such an object opens a stream, automatically creates a buffer, and associates it with the stream. It also makes the class member functions available to you.

Redefining I/O

The ISO/ANSI C++ standard has revised I/O in a couple of ways. First, there's the change from ostream.h to ostream, with ostream placing the classes in the std namespace. Second, the I/O classes have been rewritten. To be an international language, C++ had to be able to handle international character sets that require a 16-bit or wider character type. So the language added the wchar_t (or "wide") character type to the traditional 8-bit char (or "narrow") type. Each type needs its own I/O facilities. Rather than develop two separate sets of classes, the standards committee developed a template set of I/O classes, including basic_istream<charT, char_traits<charT> > and basic_ostream<charT, char_traits<charT> >. The char_traits<charT> template, in turn, is a template class defining particular traits for a character type, such as how to compare for equality and its EOF value. The standard provides char and wchar_t specializations of the I/O classes. For example, istream and ostream are typedefs for char specializations. Similarly, wistream and wostream are wchar_t specializations. For example, there is a wcout object for outputting wide character streams. The ostream header file contains these definitions.

Certain type-independent information that used to be kept in the ios base class has been moved to the new ios_base class. This includes the various formatting constants such as ios::fixed, which now is ios_base::fixed. Also, ios_base contains some options that weren't available in the old ios.

In some cases, the change in the filename corresponds with the change in class definitions. In Microsoft Visual C++ 6.0, for example, you can include iostream.h and get the old class definitions or include iostream and get the new class definitions. However, dual versions like this are not the general rule.

The C++ iostream class library takes care of many details for you. For example, including the iostream file in a program creates eight stream objects (four for narrow character streams and four for wide character streams) automatically:

The cin object corresponds to the standard input stream. By default, this stream is associated with the standard input device, typically a keyboard. The wcin object is similar, but works with the wchar_t type.
The cout object corresponds to the standard output stream. By default, this stream is associated with the standard output device, typically a monitor. The wcout object is similar, but works with the wchar_t type.
The cerr object corresponds to the standard error stream, which you can use for displaying error messages. By default, this stream is associated with the standard output device, typically a monitor, and the stream is unbuffered. This means that information is sent directly to the screen without waiting for a buffer to fill or for a newline character. The wcerr object is similar, but works with the wchar_t type.
The clog object also corresponds to the standard error stream. By default, this stream is associated with the standard output device, typically a monitor, and the stream is buffered. The wclog object is similar, but works with the wchar_t type.

What does it mean to say an object represents a stream? Well, for example, when the iostream file declares a cout object for your program, that object will have data members holding information relating to output, such as the field widths to be used in displaying data, the number of places after the decimal to use, what number base to use for displaying integers, and the address of a streambuf object describing the buffer used to handle the output flow. A statement such as

cout << "Bjarne free";

places the characters from the string "Bjarne free" into the buffer managed by cout via the pointed-to streambuf object. The ostream class defines the operator<<() function used in this statement, and the ostream class also supports the cout data members with a variety of other class methods, such as the ones this chapter discusses later. Furthermore, C++ sees to it that the output from the buffer is directed to the standard output, usually a monitor, provided by the operating system. In short, one end of a stream is connected to your program, the other end is connected to the standard output, and the cout object, with the help of a streambuf object, manages the flow of bytes through the stream.

Redirection

The standard input and output streams normally connect to the keyboard and the screen. But many operating systems, including Unix, Linux, and MS-DOS, support redirection, a facility that lets you change the associations for the standard input and the standard output. Suppose, for example, you have an executable DOS C++ program called counter.exe that counts the number of characters in its input and reports the result. (From various versions of Windows you can go to Start, click Programs, then click the MS-DOS Command Prompt icon or Command Prompt icon to start an MS-DOS window.) A sample run might look like this:

C>counter
Hello
and goodbye!
Control-Z          <-- simulated end-of-file
Input contained 19 characters.
C>

Here, input came from the keyboard, and output went to the screen.

With input redirection (<) and output redirection (>), you can use the same program to count the number of characters in the oklahoma file and to place the results in the cow_cnt file:

C>counter <oklahoma >cow_cnt
C>

The <oklahoma part of the command line associates the standard input with the oklahoma file, causing cin to read input from that file instead of the keyboard. In other words, the operating system changes the connection at the inflow end of the input stream, while the outflow end remains connected to the program. The >cow_cnt part of the command line associates the standard output with the cow_cnt file, causing cout to send output to that file instead of to the screen. That is, the operating system changes the outflow end connection of the output stream, leaving its inflow end still connected to the program. DOS (2.0 and later), Linux, and Unix automatically recognize this redirection syntax. (Unix, Linux, and DOS 3.0 and later also permit optional space characters between the redirection operators and the filenames.)

The standard output stream, represented by cout, is the normal channel for program output. The standard error streams (represented by cerr and clog) are intended for a program's error messages. By default, all three typically are sent to the monitor. But redirecting the standard output doesn't affect cerr or clog; thus, if you use one of these objects to print an error message, a program will display the error message on the screen even if the regular cout output is redirected elsewhere. For example, consider this code fragment:

if (success)
    cout << "Here come the goodies!\n";
else
{
    cerr << "Something horrible has happened.\n";
    exit(1);
}

If redirection is not in effect, whichever message is selected is displayed onscreen. If, however, the program output has been redirected to a file, the first message, if selected, would go to the file but the second message, if selected, would go to the screen. By the way, some operating systems permit redirecting the standard error, too. In Unix and Linux, for example, the 2> operator redirects the standard error.

Output with `cout`

C++, we've said, considers output to be a stream of bytes. (Depending on the implementation and platform, these may be 16-bit or 32-bit bytes, but bytes nonetheless.) But many kinds of data in a program are organized into larger units than a single byte. An int type, for example, may be represented by a 16-bit or 32-bit binary value. And a double value may be represented by 64 bits of binary data. But when you send a stream of bytes to a screen, you want each byte to represent a character value. That is, to display the number -2.34 on the screen, you should send the five characters -, 2, ., 3, and 4 to the screen, and not the internal 64-bit floating-point representation of that value. Therefore, one of the most important tasks facing the ostream class is converting numeric types, such as int or float, into a stream of characters that represents the values in text form. That is, the ostream class translates the internal representation of data as binary bit patterns to an output stream of character bytes. (Some day we may have bionic implants to enable us to interpret binary data directly. We leave that development as an exercise for the reader.) To perform these translation tasks, the ostream class provides several class methods. We'll look at them now, summarizing methods used throughout the book and describing additional methods that provide a finer control over the appearance of the output.

The Overloaded `<<` Operator

Most often, this book has used cout with the << operator, also called the insertion operator:

int clients = 22;
cout << clients;

In C++, as in C, the default meaning for the << operator is the bitwise left-shift operator (see Appendix E, "Other Operators"). An expression such as x<<3 means to take the binary representation of x and shift all the bits 3 units to the left. Obviously, this doesn't have a lot to do with output. But the ostream class redefines the << operator through overloading so that, when used with an ostream object, it means output. In this guise, the << operator is called the insertion operator instead of the left-shift operator. (The left-shift operator earned this new role through its visual aspect, which suggests a flow of information to the left.) The insertion operator is overloaded to recognize all the basic C++ types:

unsigned char
signed char
char
short
unsigned short
int
unsigned int
long
unsigned long
float
double
long double

The ostream class provides a definition for the operator<<() function for each of the previous types. (Functions incorporating operator into the name are used to overload operators, as discussed in Chapter 11, "Working with Classes.") Thus, if you use a statement of the form

cout << value;

and if value is one of the preceding types, a C++ program can match it to an operator function with the corresponding signature. For example, the expression cout << 88 matches the following method prototype:

ostream & operator<<(int);

Recall that this prototype indicates that the operator<<() function takes one type int argument. That's the part that matches the 88 in the previous statement. The prototype also indicates that the function returns a reference to an ostream object. That property makes it possible to concatenate output, as in the following old rock hit:

cout << "I'm feeling sedimental over " << boundary << "\n";

If you're a C programmer who has suffered through C's multitudinous % type specifiers and the problems that arise when you mismatch a specifier type to a value, using cout is almost sinfully easy. (And C++ input, of course, is cinfully easy.)

Output and Pointers

The ostream class also defines insertion operator functions for the following pointer types:

const signed char *
const unsigned char *
const char *
void *

C++ represents a string, don't forget, by using a pointer to the location of the string. The pointer can take the form of the name of an array of char or of an explicit pointer-to-char or of a quoted string. Thus, all of the following cout statements display strings:

char name[20] = "Dudly Diddlemore";
const char * pn = "Violet D'Amore";   // string literals are const in current C++
cout << "Hello!";
cout << name;
cout << pn;

The methods use the terminating null character in the string to determine when to stop displaying characters.

C++ matches a pointer of any other type with type void * and prints a numerical representation of the address. If you want the address of a string, you have to type cast the pointer to another type, such as void *:

int eggs = 12;
const char * amount = "dozen";
cout << &eggs;              // prints address of eggs variable
cout << amount;             // prints the string "dozen"
cout << (void *) amount;    // prints the address of the "dozen" string

Note

Not all current C++ implementations have a prototype with the void * argument. In that case, you have to type cast a pointer to unsigned or, perhaps, unsigned long, if you want to print the value of the address.

Output Concatenation

All the incarnations of the insertion operator are defined to return type ostream &. That is, the prototypes have this form:

ostream & operator<<(type);

(Here, type is the type to be displayed.) The ostream & return type means that using this operator returns a reference to an ostream object. Which object? The function definitions say that the reference is to the object used to invoke the operator. In other words, an operator function's return value is the same object that invokes the operator. For example, cout << "potluck" returns the cout object. That's the feature that lets you concatenate output using insertion. For example, consider the following statement:

cout << "We have " << count << " unhatched chickens.\n";

The expression cout << "We have " displays the string and returns the cout object, reducing the statement to the following:

cout << count << " unhatched chickens.\n";

Then the expression cout << count displays the value of the count variable and returns cout, which then can handle the final argument in the statement (see Figure 17.4). This design technique really is a nice feature, which is why our examples of overloading the << operator in the previous chapters shamelessly imitate it.

Figure 17.4. Output concatenation.

The Other `ostream` Methods

Besides the various operator<<() functions, the ostream class provides the put() method for displaying characters and the write() method for displaying strings.

Some compilers don't implement the put() method correctly. Traditionally, it had the following prototype:

ostream & put(char);

The current standard is equivalent, except it's templated to allow for wchar_t. You invoke it using the usual class method notation:

cout.put('W');      // display the W character

Here cout is the invoking object and put() is the class member function. Like the << operator functions, this function returns a reference to the invoking object, so you can concatenate output with it:

cout.put('I').put('t'); // displaying It with two put() calls

The function call cout.put('I') returns cout, which then acts as the invoking object for the put('t') call.

Given the proper prototype, you can use put() with arguments of numeric types other than char, such as int, and let function prototyping automatically convert the argument to the correct type char value. For example, you could do the following:

cout.put(65);        // display the A character
cout.put(66.3);      // display the B character

The first statement converted the int value 65 to a char value and then displayed the character having 65 as its ASCII code. Similarly, the second statement converted the type double value 66.3 to a type char value 66 and displayed the corresponding character.

This behavior came in handy prior to Release 2.0 C++; at that time the language represented character constants with type int values. Thus, a statement such as

cout << 'W';

would have interpreted 'W' as an int value, and hence displayed it as the integer 87, the ASCII value for the character. But the statement

cout.put('W');

worked fine. Because current C++ represents char constants as type char, you now can use either method.

The implementation problem is that some compilers overload put() for three argument types: char, unsigned char, and signed char. This makes using put() with an int argument ambiguous, for an int could be converted to any one of those three types.

The write() method writes an entire string and has the following template prototype:

basic_ostream<charT,traits>& write(const char_type* s, streamsize n);

The first argument to write() provides the address of the string to be displayed, and the second argument indicates how many characters to display. Using cout to invoke write() invokes the char specialization, so the return type is ostream &. Listing 17.1 shows how the write() method works.

Listing 17.1 `write.cpp`

// write.cpp -- use cout.write()
#include <iostream>
using namespace std;
#include <cstring>  // or else string.h

int main()
{
    const char * state1 = "Florida";
    const char * state2 = "Kansas";
    const char * state3 = "Euphoria";
    int len = strlen(state2);
    cout << "Increasing loop index:\n";
    int i;
    for (i = 1; i <= len; i++)
    {
        cout.write(state2,i);
        cout << "\n";
    }

// concatenate output
    cout << "Decreasing loop index:\n";
    for (i = len; i > 0; i--)
        cout.write(state2,i) << "\n";

// exceed string length
    cout << "Exceeding string length:\n";
    cout.write(state2, len + 5) << "\n";

    return 0;
}

Here is the output:

Increasing loop index:
K
Ka
Kan
Kans
Kansa
Kansas
Decreasing loop index:
Kansas
Kansa
Kans
Kan
Ka
K
Exceeding string length:
Kansas Euph

Note that the cout.write() call returns the cout object. This is because the write() method returns a reference to the object that invokes it, and in this case, the cout object invokes it. This makes it possible to concatenate output, for cout.write() is replaced by its return value, cout:

cout.write(state2,i) << "\n";

Also, note that the write() method doesn't stop printing characters automatically when it reaches the null character. It simply prints how many characters you tell it to, even if that goes beyond the bounds of a particular string! In this case, the program brackets the string "Kansas" with two other strings so that adjacent memory locations would contain data. Compilers differ in the order in which they store data in memory and in how they align memory. For example, "Kansas" occupies six bytes, but this particular compiler appears to align strings using multiples of four bytes, so "Kansas" is padded out to eight bytes. Because of compiler differences, you may get a different result for the final line of output.

The write() method can also be used with numeric data. It doesn't translate a number to the correct characters; instead, it transmits the bit representation as stored in memory. For example, a 4-byte long value such as 560031841 would be transmitted as four separate bytes. An output device such as a monitor would then try to interpret each byte as if it were ASCII (or whatever) code. So 560031841 would appear onscreen as some 4-character combination, most likely gibberish. (But maybe not; try it, and see.) However, write() does provide a compact, accurate way to store numeric data in a file. We'll return to this possibility later in this chapter.

Flushing the Output Buffer

Consider what happens as a program uses cout to send bytes on to the standard output. Because the ostream class buffers output handled by the cout object, output isn't sent to its destination immediately. Instead, it accumulates in the buffer until the buffer is full. Then the program flushes the buffer, sending the contents on and clearing the buffer for new data. Typically, a buffer is 512 bytes or an integral multiple thereof. Buffering is a great time-saver when the standard output is connected to a file on a hard disk. After all, you don't want a program to access the hard disk 512 times to send 512 bytes. It's much more effective to collect 512 bytes in a buffer and write them to a hard disk in a single disk operation.

For screen output, however, filling the buffer first is less critical. Indeed, it would be inconvenient if you had to reword the message "Press any key to continue" so that it consumed the requisite 512 bytes to fill a buffer. Fortunately, in the case of screen output, the program doesn't necessarily wait until the buffer is full. Sending a newline character to the buffer, for example, normally flushes the buffer. Also, as mentioned before, most implementations flush the buffer when input is pending. That is, suppose you have the following code:

cout << "Enter a number: ";
float num;
cin >> num;

The fact that the program expects input causes it to display the cout message (that is, flush the "Enter a number: " message) immediately, even though the output string lacks a newline. Without this feature, the program would wait for input without having prompted the user with the cout message.

If your implementation doesn't flush output when you want it to, you can force flushing by using one of two manipulators. The flush manipulator flushes the buffer, and the endl manipulator inserts a newline and then flushes the buffer. You use these manipulators the way you would use a variable name:

cout << "Hello, good-looking! " << flush;
cout << "Wait just a moment, please." << endl;

Manipulators are, in fact, functions. For example, you can flush the cout buffer by calling the flush() function directly:

flush(cout);

However, the ostream class overloads the << insertion operator in such a way that the expression

cout << flush

gets replaced with the flush(cout) function call. Thus, you can use the more convenient insertion notation to flush with success.

Formatting with `cout`

The ostream insertion operators convert values to text form. By default, they format values as follows:

A type char value, if it represents a printable character, is displayed as a character in a field one character wide.
Numerical integer types are displayed as decimal integers in a field just wide enough to hold the number and, if present, a minus sign.
Strings are displayed in a field equal in width to the length of the string.

The default behavior for floating-point has changed. The following list details the differences between older and newer implementations:

(New Style) Floating-point types are displayed with a total of six digits, except that trailing zeros aren't displayed. (Note that the number of digits displayed has no connection with the precision to which the number is stored.) The number is displayed in fixed-point notation or else in E notation (see Chapter 3, "Dealing with Data"), depending upon the value of the number. In particular, E notation is used if the exponent is 6 or larger or -5 or smaller. Again, the field is just wide enough to hold the number and, if present, a minus sign. The default behavior corresponds to using the standard C library function fprintf() with a %g specifier.
(Old Style) Floating-point types are displayed with six places to the right of the decimal, except that trailing zeros aren't displayed. (Note that the number of digits displayed has no connection with the precision to which the number is stored.) The number is displayed in fixed-point notation or else in E notation (see Chapter 3), depending upon the value of the number. Again, the field is just wide enough to hold the number and, if present, a minus sign.

Because each value is displayed in a width equal to its size, you have to provide spaces between values explicitly; otherwise, consecutive values would run together.

There are several small differences between early C++ formatting and the current standard; we'll summarize them in Table 17.4 later in this chapter.

Listing 17.2 illustrates the output defaults. It displays a colon (:) after each value so you can see the field width used in each case. The program uses the expression 1.0 / 9.0 to generate a nonterminating fraction so you can see how many places get printed.

Compatibility Note

Not all compilers generate output formatted in accordance with the current standard. Also, the current standard allows for regional variations. For example, a European implementation can follow the continental fashion of using a comma instead of a period for displaying decimal fractions. That is, it may write 2,54 instead of 2.54. The locale library (header file locale) provides a mechanism for imbuing an input or output stream with a particular style, so a single compiler can offer more than one locale choice. This chapter will use the U.S. locale.

Listing 17.2 `defaults.cpp`

// defaults.cpp -- cout default formats
#include <iostream>
using namespace std;

int main()
{
    cout << "12345678901234567890\n";
    char ch = 'K';
    int t = 273;
    cout << ch << ":\n";
    cout << t << ":\n";
    cout << -t <<":\n";

    double f1 = 1.200;
    cout << f1 << ":\n";
    cout << (f1 + 1.0 / 9.0) << ":\n";

    double f2 = 1.67E2;
    cout << f2 << ":\n";
    f2 += 1.0 / 9.0;
    cout << f2 << ":\n";
    cout << (f2 * 1.0e4) << ":\n";
    double f3 = 2.3e-4;
    cout << f3 << ":\n";
    cout << f3 / 10 << ":\n";

    return 0;
}

Here is the output:

12345678901234567890
K:
273:
-273:
1.2:
1.31111:
167:
167.111:
1.67111e+006:
0.00023:
2.3e-005:

Each value fills its field. Note that the trailing zeros of 1.200 are not displayed but that floating-point values without terminating zeros have a total of six digits displayed. Also, this particular implementation displays three digits in the exponent; others might use two.
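
If you want to check these defaults without reading the screen, one approach is to send each value to a freshly constructed string stream, which starts out with the same default settings. The following is a small sketch along those lines; show_default is a name invented here, and std::ostringstream (from the sstream header) stands in for cout.

```cpp
#include <sstream>
#include <string>

// Formats one value with a brand-new stream, so only the default
// format settings apply. (show_default is an illustrative name.)
template <typename T>
std::string show_default(const T & value)
{
    std::ostringstream out;
    out << value;
    return out.str();
}
```

With the new-style defaults, show_default(1.200) gives 1.2 and show_default(1.2 + 1.0 / 9.0) gives 1.31111, a total of six digits. Values large or small enough to trigger E notation are left out of this sketch because the number of exponent digits varies among implementations.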

Changing the Number Base Used for Display

The ostream class inherits from the ios class, which inherits from the ios_base class. The ios_base class stores information describing the format state. For example, certain bits in one class member determine the number base used, while another member determines the field width. By using manipulators, you can control the number base used to display integers. By using ios_base member functions, you can control the field width and the number of places displayed to the right of the decimal. Because the ios_base class is an indirect base class for ostream, you can use its methods with ostream objects (or descendants), such as cout.

Note

The members and methods found in the ios_base class formerly were found in the ios class. Now ios_base is a base class to ios. In the new system, ios is the char specialization of the basic_ios template class (wios is the wchar_t specialization), while ios_base contains the non-template features.

Let's see how to set the number base to be used in displaying integers. To control whether integers are displayed in base 10, base 16, or base 8, you can use the dec, hex, and oct manipulators. For example, the function call

hex(cout);

sets the number base format state for the cout object to hexadecimal. Once you do this, a program will print integer values in hexadecimal form until you set the format state to another choice. Note that the manipulators are not member functions, hence they don't have to be invoked by an object.

Although the manipulators really are functions, you normally see them used this way:

cout << hex;

The ostream class overloads the << operator to make this usage equivalent to the function call hex(cout). Listing 17.3 illustrates using these manipulators. It shows the value of an integer and its square in three different number bases. Note that you can use a manipulator separately or as part of a series of insertions.

Listing 17.3 `manip.cpp`

// manip.cpp -- using format manipulators
#include <iostream>
using namespace std;
int main()
{
    cout << "Enter an integer: ";
    int n;
    cin >> n;

    cout << "n     n*n\n";
    cout << n << "     " << n * n << " (decimal)\n";
// set to hex mode
    cout << hex;
    cout << n << "     ";
    cout << n * n << " (hexadecimal)\n";

// set to octal mode
    cout << oct << n << "     " << n * n << " (octal)\n";

// alternative way to call a manipulator
    dec(cout);
    cout << n << "     " << n * n << " (decimal)\n";

    return 0;
}

Here is some sample output:

Enter an integer: 13
n     n*n
13     169 (decimal)
d     a9 (hexadecimal)
15     251 (octal)
13     169 (decimal)
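
The same idea can be condensed into a helper that returns text instead of printing it. This is only a sketch: in_three_bases is a made-up name, and a std::ostringstream takes the place of cout so the result can be examined. It uses both the insertion notation and the function-call notation.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Shows n in decimal, hexadecimal, and octal, then in decimal again.
// A base setting persists until it is changed, so each manipulator
// affects every integer that follows it.
std::string in_three_bases(int n)
{
    std::ostringstream out;
    out << n << ' ' << std::hex << n << ' ' << std::oct << n;
    std::dec(out);              // function-call form; back to base 10
    out << ' ' << n;
    return out.str();
}
```

For example, in_three_bases(13) produces 13 d 15 13, matching the first column of the sample run.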

Adjusting Field Widths

You probably noticed that the columns in the preceding example don't line up; that's because the numbers have different field widths. You can use the width member function to place differently sized numbers in fields having equal widths. The method has these prototypes:

int width();
int width(int i);

The first form returns the current setting for field width. The second sets the field width to i spaces and returns the previous field width value. This allows you to save the previous value in case you want to restore the width to that value later.

The width() method affects only the next item displayed, and the field width reverts to the default value afterwards. For example, consider the following statements:

cout << '#';
cout.width(12);
cout << 12 << "#" <<  24 << "#\n";

Because width() is a member function, you have to use an object (cout, in this case) to invoke it. The output statement produces the following display:

#          12#24#

The 12 is placed in a field 12 characters wide at the right end of the field. This is called right-justification. After that, the field width reverts to the default, and the two # characters and the 24 are printed in fields equal to their own size.

Remember

The width() method affects only the next item displayed, and the field width reverts to the default value afterwards.

C++ never truncates data, so if you attempt to print a seven-digit value in a field width of 2, C++ expands the field to fit the data. (Some languages just fill the field with asterisks if the data doesn't fit. The C/C++ philosophy is that showing all the data is more important than keeping the columns neat; C++ puts substance before form.) Listing 17.4 shows how the width() member function works.

Listing 17.4 `width.cpp`

// width.cpp -- use the width method
#include <iostream>
using namespace std;

int main()
{
    int w = cout.width(30);
    cout << "default field width = " << w << ":\n";
    cout.width(5);
    cout << "N" <<':';
    cout.width(8);
    cout << "N * N" << ":\n";

    for (long i = 1; i <= 100; i *= 10)
    {
        cout.width(5);
        cout << i <<':';
        cout.width(8);
        cout << i * i << ":\n";
    }

    return 0;
}

Here is the output:

      default field width = 0:
  N:   N * N:
  1:       1:
 10:     100:
100:   10000:

The output displays values right-justified in their fields. The output is padded with spaces. That is, cout achieves the full field width by adding spaces. With right-justification, the spaces are inserted to the left of the values. The character used for padding is termed the fill character. Right-justification is the default.

Note that the program applies the field width of 30 to the string displayed by the first cout statement but not to the value of w. This is because the width() method affects only the next single item displayed. Also, note that w has the value 0. This is because cout.width(30) returns the previous field width, not the one to which it was just set. The fact that w is zero means that zero is the default field width. Because C++ always expands a field to fit the data, this one size fits all. Finally, the program uses width() to align column headings and data by using a width of five characters for the first column and a width of eight characters for the second column.
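
Here is a short sketch that isolates the two rules just described: the width applies to a single item, and a field grows rather than truncating its contents. The function name in_field is invented for the example, and the output goes to a std::ostringstream so it can be returned as a string.

```cpp
#include <sstream>
#include <string>

// Prints value in a field of the requested width, followed by a colon.
// width() applies only to the next item, so the colon is never padded.
std::string in_field(long value, int field_width)
{
    std::ostringstream out;
    out.width(field_width);
    out << value << ':';
    return out.str();
}
```

Thus in_field(100, 5) yields two spaces, 100, and a colon, while in_field(1234567, 2) yields 1234567: with the field expanded to fit all seven digits.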

Fill Characters

By default, cout fills unused parts of a field with spaces. You can use the fill() member function to change that. For example, the call

cout.fill('*');

changes the fill character to an asterisk. That can be handy for, say, printing checks so that recipients can't easily add a digit or two. Listing 17.5 illustrates using this member function.

Listing 17.5 `fill.cpp`

// fill.cpp -- change fill character for fields
#include <iostream>
using namespace std;

int main()
{
    cout.fill('*');
    const char * staff[2] = { "Waldo Whipsnade", "Wilmarie Wooper"};
    long bonus[2] = {900, 1350};

    for (int i = 0; i < 2; i++)
    {
        cout << staff[i] << ": $";
        cout.width(7);
        cout << bonus[i] << "\n";
    }

    return 0;
}

Here's the output:

Waldo Whipsnade: $****900
Wilmarie Wooper: $***1350

Note that, unlike the field width, the new fill character stays in effect until you change it.
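
The difference in persistence is easy to demonstrate in a few lines. In this sketch (two_checks is a hypothetical helper, and a std::ostringstream stands in for cout), fill() is called once but width() has to be called before each amount.

```cpp
#include <sstream>
#include <string>

// Formats two amounts in asterisk-padded fields of width 7.
std::string two_checks(long first, long second)
{
    std::ostringstream out;
    out.fill('*');          // set once; stays in effect
    out.width(7);           // applies to first only
    out << first << ' ';
    out.width(7);           // must be set again for second
    out << second;
    return out.str();
}
```

The call two_checks(900, 1350) produces ****900 ***1350; the space between the two fields is not padded because the width has already reverted to zero by then.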

Setting Floating-Point Display Precision

The meaning of floating-point precision depends upon the output mode. In the default mode, it means the total number of digits displayed. In the fixed and scientific modes, to be discussed soon, the precision means the number of digits displayed to the right of the decimal place. The precision default for C++, as you've seen, is 6. (Recall, however, that trailing zeros are dropped.) The precision() member function lets you select other values. For example, the statement

cout.precision(2);

causes cout to set the precision to 2. Unlike the case with width(), but like the case for fill(), a new precision setting stays in effect until reset. Listing 17.6 demonstrates precisely this point.

Listing 17.6 `precise.cpp`

// precise.cpp -- set the precision
#include <iostream>
using namespace std;

int main()
{
    float price1 = 20.40;
    float price2 = 1.9 + 8.0 / 9.0;
    cout << "\"Furry Friends\" is $" << price1 << "!\n";
    cout << "\"Fiery Fiends\" is $" << price2 << "!\n";

    cout.precision(2);
    cout << "\"Furry Friends\" is $" << price1 << "!\n";
    cout << "\"Fiery Fiends\" is $" << price2 << "!\n";

    return 0;
}

Compatibility Note

Older versions of C++ interpret the precision for the default mode as the number of digits to the right of the decimal instead of as the total number of digits.

Here is the output:

"Furry Friends" is $20.4!
"Fiery Fiends" is $2.78889!
"Furry Friends" is $20!
"Fiery Fiends" is $2.8!

Note that the third line doesn't print a trailing decimal point. Also, the fourth line displays a total of two digits.
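
To experiment with other precision values under the current rules, you might wrap the call in a small helper such as the following. The name with_precision is an assumption of this sketch, and the stream is a std::ostringstream left in the default floating-point mode, so the precision counts total digits.

```cpp
#include <sstream>
#include <string>

// Formats value in the default floating-point mode using the given
// precision, which here means the total number of digits shown.
std::string with_precision(double value, int digits)
{
    std::ostringstream out;
    out.precision(digits);
    out << value;
    return out.str();
}
```

For instance, with_precision(20.40, 2) gives 20 and with_precision(2.788889, 2) gives 2.8, the same results as the last two lines of the listing's output.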

Printing Trailing Zeros and Decimal Points

Certain forms of output, such as prices or numbers in columns, look better if trailing zeros are retained. For example, the output to Listing 17.6 would look better as $20.40 than as $20.4. The iostream family of classes doesn't provide a function whose sole purpose is accomplishing that. However, the ios_base class provides a setf() (for set flag) function that controls several formatting features. The class also defines several constants that can be used as arguments to this function. For example, the function call

cout.setf(ios_base::showpoint);

causes cout to display trailing decimal points. Whether it also displays trailing zeros depends on the implementation. That is, instead of displaying 2.00 as 2, cout will display it as 2.000000 (old C++ formatting), as 2. (the formatting shown in this chapter's sample output), or as 2.00000 (a total of six digits, which is what the Standard's rule, equivalent to an fprintf() %#g specifier, produces) if the default precision of 6 is in effect. Listing 17.7 adds this statement to Listing 17.6.

Caution

If your compiler uses the iostream.h header file instead of iostream, you most likely will have to use ios instead of ios_base in setf() arguments.

In case you're wondering about the notation ios_base::showpoint, showpoint is a class scope static constant defined in the ios_base class declaration. Class scope means that you have to use the scope operator (::) with the constant name if you use the name outside a member function definition. So ios_base::showpoint names a constant defined in the ios_base class.

Listing 17.7 `showpt.cpp`

// showpt.cpp -- set the precision, show trailing point
#include <iostream>
using namespace std;

int main()
{
    float price1 = 20.40;
    float price2 = 1.9 + 8.0 / 9.0;

    cout.setf(ios_base::showpoint);
    cout << "\"Furry Friends\" is $" << price1 << "!\n";
    cout << "\"Fiery Fiends\" is $" << price2 << "!\n";

    cout.precision(2);
    cout << "\"Furry Friends\" is $" << price1 << "!\n";
    cout << "\"Fiery Fiends\" is $" << price2 << "!\n";

    return 0;
}

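Here is the output using the current formatting. Note that trailing zeros are not shown, but the trailing decimal point for the third line is shown. (An implementation that applies the Standard's %#g-style rule for showpoint would instead display the first line as $20.4000.)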

"Furry Friends" is $20.4!
"Fiery Fiends" is $2.78889!
"Furry Friends" is $20.!
"Fiery Fiends" is $2.8!

How, then, can you display trailing zeros? To answer that question, we have to discuss the setf() function in more detail.

More About `setf()`

The setf() method controls several other formatting choices, so let's take a closer look at it. The ios_base class has a protected data member in which individual bits (called flags in this context) control different formatting aspects such as the number base or whether trailing zeros are displayed. Turning a flag on is called setting the flag (or bit) and means setting the bit to 1. (If you've ever had to set DIP switches to configure computer hardware, bit flags are the programming equivalent.) The hex, dec, and oct manipulators, for example, adjust the three flag bits that control the number base. The setf() function provides another means of adjusting flag bits.

The setf() function has two prototypes. The first is this:

fmtflags setf(fmtflags);

Here fmtflags is a typedef name for a bitmask type (see Note) used to hold the format flags. The name is defined in the ios_base class. This version of setf() is used for setting format information controlled by a single bit. The argument is a fmtflags value indicating which bit to set. The return value is a type fmtflags number indicating the former setting of all the flags. You then can save that value if you later want to restore the original settings. What value do you pass to setf()? If you want to set bit number 11 to 1, you pass a number having its number 11 bit set to 1. The return value would have its number 11 bit assigned the prior value for that bit. Keeping track of bits sounds (and is) tedious. However, you don't have to do that job; the ios_base class defines constants representing the bit values. Table 17.1 shows some of these definitions.

Note

A bitmask type is a type used to store individual bit values. It could be an integer type, an enum, or an STL bitset container. The main idea is that each bit is individually accessible and has its own meaning. The iostream package uses bitmask types to store state information.

Table 17.1. Formatting Constants
Constant	Meaning
`ios_base::boolalpha`	Input and output `bool` values as `true` and `false`
`ios_base::showbase`	Use C++ base prefixes (0, 0x) on output
`ios_base::showpoint`	Show trailing decimal point
`ios_base::uppercase`	Use uppercase letters for hex output, E notation
`ios_base::showpos`	Use + before positive numbers

Because these formatting constants are defined within the ios_base class, you must use the scope resolution operator with them. That is, use ios_base::uppercase, not just uppercase. Changes remain in effect until overridden. Listing 17.8 illustrates using some of these constants.

Listing 17.8 `setf.cpp`

// setf.cpp -- use setf() to control formatting
#include <iostream>
using namespace std;
int main()
{
    int temperature = 63;

    cout << "Today's water temperature: ";
    cout.setf(ios_base::showpos);    // show plus sign
    cout << temperature << "\n";

    cout << "For our programming friends, that's\n";
    cout << hex << temperature << "\n"; // use hex
    cout.setf(ios_base::uppercase);    // use uppercase in hex
    cout.setf(ios_base::showbase);    // use 0X prefix for hex
    cout << "or\n";
    cout << temperature << "\n";
    cout << "How " << true << "!  oops -- How ";
    cout.setf(ios_base::boolalpha);
    cout << true << "!\n";

    return 0;
}

Compatibility Note

Some implementations may use ios instead of ios_base, and they may fail to provide a boolalpha choice.

Here is the output:

Today's water temperature: +63
For our programming friends, that's
3f
or
0X3F
How 0X1!  oops -- How true!

Note that the plus sign is used only with the base 10 version. C++ treats hexadecimal and octal values as unsigned, hence no sign is needed for them. (However, some implementations may still display a plus sign.)
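
Because setf() returns the former flag settings, you can put things back the way you found them. The following sketch shows one way to do that. It relies on flags(), an ios_base member function not otherwise covered in this section, which replaces the whole set of flags with its argument; the helper name showpos_then_restore is invented for the example.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Shows n with a plus sign, restores the saved flags, then shows n
// again with the original formatting.
std::string showpos_then_restore(int n)
{
    std::ostringstream out;
    // setf() returns the flags as they were before the call
    std::ios_base::fmtflags old = out.setf(std::ios_base::showpos);
    out << n << ' ';
    out.flags(old);         // replace all flags with the saved set
    out << n;
    return out.str();
}
```

Here showpos_then_restore(63) produces +63 63.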

The second setf() prototype takes two arguments and returns the prior setting:

fmtflags setf(fmtflags , fmtflags );

This overloaded form of the function is used for format choices controlled by more than one bit. The first argument, as before, is a fmtflags value containing the desired setting. The second argument is a value that first clears the appropriate bits. For example, suppose setting bit 3 to 1 means base 10, setting bit 4 to 1 means base 8, and setting bit 5 to 1 means base 16. Suppose output is in base 10 and you want to set it to base 16. Not only do you have to set bit 5 to 1, you also have to set bit 3 to 0; this is called clearing the bit. The clever hex manipulator does both tasks automatically. The setf() function requires a bit more work, because you use the second argument to indicate which bits to clear and then use the first argument to indicate which bit to set. This is not as complicated as it sounds, for the ios_base class defines constants (shown in Table 17.2) for this purpose. In particular, you should use the constant ios_base::basefield as the second argument and ios_base::hex as the first argument if you're changing to base 16. That is, the function call

cout.setf(ios_base::hex, ios_base::basefield);

has the same effect as using the hex manipulator.
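
To see why the clearing step matters, compare the one-argument and two-argument calls. The sketch below (hex_attempts is a made-up name, and a std::ostringstream replaces cout) first sets the hex bit without clearing the dec bit. The Standard's formatting rule treats any basefield setting other than exactly oct or exactly hex as decimal, so on the implementations I'm aware of the first attempt still prints in base 10.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Tries to switch to hexadecimal two ways and shows n after each try.
std::string hex_attempts(int n)
{
    std::ostringstream out;
    out.setf(std::ios_base::hex);       // hex bit set, dec bit still set
    out << n << ' ';
    out.setf(std::ios_base::hex, std::ios_base::basefield);  // clear, then set
    out << n;
    return out.str();
}
```

With n equal to 255, the result is 255 ff: only the two-argument call actually changes the base.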

Table 17.2. Arguments for `setf(fmtflags, fmtflags)`
Second Argument	First Argument	Meaning
`ios_base::basefield`	`ios_base::dec`	Use base 10
	`ios_base::oct`	Use base 8
	`ios_base::hex`	Use base 16
`ios_base::floatfield`	`ios_base::fixed`	Use fixed-point notation
	`ios_base::scientific`	Use scientific notation
`ios_base::adjustfield`	`ios_base::left`	Use left-justification
	`ios_base::right`	Use right-justification
	`ios_base::internal`	Left-justify sign or base prefix, right-justify value

The ios_base class defines three sets of formatting flags that can be handled this way. Each set consists of one constant to be used as the second argument and two to three constants to be used as a first argument. The second argument clears a batch of related bits; then the first argument sets one of those bits to 1. Table 17.2 shows the names of the constants used for the second setf() argument, the associated choice of constants for the first argument, and their meanings. For example, to select left-justification, use ios_base::adjustfield for the second argument and ios_base::left as the first argument. Left-justification means starting a value at the left end of the field, and right-justification means ending a value at the right end of the field. Internal justification means placing any signs or base prefixes at the left of the field and the rest of the number at the right of the field. (Unfortunately, C++ does not provide a self-justification mode.)

Fixed-point notation means using the 123.4 style for floating-point values regardless of the size of the number, and scientific notation means using the 1.23e04 style regardless of the size of the number.

Under the Standard, both fixed and scientific notation have the following two properties:

Precision means the number of digits to the right of the decimal rather than the total number of digits.
Trailing zeros are displayed.

Under the older usage, trailing zeros are not shown unless ios::showpoint is set. Also, under older usage, precision always meant the number of digits to the right of the decimal, even in the default mode.
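
These two properties are what make fixed-point mode suitable for prices. As a rough sketch under the current Standard's rules (as_price is a hypothetical helper, and a std::ostringstream stands in for cout), selecting fixed notation and a precision of 2 retains the trailing zeros:

```cpp
#include <ios>
#include <sstream>
#include <string>

// Formats amount as dollars and cents: fixed-point notation with
// two digits to the right of the decimal point.
std::string as_price(double amount)
{
    std::ostringstream out;
    out.setf(std::ios_base::fixed, std::ios_base::floatfield);
    out.precision(2);
    out << '$' << amount;
    return out.str();
}
```

So as_price(20.40) gives $20.40 and as_price(20) gives $20.00.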

The setf() function is a member function of the ios_base class. Because that's a base class for the ostream class, you can invoke the function using the cout object. For example, to request left-justification, use this call:

ios_base::fmtflags old = cout.setf(ios_base::left, ios_base::adjustfield);

To restore the previous setting, do this:

cout.setf(old, ios_base::adjustfield);

Listing 17.9 illustrates further examples of using setf() with two arguments.

Compatibility Note

This program uses a math function, and some C++ systems don't automatically search the math library. For example, some UNIX systems require that you do the following:

$ CC setf2.C -lm

The -lm option instructs the linker to search the math library.

Listing 17.9 `setf2.cpp`

// setf2.cpp -- use setf() with 2 arguments to control formatting
#include <iostream>
using namespace std;
#include <cmath>

int main()
{
    // use left justification, show the plus sign, show trailing
    // zeros, with a precision of 3
    cout.setf(ios_base::left, ios_base::adjustfield);
    cout.setf(ios_base::showpos);
    cout.setf(ios_base::showpoint);
    cout.precision(3);
    // use e-notation and save old format setting
    ios_base::fmtflags old = cout.setf(ios_base::scientific,
        ios_base::floatfield);
    cout << "Left Justification:\n";
    long n;
    for (n = 1; n <= 41; n+= 10)
    {
        cout.width(4);
        cout << n << "|";
        cout.width(12);
        cout << sqrt(double(n)) << "|\n";
    }

    // change to internal justification
    cout.setf(ios_base::internal, ios_base::adjustfield);
    // restore default floating-point display style
    cout.setf(old, ios_base::floatfield);

    cout << "Internal Justification:\n";
    for (n = 1; n <= 41; n+= 10)
    {
        cout.width(4);
        cout << n << "|";
        cout.width(12);
        cout << sqrt(double(n)) << "|\n";
    }

    // use right justification, fixed notation
    cout.setf(ios_base::right, ios_base::adjustfield);
    cout.setf(ios_base::fixed, ios_base::floatfield);
    cout << "Right Justification:\n";
    for (n = 1; n <= 41; n+= 10)
    {
        cout.width(4);
        cout << n << "|";
        cout.width(12);
        cout << sqrt(double(n)) << "|\n";
    }

    return 0;
}

Here is the output:

Left Justification:
+1  |+1.000e+00  |
+11 |+3.317e+00  |
+21 |+4.583e+00  |
+31 |+5.568e+00  |
+41 |+6.403e+00  |
Internal Justification:
+  1|+       1.00|
+ 11|+       3.32|
+ 21|+       4.58|
+ 31|+       5.57|
+ 41|+       6.40|
Right Justification:
  +1|      +1.000|
 +11|      +3.317|
 +21|      +4.583|
 +31|      +5.568|
 +41|      +6.403|

Note how a precision of 3 causes the default floating-point display (used for internal justification in this program) to display a total of three digits, while the fixed and scientific modes display three digits to the right of the decimal. (The number of digits displayed in the exponent for e-notation depends upon the implementation.)
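
Internal justification is most useful in combination with a fill character other than a space. The following sketch (zero_padded is an invented name; the stream is a std::ostringstream) pads a number with leading zeros while keeping the sign at the left edge of the field.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Pads n with zeros between the sign (if any) and the digits.
std::string zero_padded(int n, int field_width)
{
    std::ostringstream out;
    out.setf(std::ios_base::internal, std::ios_base::adjustfield);
    out.fill('0');
    out.width(field_width);
    out << n;
    return out.str();
}
```

For example, zero_padded(-42, 6) yields -00042, whereas right-justification would have produced 000-42.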

The effects of calling setf() can be undone with unsetf(), which has the following prototype:

void unsetf(fmtflags mask);

Here mask is a bit pattern. All bits set to 1 in mask cause the corresponding bits to be unset. That is, setf() sets bits to 1 and unsetf() sets bits back to 0. For example:

cout.setf(ios_base::showpoint);    // show trailing decimal point
cout.unsetf(ios_base::showpoint);  // don't show trailing decimal point
cout.setf(ios_base::boolalpha);    // display true, false
cout.unsetf(ios_base::boolalpha);  // display 1, 0
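
You can also pass unsetf() one of the multibit field constants to clear a whole group of bits at once. The sketch below assumes the current Standard's behavior, in which having neither the fixed nor the scientific bit set means the default floating-point mode; the helper name fixed_then_default is made up, and a std::ostringstream stands in for cout.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Shows value in fixed-point mode, then clears the floatfield bits
// and shows it again in the default mode. The precision stays at 2.
std::string fixed_then_default(double value)
{
    std::ostringstream out;
    out.precision(2);
    out.setf(std::ios_base::fixed, std::ios_base::floatfield);
    out << value << ' ';    // two digits right of the decimal
    out.unsetf(std::ios_base::floatfield);
    out << value;           // two digits in total
    return out.str();
}
```

Here fixed_then_default(3.14159) gives 3.14 3.1, which also illustrates the two meanings of precision.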

Standard Manipulators

Using setf() is not the most user-friendly approach to formatting, so C++ offers several manipulators to invoke setf() for you, automatically supplying the right arguments. You've already seen dec, hex, and oct. The others, most of which are not available in older implementations, work the same way. For example, the statement

cout << left << fixed;

turns on left justification and the fixed decimal point option. Table 17.3 lists these along with several other manipulators.

Tip

If your system supports these manipulators, take advantage of them; if it doesn't, you still have the option of using setf().

Table 17.3. Some Standard Manipulators
Manipulator	Calls
`boolalpha`	`setf(ios_base::boolalpha)`
`noboolalpha`	`unsetf(ios_base::boolalpha)`
`showbase`	`setf(ios_base::showbase)`
`noshowbase`	`unsetf(ios_base::showbase)`
`showpoint`	`setf(ios_base::showpoint)`
`noshowpoint`	`unsetf(ios_base::showpoint)`
`showpos`	`setf(ios_base::showpos)`
`noshowpos`	`unsetf(ios_base::showpos)`
`uppercase`	`setf(ios_base::uppercase)`
`nouppercase`	`unsetf(ios_base::uppercase)`
`internal`	`setf(ios_base::internal, ios_base::adjustfield)`
`left`	`setf(ios_base::left, ios_base::adjustfield)`
`right`	`setf(ios_base::right, ios_base::adjustfield)`
`dec`	`setf(ios_base::dec, ios_base::basefield)`
`hex`	`setf(ios_base::hex, ios_base::basefield)`
`oct`	`setf(ios_base::oct, ios_base::basefield)`
`fixed`	`setf(ios_base::fixed, ios_base::floatfield)`
`scientific`	`setf(ios_base::scientific, ios_base::floatfield)`
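
As a quick illustration of how the paired manipulators in the table switch a flag on and off within a single statement, consider this sketch. The function name manipulator_pairs is invented, and it assumes an implementation that provides these standard manipulators.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Turns boolalpha and showpos on and then off again in one chain.
std::string manipulator_pairs()
{
    std::ostringstream out;
    out << std::boolalpha << true << ' '
        << std::noboolalpha << true << ' '
        << std::showpos << 7 << ' '
        << std::noshowpos << 7;
    return out.str();
}
```

The resulting text is true 1 +7 7.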

The `iomanip` Header File

Setting some format values, such as the field width, can be awkward using the iostream tools. To make life easier, C++ supplies additional manipulators in the iomanip header file. They provide the same services we've discussed, but in a notationally more convenient manner. The three most commonly used are setprecision() for setting the precision, setfill() for setting the fill character, and setw() for setting the field width. Unlike the manipulators discussed previously, these take arguments. The setprecision() manipulator takes an integer argument specifying the precision, the setfill() takes a char argument indicating the fill character, and the setw() manipulator takes an integer argument specifying the field width. Because they are manipulators, they can be concatenated in a cout statement. This makes the setw() manipulator particularly convenient when displaying several columns of values. Listing 17.10 illustrates this by changing the field width and fill character several times for one output line. It also uses some of the newer standard manipulators.

Compatibility Note

This program uses a math function, and some C++ systems don't automatically search the math library. For example, some UNIX systems require that you do the following:

$ CC iomanip.C -lm

The -lm option instructs the linker to search the math library. Also, older compilers may not recognize the new standard manipulators such as showpoint. In that case, you can use the setf() equivalents.

Listing 17.10 `iomanip.cpp`

// iomanip.cpp -- use manipulators from iomanip
// some systems require explicitly linking the math library
#include <iostream>
using namespace std;
#include <iomanip>
#include <cmath>

int main()
{
    // use new standard manipulators
    cout << showpoint << fixed << right;

    // use iomanip manipulators
    cout << setw(6) << "N" << setw(14) << "square root"
         << setw(15) << "fourth root\n";

    double root;
    for (int n = 10; n <=100; n += 10)
    {
        root = sqrt(double(n));
        cout << setw(6) << setfill('.') << n << setfill(' ')
             << setw(12) << setprecision(3) << root
             << setw(14) << setprecision(4) << sqrt(root)
             << "\n";
    }

    return 0;
}

Here is the output:

     N   square root   fourth root
....10       3.162        1.7783
....20       4.472        2.1147
....30       5.477        2.3403
....40       6.325        2.5149
....50       7.071        2.6591
....60       7.746        2.7832
....70       8.367        2.8925
....80       8.944        2.9907
....90       9.487        3.0801
...100      10.000        3.1623

Now you can produce neatly aligned columns. Note that this program produces the same formatting with either the older or current implementations. Using the showpoint manipulator causes trailing zeros to be displayed in older implementations, and using the fixed manipulator causes trailing zeros to be displayed in current implementations. Using fixed makes the display fixed-point in either system, and in current systems it makes precision refer to the number of digits to the right of the decimal. In older systems, precision always has that meaning, regardless of the floating-point display mode.
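
If you need the same layout in more than one place, the body of the loop can be packaged as a function that builds one row. This is a sketch rather than part of the listing: table_row is a hypothetical name, the row is written to a std::ostringstream, and current-style formatting is assumed.

```cpp
#include <cmath>
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

// Builds one row of the table: n, its square root to three places,
// and its fourth root to four places.
std::string table_row(int n)
{
    std::ostringstream out;
    double root = std::sqrt(static_cast<double>(n));
    out << std::fixed
        << std::setw(6) << std::setfill('.') << n << std::setfill(' ')
        << std::setw(12) << std::setprecision(3) << root
        << std::setw(14) << std::setprecision(4) << std::sqrt(root);
    return out.str();
}
```

Note that setw() has to appear before every column because the width resets after each item, whereas the fill character has to be switched back to a space explicitly.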

Table 17.4 summarizes some of the differences between older C++ formatting and the current state. One moral of this table is that you shouldn't feel baffled if you run an example program you've seen somewhere and the output format doesn't match what is shown for the example.

Table 17.4. Formatting Changes
Feature	Older C++	Current C++
`precision(n)`	Display `n` digits to the right of the decimal point	Display a total of `n` digits in the default mode, and display `n` digits to the right of the decimal point in fixed and scientific modes
`ios::showpoint`	Display trailing decimal point and trailing zeros	Display trailing decimal point
`ios::fixed`, `ios::scientific`		Show trailing zeros (also see comments under `precision()`)

Input with `cin`

Now it's time to turn to input and getting data into a program. The cin object represents the standard input as a stream of bytes. Normally, you generate that stream of characters at the keyboard. If you type the character sequence 2002, the cin object extracts those characters from the input stream. You may intend that input to be part of a string, to be an int value, to be a float value, or to be some other type. Thus, extraction also involves type conversion. The cin object, guided by the type of variable designated to receive the value, must use its methods to convert that character sequence into the intended type of value.

Typically, you use cin as follows:

cin >> value_holder;

Here value_holder identifies the memory location in which to store the input. It can be the name of a variable, a reference, a dereferenced pointer, or a member of a structure or of a class. How cin interprets the input depends on the data type for value_holder. The istream class, defined in the iostream header file, overloads the >> extraction operator to recognize the following basic types:

signed char &
unsigned char &
char &
short &
unsigned short &
int &
unsigned int &
long &
unsigned long &
float &
double &
long double &

These are referred to as formatted input functions because they convert the input data to the format indicated by the target.

A typical operator function has a prototype like the following:

istream & operator>>(int &);

Both the argument and the return value are references. A reference argument (see Chapter 8, "Adventures in Functions") means that a statement such as

cin >> staff_size;

causes the operator>>() function to work with the variable staff_size itself rather than with a copy, as would be the case with a regular argument. Because the argument type is a reference, cin is able to directly modify the value of a variable used as an argument. The preceding statement, for example, directly modifies the value of the staff_size variable. We'll get to the significance of a reference return value in a moment. First, let's examine the type conversion aspect of the extraction operator. For arguments of each type in the preceding list of types, the extraction operator converts the character input to the indicated type of value. For example, suppose staff_size is type int. Then the compiler matches the statement

cin >> staff_size;

to the following prototype:

istream & operator>>(int &);

The function corresponding to that prototype then reads the stream of characters being sent to the program, say, the characters 2, 3, 1, 8, and 4. For a system using a 2-byte int, the function then converts these characters to the 2-byte binary representation of the integer 23184. If, on the other hand, staff_size had been type double, cin would use the operator>>(double &) to convert the same input into the 8-byte floating-point representation of the value 23184.0.

Incidentally, you can use the hex, oct, and dec manipulators with cin to specify that integer input is to be interpreted as hexadecimal, octal, or decimal format. For example, the statement

cin >> hex;

causes an input of 12 or 0x12 to be read as hexadecimal 12, or decimal 18, and ff or FF to be read as decimal 255.
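To see the hex manipulator at work without typing at the keyboard, here is a minimal sketch that applies it to a std::istringstream standing in for cin. The helper name read_hex and its -1 failure value are hypothetical, chosen only for illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: interpret the text as a hexadecimal integer.
// Returns -1 if no integer could be extracted.
int read_hex(const std::string& text)
{
    std::istringstream in(text);   // stands in for cin
    int value = 0;
    in >> std::hex >> value;       // hex stays in effect for later integer reads
    return in ? value : -1;
}
```

With this helper, read_hex("12") and read_hex("0x12") both yield 18, and read_hex("FF") yields 255.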

The istream class also overloads the >> extraction operator for character pointer types:

signed char *
char *
unsigned char *

For this type of argument, the extraction operator reads the next word from input and places it at the indicated address, adding a null character to make a string. For example, suppose you have this code:

cout << "Enter your first name:\n";
char name[20];
cin >> name;

If you respond to the request by typing Liz, the extraction operator places the characters Liz\0 in the name array. (As usual, \0 represents the terminating null character.) The name identifier, being the name of a char array, acts as the address of the array's first element, making name type char * (pointer-to-char).
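One caution: nothing in cin >> name tells the operator how big the array is, so a long word can overrun it. A common safeguard is the setw() manipulator from iomanip, which caps the number of characters stored. The following sketch assumes a 20-element array as above; the helper name read_name_length is hypothetical, and a std::istringstream stands in for cin.

```cpp
#include <cstring>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: read one word into a 20-char array without overflow
// and return how many characters were stored.
std::size_t read_name_length(const std::string& text)
{
    std::istringstream in(text);   // stands in for cin
    char name[20];
    name[0] = '\0';
    in >> std::setw(20) >> name;   // stores at most 19 characters plus '\0'
    return std::strlen(name);
}
```

A three-letter name is stored whole, while a word longer than 19 characters is cut off at 19, with the remainder left in the input stream.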

The fact that each extraction operator returns a reference to the invoking object lets you concatenate input, just as you can concatenate output:

char name[20];
float fee;
int group;
cin >> name >> fee >> group;

Here, for example, the cin object returned by cin >> name becomes the object handling fee.

How `cin >>` Views Input

The various versions of the extraction operator share a common way of looking at the input stream. They skip over white space (blanks, newlines, and tabs) until they encounter a nonwhite-space character. This is true even for the single-character modes (those in which the argument is type char, unsigned char, or signed char), which is not true of C's character input functions (see Figure 17.5). In the single-character modes, the >> operator reads that character and assigns it to the indicated location. In the other modes, the operator reads in one unit of the indicated type. That is, it reads everything from the initial nonwhite-space character up to the first character that doesn't match the destination type.

Figure 17.5. `cin >>` skips over whitespace.

graphics/17fig05.gif

For example, consider the following code:

int elevation;
cin >> elevation;

Suppose you type the following characters:

-123Z

The operator will read the -, 1, 2, and 3 characters, because they are all valid parts of an integer. But the Z character isn't valid, so the last character accepted for input is the 3. The Z remains in the input stream, and the next cin statement will start reading at that point. Meanwhile, the operator converts the character sequence -123 to an integer value and assigns it to elevation.
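The following sketch checks that behavior. It reads an int from a std::istringstream (standing in for cin) and then uses get() to report the character left waiting in the stream. The helper name char_after_int and its parameters are hypothetical.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: extract an int, then report the next unread character.
// Sets ok to false if no integer could be read.
char char_after_int(const std::string& text, int& value, bool& ok)
{
    std::istringstream in(text);   // stands in for cin
    ok = !(in >> value).fail();    // formatted read stops at the first misfit
    in.clear();                    // allow get() even if the extraction failed
    char next = '\0';
    in.get(next);                  // unformatted read: takes whatever follows
    return next;
}
```

For the input -123Z the helper stores -123 and returns Z, confirming that the Z was never removed from the stream.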

It can happen that input fails to meet a program's expectation. For example, suppose you entered Zcar instead of -123Z. In that case, the extraction operator sets failbit, and the stream object it returns tests as false. (More technically, an if or while statement evaluates an istream object as false if it's had an error state set; we'll discuss this in more depth later in this chapter.) Under the original C++ standard the operator leaves the value of elevation unchanged; since C++11 a failed numeric extraction stores 0 in the target instead. The false result allows a program to check whether input meets the program requirements, as Listing 17.11 shows.

Listing 17.11 `check_it.cpp`

// check_it.cpp
#include <iostream>
using namespace std;

int main()
{
    cout.precision(2);
    cout << showpoint << fixed;
    cout << "Enter numbers: ";

    double sum = 0.0;
    double input;
    while (cin >> input)
    {
        sum += input;
    }

    cout << "Last value entered = " << input << "\n";
    cout << "Sum = " << sum << "\n";
    return 0;
}

Compatibility Note

graphics/hands.gif

If your compiler doesn't support the showpoint and fixed manipulators, use the setf() equivalents.

Here's the output when some inappropriate input (-123Z) sneaks into the input stream:

Enter numbers: 200.0
1.0E1 -50 -123Z 60
Last value entered = -123.00
Sum = 37.00

Because input is buffered, the second line of keyboard input values didn't get sent to the program until we typed <Enter> at the end of the line. But the loop quit processing input at the Z character, because it didn't match any of the floating-point formats. The failure of input to match the expected format, in turn, caused the expression cin >> input to evaluate to false, thus terminating the while loop.

Stream States

Let's take a closer look at what happens for inappropriate input. A cin or cout object contains a data member (inherited from the ios_base class) that describes the stream state. A stream state (defined as type iostate, which, in turn, is a bitmask type, such as described earlier) consists of the three ios_base elements: eofbit, badbit, and failbit. Each element is a single bit that can be 1 (set) or 0 (cleared). When a cin operation reaches the end of a file, it sets the eofbit. When a cin operation fails to read the expected characters, as in the earlier example, it sets the failbit. I/O failures, such as trying to read a non-accessible file or trying to write to a write-protected diskette, also can set failbit to 1. The badbit element is set when some undiagnosed failure may have corrupted the stream. (Implementations don't necessarily agree about which events set failbit and which set badbit.) When all three of these state bits are set to 0, everything is fine. Programs can check the stream state and use that information to decide what to do next. Table 17.5 lists these bits along with some ios_base methods that report or alter the stream state. (Older compilers don't provide the two exceptions() methods.)

Table 17.5. Stream States
Member	Description
`eofbit`	Set to 1 if end-of-file reached.
`badbit`	Set to 1 if the stream may be corrupted; for example, there could have been a file read error.
`failbit`	Set to 1 if an input operation failed to read the expected characters or an output operation failed to write the expected characters.
`goodbit`	Just another way of saying 0.
`good()`	Returns `true` if the stream can be used (all bits are cleared).
`eof()`	Returns `true` if `eofbit` is set.
`bad()`	Returns `true` if `badbit` is set.
`fail()`	Returns `true` if `badbit` or `failbit` is set.
`rdstate()`	Returns the stream state.
`exceptions()`	Returns a bit mask identifying which flags cause an exception to be thrown.
`exceptions(iostate ex)`	Sets which states will cause `clear()` to throw an exception; for example, if `ex` is `eofbit`, then `clear()` will throw an exception if `eofbit` is set.
`clear(iostate s)`	Sets the stream state to `s`; the default for `s` is 0 (`goodbit`); throws an `ios_base::failure` exception if `(rdstate() & exceptions()) != 0`.
`setstate(iostate s)`	Calls `clear(rdstate() | s)`. This sets stream state bits corresponding to those bits set in `s`; other stream state bits are left unchanged.

Setting States

Two of the methods in Table 17.5, clear() and setstate(), are similar. Both reset the state, but in a different fashion. The clear() method sets the state to its argument. Thus, the call

clear();

uses the default argument of 0, which clears all three state bits (eofbit, badbit, and failbit). Similarly, the call

clear(eofbit);

makes the state equal to eofbit; that is, the eofbit is set and the other two state bits are cleared.

The setstate() method, however, affects only those bits that are set in its argument. Thus, the call

setstate(eofbit);

sets eofbit without affecting the other bits. So if failbit were already set, it stays set.
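The difference is easy to check. The sketch below starts with a stream whose failbit is already set, applies one of the two calls, and returns the resulting state. The helper name state_after is hypothetical, and a std::istringstream is used only because it is a convenient stream object to experiment on.

```cpp
#include <ios>
#include <sstream>

// Hypothetical helper: start from a stream that already has failbit set,
// then apply either setstate(eofbit) or clear(eofbit) and return the state.
std::ios_base::iostate state_after(bool use_setstate)
{
    std::istringstream in;                     // any stream object would do
    in.setstate(std::ios_base::failbit);       // pretend an earlier read failed
    if (use_setstate)
        in.setstate(std::ios_base::eofbit);    // adds eofbit, keeps failbit
    else
        in.clear(std::ios_base::eofbit);       // state becomes exactly eofbit
    return in.rdstate();
}
```

After setstate(eofbit) the state holds both failbit and eofbit; after clear(eofbit) it holds eofbit alone.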

Why would you reset the stream state? For a program writer, the most common reason is to use clear() with no argument to reopen input after encountering mismatched input or end-of-file; whether or not doing so makes sense depends on what the program is trying to accomplish. You'll see some examples shortly. The main purpose for setstate() is to provide a means for input and output functions to change the state. For example, if num is an int, the call

cin >> num;  // read an int

can result in operator>>(int &) using setstate() to set failbit or eofbit.

I/O and Exceptions

Suppose, say, an input function sets eofbit. Does this cause an exception to be thrown? By default, the answer is no. However, you can use the exceptions() method to control how exceptions are handled.

First, here's some background. The exceptions() method returns a bitfield with three bits corresponding to eofbit, failbit, and badbit. Changing the stream state involves either clear() or setstate(), which uses clear(). After changing the stream state, the clear() method compares the current stream state to the value returned by exceptions(). If a bit is set in the return value and the corresponding bit is set in the current state, clear() throws an ios_base::failure exception. This would happen, for example, if both values had badbit set. It follows that if exceptions() returns goodbit, no exceptions are thrown. The ios_base::failure exception class derives from the std::exception class and thus has a what() method.

The default setting for exceptions() is goodbit, that is, no exceptions thrown. However, the overloaded exceptions(iostate) function gives you control over the behavior:

cin.exceptions(ios_base::badbit);  // setting badbit causes exception to be thrown

The bitwise OR operator (|), as discussed in Appendix E, allows you to specify more than one bit. For example, the statement

cin.exceptions(ios_base::badbit | ios_base::eofbit);

results in an exception being thrown if either badbit or eofbit subsequently is set.
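As a small sketch of the same idea applied to failbit, the following helper enables exceptions for failbit and badbit on a std::istringstream (standing in for cin) and turns a thrown ios_base::failure into a false return value. The helper name parse_or_catch is hypothetical.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: returns true and stores the number if text starts
// with an integer; returns false if the stream threw ios_base::failure.
bool parse_or_catch(const std::string& text, int& out)
{
    std::istringstream in(text);               // stands in for cin
    in.exceptions(std::ios_base::failbit | std::ios_base::badbit);
    try {
        in >> out;                             // throws if failbit or badbit gets set
        return true;
    } catch (const std::ios_base::failure&) {
        return false;                          // mismatched input
    }
}
```

Note that reading 42 from the text "42" reaches the end of the string and so sets eofbit, but because eofbit isn't in the exceptions mask, no exception results.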

Listing 17.12 modifies Listing 17.11 so that the program throws and catches an exception if failbit is set.

Listing 17.12 `cinexcp.cpp`

// cinexcp.cpp
#include <iostream>
#include <exception>
using namespace std;

int main()
{
    // have failbit cause an exception to be thrown
    cin.exceptions(ios_base::failbit);
    cout.precision(2);
    cout << showpoint << fixed;
    cout << "Enter numbers: ";
    double sum = 0.0;
    double input;
    try {
        while (cin >> input)
        {
            sum += input;
        }
    } catch(ios_base::failure & bf)
    {
        cout << bf.what() << endl;
        cout << "O! the horror!\n";
    }

    cout << "Last value entered = " << input << "\n";
    cout << "Sum = " << sum << "\n";
    return 0;
}

Here is a sample run; the what() message depends upon the implementation:

Enter numbers: 20 30 40 pi 6
ios_base failure in clear
O! the horror!
Last value entered = 40.00
Sum = 90.00

Stream State Effects

An if or while test such as

while (cin >> input)

tests as true only if the stream state is good (all bits cleared). If a test fails, you can use the member functions in Table 17.5 to discriminate among possible causes. For example, you could modify the central part of Listing 17.11 to look like this:

while (cin >> input)
{
    sum += input;
}
if (cin.eof())
    cout << "Loop terminated because EOF encountered\n";

Setting a stream state bit has a very important consequence: The stream is closed for further input or output until the bit is cleared. For example, the following code won't work:

while (cin >> input)
{
    sum += input;
}
cout << "Last value entered = " << input << "\n";
cout << "Sum = " << sum << "\n";
cout << "Now enter a new number: ";
cin >> input;   // won't work

If you want a program to read further input after a stream state bit has been set, you have to reset the stream state to good. This can be done by calling the clear() method:

while (cin >> input)
{
    sum += input;
}
cout << "Last value entered = " << input << "\n";
cout << "Sum = " << sum << "\n";
cout << "Now enter a new number: ";
cin.clear();        // reset stream state
while (!isspace(cin.get()))
    continue;     // get rid of bad input
cin >> input;       // will work now

Note that it is not enough to reset the stream state. The mismatched input that terminated the input loop still is in the input queue, and the program has to get past it. One way is to keep reading characters until reaching white space. The isspace() function (see Chapter 6, "Branching Statements and Logical Operators") is a cctype function that returns true if its argument is a white space character. Or you can discard the rest of the line instead of just the next word:

while (cin.get() != '\n')
    continue;  // get rid of rest of line

This example assumes that the loop terminated because of inappropriate input. Suppose, instead, the loop terminated because of end-of-file or because of a hardware failure. Then the new code disposing of bad input makes no sense. You can fix matters by using the fail() method to test whether the assumption was correct. Because, for historical reasons, fail() returns true if either failbit or badbit is set, the code has to exclude the latter case. And because a read that runs into end-of-file sets failbit along with eofbit, the code has to exclude that case, too.

while (cin >> input)
{
    sum += input;
}
cout << "Last value entered = " << input << "\n";
cout << "Sum = " << sum << "\n";
if (cin.fail() && !cin.bad() && !cin.eof()) // failed because of mismatched input
{
    cin.clear();      // reset stream state
    while (!isspace(cin.get()))
        continue;     // get rid of bad input
}
else // else bail out
{
    cout << "I cannot go on!\n";
    exit(1);
}
cout << "Now enter a new number: ";
cin >> input;  // will work now
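The same recovery steps can be folded into the input loop itself. The following sketch, which is one possible arrangement rather than the only one, keeps summing numbers and discards any token that isn't a number. The function name sum_skipping_bad_tokens is hypothetical; it takes any istream, so you can pass it cin or a std::istringstream.

```cpp
#include <cctype>
#include <cstdio>    // EOF
#include <istream>
#include <sstream>   // std::istringstream, for trying the helper without a keyboard

// Hypothetical helper: sum every number in the stream, skipping any
// whitespace-delimited token that is not a valid number.
double sum_skipping_bad_tokens(std::istream& in)
{
    double sum = 0.0;
    double value;
    while (true)
    {
        if (in >> value)
        {
            sum += value;
            continue;
        }
        if (in.eof() || in.bad())
            break;                  // end of input or unrecoverable error
        in.clear();                 // mismatched input: reset the state...
        int c;
        while ((c = in.get()) != EOF && !std::isspace(c))
            continue;               // ...and discard the offending token
    }
    return sum;
}
```

Given the input 10 20 oops 30, the function skips oops and returns 60.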

Other `istream` Class Methods

Chapters 3, 4, "Compound Types," and 5, "Loops and Relational Expressions," discuss the get() and getline() methods. As you may recall, they provide the following additional input capabilities:

The get(char &) and get(void) methods provide single-character input that doesn't skip over white space.
The get(char *, int, char) and getline(char *, int, char) functions read entire lines by default rather than single words.

These are termed unformatted input functions because they simply read character input as it is without skipping over whitespace and without performing data conversions.

Let's look at these two groups of istream class member functions.

Single-Character Input

When used with a char argument or no argument at all, the get() methods fetch the next input character, even if it is a space, tab, or newline character. The get(char & ch) version assigns the input character to its argument, while the get(void) version uses the input character, converted to an integer type, typically int, as its return value.

Let's try get(char &) first. Suppose you have the following loop in a program:

int ct = 0;
char ch;
cin.get(ch);
while (ch != '\n')
{
    cout << ch;
    ct++;
    cin.get(ch);
}
cout << ct << '\n';

Next, suppose you type the following optimistic input:

I C++ clearly.<Enter>

Pressing the <Enter> key sends this input line to the program. The program fragment will first read the I character, display it with cout, and increment ct to 1. Next, it will read the space character following the I, display it, and increment ct to 2. This continues until the program processes the <Enter> key as a newline character and terminates the loop. The main point here is that, by using get(ch), the code reads, displays, and counts the spaces as well as the printing characters.

Suppose, instead, that the program had tried to use >>:

int ct = 0;
char ch;
cin >> ch;
while (ch != '\n')    // FAILS
{
    cout << ch;
    ct++;
    cin >> ch;
}
cout << ct << '\n';

First, the code would skip the spaces, thus not counting them and compressing the corresponding output to this:

IC++clearly.

Worse, the loop would never terminate! Because the extraction operator skips newlines, the code would never assign the newline character to ch, so the while loop test would never terminate the loop.

The get(char &) member function returns a reference to the istream object used to invoke it. This means you can concatenate other extractions following get(char &):

char c1, c2, c3;
cin.get(c1).get(c2) >> c3;

First, cin.get(c1) assigns the first input character to c1 and returns the invoking object, which is cin. This reduces the code to cin.get(c2) >> c3, which assigns the second input character to c2. The function call returns cin, reducing the code to cin >> c3. This, in turn, assigns the next nonwhite-space character to c3. Note that c1 and c2 could wind up being assigned white space, but c3 couldn't.

If cin.get(char &) encounters the end of a file, either real or simulated from the keyboard (<Ctrl>-<Z> for DOS, <Ctrl>-<D> at the beginning of a line for UNIX), it does not assign a value to its argument. This is quite right, for if the program has reached the end of the file, there is no value to be assigned. Furthermore, the method calls setstate(failbit), which causes cin to test as false:

char ch;
while (cin.get(ch))
{
     // process input
}

As long as there's valid input, the return value for cin.get(ch) is cin, which evaluates as true, so the loop continues. Upon reaching end-of-file, the return value evaluates as false, terminating the loop.
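To compare the two approaches side by side, here is a sketch that counts the characters delivered by get(ch) and by >> for the same text, each loop running until end-of-file. The helper names are hypothetical, and a std::istringstream stands in for cin.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: count the characters delivered by get(char &).
int count_with_get(const std::string& text)
{
    std::istringstream in(text);   // stands in for cin
    int ct = 0;
    char ch;
    while (in.get(ch))             // false once end-of-file sets failbit
        ++ct;                      // counts spaces and newlines too
    return ct;
}

// Hypothetical helper: count the characters delivered by operator>>.
int count_with_extraction(const std::string& text)
{
    std::istringstream in(text);
    int ct = 0;
    char ch;
    while (in >> ch)               // skips all white space, including '\n'
        ++ct;
    return ct;
}
```

For the line I C++ clearly. followed by a newline, get(ch) delivers 15 characters, while >> delivers only the 12 that aren't white space.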

The get(void) member function also reads white space, but it uses its return value to communicate input to a program. So you would use it this way:

int ct = 0;
char ch;
ch = cin.get();      // use return value
while (ch != '\n')
{
    cout << ch;
    ct++;
    ch = cin.get();
}
cout << ct << '\n';

Some older C++ implementations don't provide this member function.

The get(void) member function returns type int (or some larger integer type, depending upon the character set and locale). This makes the following invalid:

char c1, c2, c3;
cin.get().get() >> c3;  // not valid

Here cin.get() returns a type int value. Because that return value is not a class object, you can't apply the membership operator to it. Thus, you get a compile-time error. However, you can use get() at the end of an extraction sequence:

char c1;
cin.get(c1).get();   // valid

The fact that get(void) returns type int means you can't follow it with an extraction operator (applied to an int, >> would be treated as the right-shift operator rather than as extraction). But because cin.get(c1) returns cin, it is a suitable prefix to get(). This particular code would read the first input character, assign it to c1, then read the second input character and discard it.

Upon reaching the end-of-file, real or simulated, cin.get(void) returns the value EOF, which is a symbolic constant defined in the cstdio header file and typically made available by the iostream header file as well. This design feature allows the following construction for reading input:

int ch;
while ((ch = cin.get()) != EOF)
{
    // process input
}

You should use type int for ch instead of type char here because the value EOF might not be representable as type char.
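The following sketch shows why the int matters. It counts every byte in a text, including a byte with the value 0xFF, which on many systems would compare equal to EOF if it were first stored in a char. The helper name count_bytes is hypothetical, and a std::istringstream stands in for cin.

```cpp
#include <cstdio>    // EOF
#include <sstream>
#include <string>

// Hypothetical helper: count every byte in the text by reading through
// the int-returning get().
int count_bytes(const std::string& text)
{
    std::istringstream in(text);   // stands in for cin
    int count = 0;
    int ch;                        // int, so EOF is distinguishable from any byte
    while ((ch = in.get()) != EOF)
        ++count;
    return count;
}
```

Because get() returns each character as a nonnegative int, the 0xFF byte arrives as 255 and the loop stops only at the real end of input.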

Chapter 5 describes these functions in a bit more detail, and Table 17.6 summarizes the features of the single-character input functions.

Table 17.6. `cin.get(ch)` Versus `cin.get()`
Property	`cin.get(ch)`	`ch = cin.get()`
Method for conveying input character	Assign to argument `ch`	Use function return value to assign to `ch`
Function return value for character input	Reference to a class `istream` object	Code for character as type `int` value
Function return value at end-of-file	Converts to `false`	`EOF`

Which Form of Single-Character Input?

Given the choice of >>, get(char &), and get(void), which should you use? First, decide whether you want input to skip over white space or not. If skipping white space is more convenient, use the extraction operator >>. For example, skipping white space is convenient for offering menu choices:

cout  << "a. annoy client         b. bill client\n"
      << "c. calm client          d. deceive client\n"
      << "q.\n";
cout  << "Enter a, b, c, d, or q: ";
char ch;
cin >> ch;
while (ch != 'q')
{
    switch(ch)
    {
        ...
    }
    cout << "Enter a, b, c, d, or q: ";
    cin >> ch;
}

To enter, say, a b response, you type b and press <Enter>, generating the two-character response of b\n. If you used either form of get(), you would have to add code to process that \n character each loop cycle, but the extraction operator conveniently skips it. (If you've programmed in C, you've probably encountered the situation in which the newline appears to the program as an invalid response. It's an easy problem to fix, but it is a nuisance.)

If you want a program to examine every character, use one of the get() methods. For example, a word-counting program could use white space to determine when a word came to an end. Of the two get() methods, the get(char &) method has the classier interface. The main advantage of the get(void) method is that it closely resembles the standard C getchar() function, letting you convert a C program to a C++ program by including iostream instead of stdio.h, globally replacing getchar() with cin.get(), and globally replacing C's putchar(ch) with cout.put(ch).

String Input: `getline()`, `get()`, and `ignore()`

Next, let's review the string input member functions introduced in Chapter 4. The getline() member function and the third version of get() both read strings, and both have the same function signature (here simplified from the more general template declaration):

istream & get(char *, int, char = '\n');
istream & getline(char *, int, char = '\n');

The first argument, recall, is the address of the location to place the input string. The second argument is one greater than the maximum number of characters to be read. (The additional character leaves space for the terminating null character used in storing the input as a string.) If you omit the third argument, each function reads up to the maximum characters or until it encounters a newline character, whichever comes first.

For example, the code

char line[50];
cin.get(line, 50);

reads character input into the character array line. The cin.get() function quits reading input into the array after encountering 49 characters or, by default, after encountering a newline character, whichever comes first. The chief difference between get() and getline() is that get() leaves the newline character in the input stream, making it the first character seen by the next input operation, while getline() extracts and discards the newline character from the input stream.

Chapter 4 illustrated using the default form for these two member functions. Now let's look at the final argument, which modifies the function's default behavior. The third argument, which has a default value of '\n', is the termination character. Encountering the termination character causes input to cease even if the maximum number of characters hasn't been reached. So, by default, both methods quit reading input if they reach the end of a line before reading the allotted number of characters. Just as in the default case, get() leaves the termination character in the input queue, while getline() does not.

Listing 17.13 demonstrates how getline() and get() work. It also introduces the ignore() member function. It takes two arguments: a number specifying a maximum number of characters to read and a character that acts as a terminating character for input. For example, the function call

cin.ignore(255, '\n');

reads and discards the next 255 characters or up through the first newline character, whichever comes first. The prototype provides defaults of 1 and EOF for the two arguments, and the function return type is istream &:

istream & ignore(int = 1, int = EOF);

The function returns the invoking object. This lets you concatenate function calls, as in the following:

cin.ignore(255, '\n').ignore(255, '\n');

Here the first ignore() method reads and discards one line, and the second call reads and discards the second line. Together the two functions read through two lines.
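A frequent use of ignore() is to clear out the remainder of a line after reading a number, so that the next line-oriented read starts fresh. Here is a sketch of that pattern. The function name number_then_next_line is hypothetical, and in place of 255 it passes numeric_limits<streamsize>::max(), a common idiom that removes the limit on how many characters may be discarded.

```cpp
#include <istream>
#include <limits>
#include <sstream>   // std::istringstream, for trying the helper without a keyboard
#include <string>

// Hypothetical helper: read a number from the first line, throw away
// whatever else is on that line, then return the whole second line.
std::string number_then_next_line(std::istream& in, int& number)
{
    in >> number;                  // leaves the rest of the line unread
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    char line[80];
    line[0] = '\0';
    in.getline(line, 80);          // starts at the beginning of the next line
    return std::string(line);
}
```

Without the ignore() call, getline() would return the leftover text following the number instead of the second line.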

Now check out Listing 17.13.

Listing 17.13 `get_fun.cpp`

// get_fun.cpp -- using get() and getline()
#include <iostream>
using namespace std;
const int Limit = 255;

int main()
{
    char input[Limit];

    cout << "Enter a string for getline() processing:\n";
    cin.getline(input, Limit, '#');
    cout << "Here is your input:\n";
    cout << input << "\nDone with phase 1\n";

    char ch;
    cin.get(ch);
    cout << "The next input character is " << ch << "\n";

    if (ch != '\n')
        cin.ignore(Limit, '\n');    // discard rest of line

    cout << "Enter a string for get() processing:\n";
    cin.get(input, Limit, '#');
    cout << "Here is your input:\n";
    cout << input << "\nDone with phase 2\n";

    cin.get(ch);
    cout << "The next input character is " << ch << "\n";

    return 0;
}

Compatibility Note

graphics/hands.gif

The Microsoft Visual C++ 6.0 iostream version of getline() has a bug causing the display of the next output line to be delayed until after you enter the data requested by the undisplayed line. The iostream.h version, however, works properly.

Here is a sample program run:

Enter a string for getline() processing:
Please pass
me a #3 melon!
Here is your input:
Please pass
me a
Done with phase 1
The next input character is 3
Enter a string for get() processing:
I still
want my #3 melon!
Here is your input:
I still
want my
Done with phase 2
The next input character is #

Note that the getline() function discards the # termination character in the input, while the get() function does not.
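That difference can be isolated in a few lines. The sketch below reads up to a # with one method or the other and then returns the next character waiting in the stream. The helper name next_after_read is hypothetical, and a std::istringstream stands in for cin.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: read up to '#' with getline() or get(), then
// return the next character waiting in the stream.
char next_after_read(const std::string& text, bool use_getline)
{
    std::istringstream in(text);       // stands in for cin
    char buf[80];
    if (use_getline)
        in.getline(buf, 80, '#');      // extracts and discards the '#'
    else
        in.get(buf, 80, '#');          // leaves the '#' in the stream
    char next = '\0';
    in.get(next);
    return next;
}
```

For the text me a #3 melon!, the getline() version returns 3 and the get() version returns #, matching the sample run.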

Unexpected String Input

Some forms of input for get(char *, int) and getline() affect the stream state. As with the other input functions, encountering end-of-file sets eofbit, and anything that corrupts the stream, such as device failure, sets badbit. Two other special cases are no input and input that meets or exceeds the maximum number of characters specified by the function call. Let's look at those cases now.

If either method fails to extract any characters, the method places a null character into the input string and uses setstate() to set failbit. (Older C++ implementations don't set failbit if no characters are read.) When would a method fail to extract any characters? One possibility is if an input method immediately encounters end-of-file. For get(char *, int), another possibility is if you enter an empty line:

char temp[80];
while (cin.get(temp,80))  // terminates on empty line
      ...

Interestingly, an empty line does not cause getline() to set failbit. That's because getline() still extracts the newline character, even if it doesn't store it. If you want a getline() loop to terminate on an empty line, you can write it this way:

char temp[80];
while (cin.getline(temp,80) && temp[0] != '\0') // terminates on empty line

Now suppose the number of characters in the input queue meets or exceeds the maximum specified by the input method. First, consider getline() and the following code:

char temp[30];
while (cin.getline(temp,30))

The getline() method will read consecutive characters from the input queue, placing them in successive elements of the temp array, until (in order of testing) EOF is encountered, the next character to be read is the newline character, or until 29 characters have been stored. If EOF is encountered, eofbit is set. If the next character to be read is a newline character, that character is read and discarded. And if 29 characters were read, failbit is set, unless the next character is a newline. Thus, an input line of 30 characters or more will terminate input.
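Here is a sketch of that rule using a deliberately tiny buffer of 5 characters, which leaves room for 4 characters plus the null character. The helper name first_line_overflows is hypothetical, and a std::istringstream stands in for cin.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: read the first line of text into a 5-char buffer
// (room for 4 characters plus '\0') and report whether the line was too long.
bool first_line_overflows(const std::string& text)
{
    std::istringstream in(text);   // stands in for cin
    char temp[5];
    in.getline(temp, 5);
    return in.fail() && !in.eof(); // failbit without end-of-file: line too long
}
```

A line of exactly 4 characters followed by a newline does not set failbit, because the newline test comes before the character-count test.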

Now consider the get(char *, int) method. It tests the number of characters first, end-of-file second, and for the next character being a newline third. It does not set the failbit flag if it reads the maximum number of characters. Nonetheless, you can tell if too many input characters caused the method to quit reading. You can use peek() (see the next section) to examine the next input character. If it's a newline, then get() must have read the entire line. If it's not a newline, then get() must have stopped before reaching the end. This technique doesn't necessarily work with getline() because getline() reads and discards the newline, so looking at the next character doesn't tell you anything. But if you use get(), you have the option of doing something if less than an entire line is read. The next section includes an example of this approach. Meanwhile, Table 17.7 summarizes some of the differences between older C++ input methods and the current standard.

Table 17.7. Changes in Input Behavior
Method	Older C++	Current C++
`getline()`	Doesn't set `failbit` if no characters are read.	Sets `failbit` if no characters are read (but newline counts as a character read).
	Doesn't set `failbit` if maximum number of characters are read	Sets `failbit` if maximum number of characters read and more are still left in the line.
`get(char *, int)`	Doesn't set `failbit` if no characters are read.	Sets `failbit` if no characters are read.
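The get()-plus-peek() technique for detecting an overlong line might be sketched as follows. The function name read_line_truncated is hypothetical. It also clears the failbit that get() sets for an empty line and discards whatever remains of the line, so that the next call starts on a new line.

```cpp
#include <cstdio>    // EOF
#include <istream>
#include <sstream>   // std::istringstream, for trying the helper without a keyboard

// Hypothetical helper: read up to size - 1 characters of the current line
// into buf and return true if part of the line was left unread.
bool read_line_truncated(std::istream& in, char* buf, int size)
{
    in.get(buf, size);                 // stops before '\n', leaving it in the stream
    if (in.fail() && !in.eof() && !in.bad())
        in.clear();                    // an empty line sets failbit; recover from that
    int next = in.peek();              // look without extracting
    bool truncated = (next != '\n' && next != EOF);
    while (in.good() && in.get() != '\n')
        continue;                      // discard the rest of the line and its newline
    return truncated;
}
```

With a 6-character buffer, the line hello world is reported as truncated after hello is stored, while a following line of next is read whole.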

Other `istream` Methods

Other istream methods include read(), peek(), gcount(), and putback(). The read() function reads a given number of bytes, storing them in the specified location. For example, the code

char gross[144];
cin.read(gross, 144);

reads 144 characters from the standard input and places them in the gross array. Unlike getline() and get(), read() does not append a null character to input, so it doesn't convert input to string form. The read() method is not primarily intended for keyboard input. Instead, it most often is used in conjunction with the ostream write() function for file input and output. The method's return type is istream &, so it can be concatenated as follows:

char gross[144];
char score[20];
cin.read(gross, 144).read(score, 20);
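Because read() doesn't mark the end of what it stored, a program usually needs the gcount() method, described later in this section, to learn how many bytes actually arrived. The following sketch shows that pairing. The helper name bytes_delivered and its 16-byte block are hypothetical, and a std::istringstream stands in for cin or a file stream.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: try to read `wanted` bytes (at most 16) from the
// text and return how many were actually delivered.
std::streamsize bytes_delivered(const std::string& text, int wanted)
{
    std::istringstream in(text);   // stands in for cin or a file stream
    char block[16];
    if (wanted > 16)
        wanted = 16;               // never read past the end of block
    in.read(block, wanted);        // no '\0' is appended
    return in.gcount();            // a short count means end-of-file came first
}
```

If fewer bytes are available than requested, read() stores what it can and sets both eofbit and failbit, and gcount() reports the smaller number.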

The peek() function returns the next character from input without extracting from the input stream. That is, it lets you peek at the next character. Suppose you wanted to read input up to the first newline or period, whichever comes first. You can use peek() to peek at the next character in the input stream in order to judge whether to continue or not:

char great_input[80];
int ch;      // int, so that the EOF value returned by peek() is distinguishable
int i = 0;
while (i < 79 && (ch = cin.peek()) != '.' && ch != '\n' && ch != EOF)
    cin.get(great_input[i++]);
great_input[i] = '\0';

The call to cin.peek() peeks at the next input character and assigns its value to ch. Then the while loop test condition checks that ch is neither a period nor a newline, that end-of-file hasn't been reached, and that the array still has room. If this is the case, the loop reads the character into the array, and updates the array index. When the loop terminates, the period or newline character remains in the input stream, positioned to be the first character read by the next input operation. Then the code appends a null character to the array, making it a string.

The gcount() method returns the number of characters read by the last unformatted extraction method. That means characters read by a get(), getline(), ignore(), or read() method but not by the extraction operator (>>), which formats input to fit particular data types. For example, suppose you've just used cin.get(myarray, 80) to read a line into the myarray array and want to know how many characters were read. You could use the strlen() function to count the characters in the array, but it would be quicker to use cin.gcount() to report how many characters were just read from the input stream.

The putback() function inserts a character back in the input stream. The inserted character then becomes the first character read by the next input statement. The putback() method takes one char argument, which is the character to be inserted, and it returns type istream &, which allows the call to be concatenated with other istream methods. Using peek() is like using get() to read a character, then using putback() to place the character back in the input stream. However, putback() gives you the option of putting back a character different from the one just read.
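Looking ahead is handy when the next character determines how the input should be read. As a sketch, the following function uses peek() to decide whether the next item is a number or a word. The function name read_number_or_word is hypothetical, and it stores the word in a std::string rather than a char array for brevity.

```cpp
#include <cctype>
#include <cstdio>    // EOF
#include <istream>
#include <sstream>   // std::istringstream, for trying the helper without a keyboard
#include <string>

// Hypothetical helper: read the next item as a number if it starts with a
// digit, otherwise as a word. Returns true if a number was read.
bool read_number_or_word(std::istream& in, int& number, std::string& word)
{
    in >> std::ws;                         // skip leading white space
    int next = in.peek();                  // look without extracting
    if (next != EOF && std::isdigit(next))
    {
        in >> number;
        return true;
    }
    in >> word;
    return false;
}
```

Called twice on the input 42 apples, the function first reads the number 42 and then the word apples.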

Listing 17.14 uses two approaches to read and echo input up to, but not including, a # character. The first approach reads through the # character and then uses putback() to insert the character back in the input. The second approach uses peek() to look ahead before reading input.

Listing 17.14 `peeker.cpp`

// peeker.cpp -- some istream methods
#include <iostream>
using namespace std;
#include <cstdlib>              // or stdlib.h

int main()
{

//  read and echo input up to a # character
    char ch;

    while(cin.get(ch))          // terminates on EOF
    {
        if (ch != '#')
            cout << ch;
        else
        {
            cin.putback(ch);    // reinsert character
            break;
        }
    }

    if (!cin.eof())
    {
        cin.get(ch);
        cout << '\n' << ch << " is next input character.\n";
    }
    else
    {
        cout << "End of file reached.\n";
        exit(0);
    }

    while(cin.peek() != '#')    // look ahead
    {
        cin.get(ch);
        cout << ch;
    }
    if (!cin.eof())
    {
        cin.get(ch);
        cout << '\n' << ch << " is next input character.\n";
    }
    else
        cout << "End of file reached.\n";

    return 0;
}

Here is a sample run:

I used a #3 pencil when I should have used a #2.
I used a
# is next input character.
3 pencil when I should have used a
# is next input character.

Program Notes

Let's look more closely at some of the code. The first approach uses a while loop to read input. The expression (cin.get(ch)) returns false on reaching the end-of-file condition, so simulating end-of-file from the keyboard terminates the loop. If the # character shows up first, the program puts the character back in the input stream and uses a break statement to terminate the loop.

while(cin.get(ch))            // terminates on EOF
{
    if (ch != '#')
        cout << ch;
    else
    {
        cin.putback(ch);  // reinsert character
        break;
    }
}

The second approach is simpler in appearance:

while(cin.peek() != '#')     // look ahead
{
    cin.get(ch);
    cout << ch;
}

The program peeks at the next character. If it is not the # character, the program reads the next character, echoes it, and peeks at the next character. This continues until the terminating character shows up.

Now let's look, as promised, at an example (Listing 17.15) that uses peek() to determine whether or not an entire line has been read. If only part of a line fits in the input array, the program discards the rest of the line.

Listing 17.15 `truncate.cpp`

// truncate.cpp -- use get() to truncate input line, if necessary
#include <iostream>
using namespace std;
const int SLEN = 10;
inline void eatline() { while (cin.get() != '\n') continue; }
int main()
{
    char name[SLEN];
    char title[SLEN];
    cout << "Enter your name: ";
    cin.get(name,SLEN);
    if (cin.peek() != '\n')
        cout << "Sorry, we only have enough room for "
                << name << endl;
    eatline();
    cout << "Dear " << name << ", enter your title: \n";
    cin.get(title,SLEN);
    if (cin.peek() != '\n')
        cout << "We were forced to truncate your title.\n";
    eatline();
    cout << " Name: " << name
         << "\nTitle: " << title << endl;

    return 0;
}

Here is a sample run:

Enter your name: Stella Starpride
Sorry, we only have enough room for Stella St
Dear Stella St, enter your title:
Astronomer Royal
We were forced to truncate your title.
 Name: Stella St
Title: Astronome

Note that the following code makes sense whether or not the first input statement read the entire line:

while (cin.get() != '\n') continue;

If get() reads the whole line, it still leaves the newline in place, and this code reads and discards the newline. If get() reads just part of the line, this code reads and discards the rest of the line. If you didn't dispose of the rest of the line, the next input statement would begin reading at the beginning of the remaining input on the first input line. With this example, that would have resulted in the program reading the string arpride into the title array.

File Input and Output

Most computer programs work with files. Word processors create document files. Database programs create and search files of information. Compilers read source code files and generate executable files. A file itself is a bunch of bytes stored on some device, perhaps magnetic tape, perhaps an optical disk, floppy disk, or hard disk. Typically, the operating system manages files, keeping track of their locations, their sizes, when they were created, and so on. Unless you're programming on the operating system level, you normally don't have to worry about those things. What you do need is a way to connect a program to a file, a way to have a program read the contents of a file, and a way to have a program create and write to files. Redirection (as discussed earlier in this chapter) can provide some file support, but it is more limited than explicit file I/O from within a program. Also, redirection comes from the operating system, not from C++, so it isn't available on all systems. We'll look now at how C++ deals with explicit file I/O from within a program.

The C++ I/O class package handles file input and output much as it handles standard input and output. To write to a file, you create a stream object and use the ostream methods, such as the << insertion operator or write(). To read a file, you create a stream object and use the istream methods, such as the >> extraction operator or get(). Files require more management than the standard input and output, however. For example, you have to associate a newly opened file with a stream. You can open a file in read-only mode, write-only mode, or read-and-write mode. If you write to a file, you might want to create a new file, replace an old file, or add to an old file. Or you might want to move back and forth through a file. To help handle these tasks, C++ defines several new classes in the fstream (formerly fstream.h) header file, including an ifstream class for file input and an ofstream class for file output. C++ also defines an fstream class for simultaneous file I/O. These classes are derived from the classes in the iostream header file, so objects of these new classes will be able to use the methods you've already learned.

Simple File I/O

Suppose you want a program to write to a file. You must do the following:

Create an ofstream object to manage the output stream.
Associate that object with a particular file.
Use the object the same way you would use cout; the only difference is that output goes to the file instead of to the screen.

To accomplish this, begin by including the fstream header file. Including this file automatically includes the iostream file for most, but not all, implementations, so you may not have to include iostream explicitly. Then declare an ofstream object:

ofstream fout;      // create an ofstream object named fout

The object's name can be any valid C++ name, such as fout, outFile, cgate, or didi.

Next, you must associate this object with a particular file. You can do so by using the open() method. Suppose, for example, you want to open the cookies file for output. You would do the following:

fout.open("cookies");  // associate fout with cookies

You can combine these two steps (creating the object and associating a file) into a single statement by using a different constructor:

ofstream fout("cookies");  // create fout object and associate with cookies

When you've gotten this far, use fout (or whatever name you choose) in the same manner as cout. For example, if you want to put the words Dull Data into the file, you can do the following:

fout << "Dull Data";

Indeed, because ostream is a base class for the ofstream class, you can use all the ostream methods, including the various insertion operator definitions and the formatting methods and manipulators. The ofstream class uses buffered output, so the program allocates space for an output buffer when it creates an ofstream object like fout. If you create two ofstream objects, the program creates two buffers, one for each object. An ofstream object like fout collects output byte-by-byte from the program; then, when the buffer is filled, it transfers the buffer contents en masse to the destination file. Because disk drives are designed to transfer data in larger chunks, not byte-by-byte, the buffered approach greatly speeds up the transfer rate of data from a program to a file.

Opening a file for output this way creates a new file if there is no file of that name. If a file by that name exists prior to opening it for output, the act of opening it truncates it so that output starts with a clean file. Later you'll see how to open an existing file and retain its contents.

Caution

Opening a file for output in the default mode automatically truncates the file to zero size, in effect disposing of the prior contents.

The requirements for reading a file are much like those for writing to a file:

Create an ifstream object to manage the input stream.
Associate that object with a particular file.
Use the object the same way you would use cin.

The steps for doing so are similar to those for writing to a file. First, of course, include the fstream header file. Then declare an ifstream object, and associate it with the filename. You can do so in two statements or one:

// two statements
ifstream fin;              // create ifstream object called fin
fin.open("jellyjar.dat");  // open jellyjar.dat for reading
// one statement
ifstream fis("jamjar.dat"); // create fis and associate with jamjar.dat

You now can use fin or fis much like cin. For example, you can do the following:

char ch;
fin >> ch;               // read a character from the jellyjar.dat file
char buf[80];
fin >> buf;              // read a word from the file
fin.getline(buf, 80);    // read a line from the file

Input, like output, is buffered, so creating an ifstream object like fin creates an input buffer which the fin object manages. As with output, buffering moves data much faster than byte-by-byte transfer.

The connections with a file are closed automatically when the input and output stream objects expire—for example, when the program terminates. Also, you can close a connection with a file explicitly by using the close() method:

fout.close();    // close output connection to file
fin.close();     // close input connection to file

Closing such a connection does not eliminate the stream; it just disconnects it from the file. The stream management apparatus remains in place. For example, the fin object still exists along with the input buffer it manages. As you'll see later, you can reconnect the stream to the same file or to another file.

Meanwhile, let's look at a short example. The program in Listing 17.16 asks you for a filename. It creates a file having that name, writes some information to it, and closes the file. Closing the file flushes the buffer, guaranteeing that the file is updated. Then the program opens the same file for reading and displays its contents. Note that the program uses fin and fout in the same manner as cin and cout.

Listing 17.16 `file.cpp`

// file.cpp -- save to a file
#include <iostream> // not needed for many systems
#include <fstream>
using namespace std;

int main()
{
    char filename[20];

    cout << "Enter name for new file: ";
    cin >> filename;

// create output stream object for new file and call it fout
    ofstream fout(filename);

    fout << "For your eyes only!\n";        // write to file
    cout << "Enter your secret number: ";   // write to screen
    float secret;
    cin >> secret;
    fout << "Your secret number is " << secret << "\n";
    fout.close();           // close file

// create input stream object for new file and call it fin
    ifstream fin(filename);
    cout << "Here are the contents of " << filename << ":\n";
    char ch;
    while (fin.get(ch))     // read character from file and
        cout << ch;         // write it to screen
    cout << "Done\n";
    fin.close();

    return 0;
}

Here is a sample run:

Enter name for new file: pythag
Enter your secret number: 3.14159
Here are the contents of pythag:
For your eyes only!
Your secret number is 3.14159
Done

If you check the directory containing your program, you should find a file named pythag, and any text editor should show the same contents that the program output displayed.

Opening Multiple Files

You might require that a program open more than one file. The strategy for opening multiple files depends upon how they will be used. If you need two files open simultaneously, you must create a separate stream for each file. For example, a program that collates two sorted files into a third file would create two ifstream objects for the two input files and an ofstream object for the output file. The number of files you can open simultaneously depends on the operating system, but it typically is on the order of 20.

However, you may plan to process a group of files sequentially. For example, you might want to count how many times a name appears in a set of ten files. Then you can open a single stream and associate it with each file in turn. This conserves computer resources more effectively than opening a separate stream for each file. To use this approach, declare a stream object without initializing it and then use the open() method to associate the stream with a file. For example, this is how you could handle reading two files in succession:

ifstream fin;           // create stream using default constructor
fin.open("fat.dat");    // associate stream with fat.dat file
...                     // do stuff
fin.close();            // terminate association with fat.dat
fin.clear();            // reset fin (may not be needed)
fin.open("rat.dat");    // associate stream with rat.dat file
...
fin.close();

We'll look at an example shortly, but first, let's examine a technique for feeding a list of files to a program in a manner that allows the program to use a loop to process them.

Command-Line Processing

File-processing programs often use command-line arguments to identify files. Command-line arguments are arguments that appear on the command line when you type a command. For example, to count the number of words in some files on a Unix or Linux system, you would type this command at the command-line prompt:

wc report1 report2 report3

Here wc is the program name, and report1, report2, and report3 are filenames passed to the program as command-line arguments.

C++ has a mechanism for letting a program access command-line arguments. Use the following alternative function heading for main():

int main(int argc, char *argv[])

The argc argument represents the number of arguments on the command line. The count includes the command name itself. The argv variable is a pointer to a pointer to a char. This sounds a bit abstract, but you can treat argv as if it were an array of pointers to the command-line arguments, with argv[0] being a pointer to the first character of a string holding the command name, argv[1] being a pointer to the first character of a string holding the first command-line argument, and so on. That is, argv[0] is the first string from the command line, and so on. For example, suppose you have the following command line:

wc report1 report2 report3

Then argc would be 4, argv[0] would be wc, argv[1] would be report1, and so on. The following loop would print each command-line argument on a separate line:

for (int i = 1; i < argc; i++)
       cout << argv[i] << endl;

Starting with i = 1 just prints the command-line arguments; starting with i = 0 would also print the command name.
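If a program needs to hold on to the arguments rather than print them, the same loop can copy them into a container. The following is a small sketch. The function name collect_args() is hypothetical, and the use of vector and string is a convenience assumed here.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copies the command-line arguments into a vector of
// strings, skipping the command name stored in argv[0].
std::vector<std::string> collect_args(int argc, char * argv[])
{
    std::vector<std::string> args;
    for (int i = 1; i < argc; i++)
        args.push_back(argv[i]);
    return args;
}
```

For the wc command line shown earlier, the returned vector would hold report1, report2, and report3, in that order.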

Command-line arguments, of course, go hand-in-hand with command-line operating systems like DOS, Unix, and Linux. Other setups may still allow you to use command-line arguments:

Many DOS and Windows IDEs (integrated development environments) have an option for providing command-line arguments. Typically, you have to navigate through a series of menu choices leading to a box into which you can type the command-line arguments. The exact set of steps varies from vendor to vendor and from upgrade to upgrade, so check your documentation.
DOS IDEs and many Windows IDEs can produce executable files that run under DOS or in a DOS window in the usual DOS command-line mode.
Under Metrowerks CodeWarrior for the Macintosh, you can simulate command-line arguments by placing the following code in your program:
```
...
#include <console.h> // for emulating command-line arguments
int main(int argc, char * argv[])
{
    argc = ccommand(&argv); // yes, ccommand, not command
    ...
```
When you run the program, the ccommand() function places a dialog box onscreen with a box in which you can type the command-line arguments. It also lets you simulate redirection.

Listing 17.17 combines the command-line technique with file stream techniques to count characters in those files listed on the command line.

Listing 17.17 `count.cpp`

// count.cpp -- count characters in a list of files
#include <iostream>
using namespace std;
#include <fstream>
#include <cstdlib>          // or stdlib.h
// #include <console.h>     // for Macintosh
int main(int argc, char * argv[])
{
    // argc = ccommand(&argv);      // for Macintosh
    if (argc == 1)          // quit if no arguments
    {
        cerr << "Usage: " << argv[0] << " filename[s]\n";
        exit(1);
    }

    ifstream fin;              // open stream
    long count;
    long total = 0;
    char ch;

    for (int file = 1; file < argc; file++)
    {
        fin.open(argv[file]);  // connect stream to argv[file]
        count = 0;
        while (fin.get(ch))
            count++;
        cout << count << " characters in " << argv[file] << "\n";
        total += count;
        fin.clear();           // needed for some implementations
        fin.close();           // disconnect file
    }
    cout << total << " characters in all files\n";

    return 0;
}

Compatibility Note

Some implementations require using fin.clear() while others do not. It depends on whether associating a new file with the ifstream object automatically resets the stream state or not. It does no harm to use fin.clear() even if it isn't needed.

On a DOS system, for example, you could compile Listing 17.17 to an executable file called count.exe. Then sample runs could look like this:

C>count
Usage: c:\count.exe filename[s]
C>count paris rome
3580 characters in paris
4886 characters in rome
8466 characters in all files
C>

Note that the program uses cerr for the error message. A minor point is that the message uses argv[0] instead of count.exe:

cerr << "Usage: " << argv[0] << " filename[s]\n";

This way, if you change the name of the executable file, the program will automatically use the new name.

Suppose you pass a bogus filename to the count program. Then the input statement fin.get(ch) will fail, terminating the while loop immediately, and the program will report 0 characters. But you can modify the program to test whether it succeeded in linking the stream to a file. That's one of the matters we'll take up next.

Stream Checking and `is_open()`

The C++ file stream classes inherit a stream-state member from the ios_base class. This member, as discussed earlier, stores information reflecting the stream status: all is well, end-of-file has been reached, I/O operation failed, and so on. If all is well, the stream state is zero (no news is good news). The various other states are recorded by setting particular bits to 1. The file stream classes also inherit the ios_base methods that report about the stream state and that were summarized earlier in Table 17.5. You can monitor conditions with these stream-state methods. For example, you can use the good() method to see that all the stream state bits are clear. However, newer C++ implementations have a better way to check if a file has been opened—the is_open() method. You can modify the program in Listing 17.17 so that it reports bogus filenames and then skips to the next file by adding a call to fin.is_open() to the for loop as follows:

for (int file = 1; file < argc; file++)
{
    fin.open(argv[file]);

// Add this
    if (!fin.is_open())
    {
        cerr << "Couldn't open file " << argv[file] << "\n";
        fin.clear();   // reset failbit
        continue;
    }
// End of addition
    count = 0;
    while (fin.get(ch))
        count++;
    cout << count << " characters in " << argv[file] << "\n";
    total += count;
    fin.clear();
    fin.close();    // disconnect file
}

The fin.is_open() call returns false if the fin.open() call fails. In that case, the program warns you of its problem, and the continue statement causes the program to skip the rest of the for loop cycle and start with the next cycle.
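The same test can be packaged so that a caller can tell a missing file from an empty one. This is a minimal sketch. The function name count_chars() and the choice of -1 as the "couldn't open" value are assumptions made for illustration.

```cpp
#include <fstream>

// Hypothetical helper: returns the number of characters in the named file,
// or -1 if the file cannot be opened.
long count_chars(const char * filename)
{
    std::ifstream fin(filename);
    if (!fin.is_open())
        return -1;          // distinguishes "no such file" from "empty file"
    long count = 0;
    char ch;
    while (fin.get(ch))
        count++;
    return count;
}
```

Because the stream is a local object, it is closed automatically when the function returns, so no clear() or close() call is needed between files.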

Caution

In the past, the usual tests for successful opening of a file were the following:

if(!fin.good()) ... // failed to open
if (!fin) ...       // failed to open

The fin object, when used in a test condition, is converted to false if fin.fail() is true (that is, if failbit or badbit is set) and to true otherwise. The good() method also returns false when only eofbit is set, but immediately after an attempt to open a file the two tests give the same result. However, on some implementations these tests fail to detect one circumstance, which is attempting to open a file using an inappropriate file mode (see the File Modes section). The is_open() method catches this form of error along with those caught by the good() method. However, older implementations do not have is_open().

File Modes

The file mode describes how a file is to be used: read it, write to it, append to it, and so on. When you associate a stream with a file, either by initializing a file stream object with a filename or by using the open() method, you can provide a second argument specifying the file mode:

ifstream fin("banjo", mode1);  // constructor with mode argument
ofstream fout;                 // default constructor (no parentheses)
fout.open("harp", mode2);      // open() with mode argument

The ios_base class defines an openmode type to represent the mode; like the fmtflags and iostate types, it is a bitmask type. (In the old days, it was type int.) You can choose from several constants defined in the ios_base class to specify the mode. Table 17.8 lists the constants and their meanings. C++ file I/O has undergone several changes to make it compatible with ANSI C file I/O.

Table 17.8. File Mode Constants
Constant	Meaning
`ios_base::in`	Open file for reading.
`ios_base::out`	Open file for writing.
`ios_base::ate`	Seek to end-of-file upon opening file.
`ios_base::app`	Append to end-of-file.
`ios_base::trunc`	Truncate file if it exists.
`ios_base::binary`	Binary file.

If the ifstream and ofstream constructors and the open() methods each take two arguments, how have we gotten by using just one in the previous examples? As you probably have guessed, the prototypes for these class member functions provide default values for the second argument (the file mode argument). For example, the ifstream open() method and constructor use ios_base::in (open for reading) as the default value for the mode argument, while the ofstream open() method and constructor use ios_base::out | ios_base::trunc (open for writing and truncate the file) as the default. The bitwise OR operator (|) is used to combine two bit-values into a single value that can be used to set both bits. Standard C++ gives the fstream class a default of ios_base::in | ios_base::out, but some older implementations provide no default for that class, so with them you have to provide a mode explicitly when creating an fstream object.

Note that the ios_base::trunc flag means an existing file is truncated when opened to receive program output; that is, its previous contents are discarded. While this behavior commendably minimizes the danger of running out of disk space, you probably can imagine situations in which you don't want to wipe out a file when you open it. C++, of course, provides other choices. If, for example, you want to preserve the file contents and add (append) new material to the end of the file, you can use the ios_base::app mode:

ofstream fout("bagels", ios_base::out | ios_base::app);

Again, the code uses the | operator to combine modes. So ios_base::out | ios_base::app means to invoke both the out mode and the app mode (see Figure 17.6).
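As a minimal sketch of the append mode in use, the following function adds one line to a file without disturbing what is already there. The name append_line() is hypothetical, and the mode is spelled out in full rather than relying on defaults.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: adds one line to the end of a file, creating the
// file if it does not exist. Returns false if the file cannot be opened.
bool append_line(const char * filename, const std::string & line)
{
    std::ofstream fout(filename, std::ios_base::out | std::ios_base::app);
    if (!fout.is_open())
        return false;
    fout << line << '\n';
    return fout.good();
}
```

Calling append_line("bagels", "sesame") twice leaves two lines in the file, whereas opening the file twice in the default output mode would leave only what the second opening wrote.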

Figure 17.6. Some file-opening modes.

Expect to find some differences among older implementations. For example, some allow you to omit the ios_base::out in the previous example, and some don't. If you aren't using the default mode, the safest approach is to provide all the mode elements explicitly. Some compilers don't support all the choices in Table 17.8, and some may offer choices beyond those in the table. One consequence of these differences is that you may have to make some alterations in the following examples to run them on your system. The good news is that the development of the C++ standard is providing greater uniformity.

Standard C++ defines parts of file I/O in terms of ANSI C standard I/O equivalents. A C++ statement like

ifstream fin(filename, c++mode);

is implemented as if it uses the C fopen() function:

fopen(filename, cmode);

Here c++mode is a type openmode value, such as ios_base::in, and cmode is the corresponding C mode string, such as "r". Table 17.9 shows the correspondence between C++ modes and C modes. Note that ios_base::out by itself causes truncation but that it doesn't cause truncation when combined with ios_base::in. Unlisted combinations, such as ios_base::in | ios_base::trunc, prevent the file from being opened. The is_open() method detects this failure.

Table 17.9. C++ and C File-Opening Modes
C++ mode	C mode	Meaning
`ios_base::in`	`"r"`	Open for reading.
`ios_base::out`	`"w"`	(Same as `ios_base::out | ios_base::trunc`.)
`ios_base::out | ios_base::trunc`	`"w"`	Open for writing, truncating file if it already exists.
`ios_base::out | ios_base::app`	`"a"`	Open for writing, append only.
`ios_base::in | ios_base::out`	`"r+"`	Open for reading and writing, with writing permitted anywhere in the file.
`ios_base::in | ios_base::out | ios_base::trunc`	`"w+"`	Open for reading and writing, first truncating file if it already exists.
`c++mode | ios_base::binary`	`"cmodeb"`	Open in `c++mode` or corresponding `cmode` and in binary mode; for example, `ios_base::in | ios_base::binary` becomes `"rb"`.
`c++mode | ios_base::ate`	`"cmode"`	Open in indicated mode and go to end of file. C uses a separate function call instead of a mode code. For example, `ios_base::in | ios_base::ate` translates to the `"r"` mode and the C function call `fseek(file, 0, SEEK_END)`.

Note that both ios_base::ate and ios_base::app place you (or, more precisely, a file pointer) at the end of the file just opened. The difference between the two is that the ios_base::app mode allows you to add data to the end of the file only, while the ios_base::ate mode merely positions the pointer at the end of the file.
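The difference can be seen in a short sketch. Both functions below are hypothetical helpers written for this illustration. They open an existing file with ios_base::in | ios_base::out | ios_base::ate, using the fstream class mentioned earlier, and add ios_base::binary so that file positions correspond directly to byte counts.

```cpp
#include <fstream>

// Hypothetical helper: opens an existing file with ate and reports where the
// file pointer starts out, or -1 if the file cannot be opened.
long start_position_with_ate(const char * filename)
{
    std::fstream file(filename, std::ios_base::in | std::ios_base::out
                                | std::ios_base::ate | std::ios_base::binary);
    if (!file.is_open())
        return -1;
    std::streamoff pos = file.tellp();  // initially at the end of the file
    return static_cast<long>(pos);
}

// Hypothetical helper: moves from the end back to the start and writes one
// character there. With ate, the write replaces the first byte; a stream
// opened with app would still write at the end of the file.
bool overwrite_first(const char * filename, char ch)
{
    std::fstream file(filename, std::ios_base::in | std::ios_base::out
                                | std::ios_base::ate | std::ios_base::binary);
    if (!file.is_open())
        return false;
    file.seekp(0, std::ios_base::beg);
    file.put(ch);
    return file.good();
}
```

For a file containing the five characters hello, the first function would report a starting position of 5, and overwrite_first() with 'J' would change the contents to Jello.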

Clearly, there are many possible combinations of modes. We'll look at a few representative ones.

Appending to a File

Let's begin with a program that appends data to the end of a file. The program will maintain a file containing a guest list. When the program begins, it will display the current contents of the file, if it exists. It can use the is_open() method after attempting to open the file to check if the file exists. Next, the program will open the file for output using the ios_base::app mode. Then it will solicit input from the keyboard to add to the file. Finally, the program will display the revised file contents. Listing 17.18 illustrates how to accomplish these goals. Note how the program uses the is_open() method to test if the file has been opened successfully.

Compatibility Note

File I/O was perhaps the least standardized aspect of C++ in its earlier days, and many older compilers don't quite conform to the current standard. Some, for example, used modes such as nocreate that are not part of the current standard. Also, only some compilers require the fin.clear() call before opening the same file a second time for reading.

Listing 17.18 `append.cpp`

// append.cpp -- append information to a file
#include <iostream>
using namespace std;
#include <fstream>
#include <cstdlib>      // (or stdlib.h) for exit()

const char * file = "guests.dat";
const int Len = 40;
int main()
{
    char ch;

// show initial contents
    ifstream fin;
    fin.open(file);

    if (fin.is_open())
    {
        cout << "Here are the current contents of the "
             << file << " file:\n";
        while (fin.get(ch))
            cout << ch;
    }
    fin.close();

// add new names
    ofstream fout(file, ios_base::out | ios_base::app);
    if (!fout.is_open())
    {
        cerr << "Can't open " << file << " file for output.\n";
        exit(1);
    }

    cout << "Enter guest names (enter a blank line to quit):\n";
    char name[Len];
    cin.get(name, Len);
    while (name[0] != '\0')
    {
        while (cin.get() != '\n')
            continue;   // get rid of \n and long lines
        fout << name << "\n";
        cin.get(name, Len);
    }
    fout.close();

// show revised file
    fin.clear();    // not necessary for some compilers
    fin.open(file);
    if (fin.is_open())
    {
        cout << "Here are the new contents of the "
             << file << " file:\n";
        while (fin.get(ch))
            cout << ch;
    }
    fin.close();
    cout << "Done.\n";
    return 0;
}

Here's a sample first run. At this point the guests.dat file hasn't been created, so the program doesn't preview the file.

Enter guest names (enter a blank line to quit):
Sylvester Ballone
Phil Kates
Bill Ghan

Here are the new contents of the guests.dat file:
Sylvester Ballone
Phil Kates
Bill Ghan
Done.

Next time the program is run, however, the guests.dat file does exist, so the program does preview the file. Also, note that the new data are appended to the old file contents rather than replacing them.

Here are the current contents of the guests.dat file:
Sylvester Ballone
Phil Kates
Bill Ghan
Enter guest names (enter a blank line to quit):
Greta Greppo
LaDonna Mobile
Fannie Mae

Here are the new contents of the guests.dat file:
Sylvester Ballone
Phil Kates
Bill Ghan
Greta Greppo
LaDonna Mobile
Fannie Mae
Done.

You should be able to read the contents of guests.dat with any text editor, including the editor you use to write your source code.

Binary Files

When you store data in a file, you can store the data in text form or in binary format. Text form means you store everything as text, even numbers. For example, storing the value -2.324216e+07 in text form means storing the 13 characters used to write this number. That requires converting the computer's internal representation of a floating-point number to character form, and that's exactly what the << insertion operator does. Binary format, however, means storing the computer's internal representation of a value. That is, instead of storing characters, store the (typically) 64-bit double representation of the value. For a character, the binary representation is the same as the text representation—the binary representation of the character's ASCII code (or equivalent). For numbers, however, the binary representation is much different from the text representation (see Figure 17.7).
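To make the size comparison concrete, here is a small sketch that produces the text form of a double with the << operator so that its length can be compared with sizeof(double). The function name as_text() is hypothetical, and the scientific format with six digits after the decimal point is an assumption chosen to match the example value.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: returns the characters that << would write for value
// in scientific notation with six digits after the decimal point.
std::string as_text(double value)
{
    std::ostringstream out;
    out << std::scientific << std::setprecision(6) << value;
    return out.str();
}
```

On implementations that write a two-digit exponent, as_text(-2.324216e+07).size() matches the 13 characters counted above, compared with the sizeof(double) bytes (typically 8) that the binary form occupies.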

Figure 17.7. Binary and text representation of a floating-point number.

Each format has its advantages. The text format is easy to read. You can use an ordinary editor or word processor to read and edit a text file. You easily can transfer a text file from one computer system to another. The binary format is more accurate for numbers, because it stores the exact internal representation of a value. There are no conversion errors or round-off errors. Saving data in binary format can be faster because there is no conversion and because you may be able to save data in larger chunks. And the binary format usually takes less space, depending upon the nature of the data. Transferring to another system can be a problem, however, if the new system uses a different internal representation for values. Even different compilers on the same system may use different internal representations. In these cases, you (or someone) may have to write a program to translate one data format to another.

Let's look at a more concrete example. Consider the following structure definition and declaration:

struct planet
{
    char name[20];       // name of planet
    double population;   // its population
    double g;            // its acceleration of gravity
};
planet pl;

To save the contents of the structure pl in text form, you can do this:

ofstream fout("planets.dat", ios_base::app);
fout << pl.name << " " << pl.population << " " << pl.g << "\n";

Note that you have to provide each structure member explicitly by using the membership operator, and you have to separate adjacent data for legibility. If the structure contained, say, 30 members, this could get tedious.

To save the same information in binary format, you can do this:

ofstream fout("planets.dat", ios_base::app | ios_base::binary);
fout.write( (char *) &pl, sizeof pl);

This code saves the entire structure as a single unit, using the computer's internal representation of data. You won't be able to read the file as text, but the information will be stored more compactly and precisely than as text. And it certainly is easier to type the code. This approach made two changes:

It used a binary file mode.
It used the write() member function.

Let's examine these changes more closely.

Some systems, such as DOS, support two file formats: text and binary. If you want to save data in binary form, you'd best use the binary file format. In C++ you do so by using the ios_base::binary constant in the file mode. If you want to know why you should do this on a DOS system, check the discussion in the following note on Binary Files and Text Files.

Binary Files and Text Files

Using a binary file mode causes a program to transfer data from memory to a file, or vice versa, without any hidden translation taking place. Such is not necessarily the case for the default text mode. For example, consider DOS text files. They represent a newline with a two-character combination: carriage return, linefeed. Macintosh text files represent a newline with a carriage return. Unix and Linux files represent a newline with a linefeed. C++, which grew up on Unix, also represents a newline with a linefeed. For portability, a DOS C++ program automatically translates the C++ newline to a carriage return, linefeed when writing to a text mode file; and a Macintosh C++ program translates the newline to a carriage return when writing to a file. When reading a text file, these programs convert the local newline back to the C++ form. The text format can cause problems with binary data, for a byte in the middle of a double value could have the same bit pattern as the ASCII code for the newline character. Also there are differences in how end-of-file is detected. So you should use the binary file mode when saving data in binary format. (Unix systems have just one file mode, so on them the binary mode is the same as the text mode.)

To save data in binary form instead of text form, you can use the write() member function. This method, recall, copies a specified number of bytes from memory to a file. We used it earlier to copy text, but it will copy any type of data byte-by-byte with no conversion. For example, if you pass it the address of a long variable and tell it to copy 4 bytes, it will copy the 4 bytes constituting the long value verbatim to a file and not convert it to text. The only awkwardness is that you have to type cast the address to type pointer-to-char. You can use the same approach to copy an entire planet structure. To get the number of bytes, use the sizeof operator:

fout.write( (char *) &pl, sizeof pl);

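This statement goes to the address of the pl structure and copies sizeof pl bytes, beginning at this address, to the file connected to fout. The value of sizeof pl is 36 if the members are packed without padding (20 + 8 + 8); many current compilers insert 4 bytes of alignment padding after name and report 40 instead.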

To recover the information from a file, use the corresponding read() method with an ifstream object:

ifstream fin("planets.dat", ios_base::binary);
fin.read((char *) &pl, sizeof pl);

This copies sizeof pl bytes from the file to the pl structure. This same approach can be used with classes that don't use virtual functions. In that case, just the data members are saved, not the methods. If the class does have virtual methods, then a hidden pointer to a table of pointers to virtual functions also is copied. Because the next time you run the program it might locate the virtual function table at a different location, copying old pointer information into objects from a file can create havoc. (Also, see the note in Programming Exercise 6.)
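One way to guard against dumping an unsuitable class is to let the compiler check the type. The following sketch assumes a C++11 compiler, which is newer than the tools this chapter was written for; write_record() and read_record() are hypothetical helper names, not part of the library. The static_assert rejects any type that is not trivially copyable, which includes classes with virtual functions.

```cpp
#include <iostream>
#include <sstream>
#include <type_traits>

struct planet
{
    char name[20];       // name of planet
    double population;   // its population
    double g;            // its acceleration of gravity
};

// Write the raw bytes of rec, but only for types that are safe to copy
// byte-by-byte.
template <typename T>
std::ostream & write_record(std::ostream & os, const T & rec)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw write() needs a trivially copyable type");
    return os.write(reinterpret_cast<const char *>(&rec), sizeof rec);
}

// Read sizeof rec raw bytes back into rec, with the same restriction.
template <typename T>
std::istream & read_record(std::istream & is, T & rec)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw read() needs a trivially copyable type");
    return is.read(reinterpret_cast<char *>(&rec), sizeof rec);
}
```

A call such as write_record(fout, pl) then behaves like the write() statement shown earlier, while an attempt to pass an object with a virtual method fails to compile instead of corrupting data at run time.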

Tip

The read() and write() member functions complement each other. Use read() to recover data that has been written to a file with write().

Listing 17.19 uses these methods to create and read a binary file. In form, the program is similar to Listing 17.18, but it uses write() and read() instead of the insertion operator and the get() method. It also uses manipulators to format the screen output.

Compatibility Note

Although the binary file concept is part of ANSI C, some C and C++ implementations do not provide support for the binary file mode. The reason for this oversight is that some systems only have one file type in the first place, so you can use binary operations such as read() and write() with the standard file format. Therefore, if your implementation rejects ios_base::binary as a valid constant, just omit it from your program. If your implementation doesn't support the fixed and right manipulators, you can use cout.setf(ios_base::fixed, ios_base::floatfield) and cout.setf(ios_base::right, ios_base::adjustfield). Also, you may have to substitute ios for ios_base. Other compilers, particularly older ones, may have other idiosyncrasies.

Listing 17.19 `binary.cpp`

// binary.cpp -- binary file I/O
#include <iostream> // not required by most systems
using namespace std;
#include <fstream>
#include <iomanip>
#include <cstdlib>  // (or stdlib.h) for exit()

inline void eatline() { while (cin.get() != '\n') continue; }
struct planet
{
    char name[20];      // name of planet
    double population;  // its population
    double g;           // its acceleration of gravity
};

const char * file = "planets.dat";

int main()
{
    planet pl;
    cout << fixed << right;

// show initial contents
    ifstream fin;
    fin.open(file, ios::in |ios::binary);   // binary file
    //NOTE: some systems don't accept the ios::binary mode
    if (fin.is_open())
    {
    cout << "Here are the current contents of the "
        << file << " file:\n";
    while (fin.read((char *) &pl, sizeof pl))
        {
        cout << setw(20) << pl.name << ": "
              << setprecision(0) << setw(12) << pl.population
              << setprecision(2) << setw(6) << pl.g << "\n";
        }
    }
    fin.close();

// add new data
    ofstream fout(file, ios::out | ios::app | ios::binary);
    //NOTE: some systems don't accept the ios::binary mode
    if (!fout.is_open())
    {
        cerr << "Can't open " << file << " file for output:\n";
        exit(1);
    }

    cout << "Enter planet name (enter a blank line to quit):\n";
    cin.get(pl.name, 20);
    while (pl.name[0] != '\0')
    {
        eatline();
        cout << "Enter planetary population: ";
        cin >> pl.population;
        cout << "Enter planet's acceleration of gravity: ";
        cin >> pl.g;
        eatline();
        fout.write((char *) &pl, sizeof pl);
        cout << "Enter planet name (enter a blank line "
                "to quit):\n";
        cin.get(pl.name, 20);
    }
    fout.close();

// show revised file
    fin.clear();    // not required for some implementations, but won't hurt
    fin.open(file, ios::in | ios::binary);
    if (fin.is_open())
    {
    cout << "Here are the new contents of the "
        << file << " file:\n";
    while (fin.read((char *) &pl, sizeof pl))
        {
        cout << setw(20) << pl.name << ": "
              << setprecision(0) << setw(12) << pl.population
              << setprecision(2) << setw(6) << pl.g << "\n";
        }
    }
    fin.close();
    cout << "Done.\n";
    return 0;
}

Here is a sample initial run:

Enter planet name (enter a blank line to quit):
Earth
Enter planetary population: 5932000000
Enter planet's acceleration of gravity: 9.81
Enter planet name (enter a blank line to quit):

Here are the new contents of the planets.dat file:
               Earth:   5932000000  9.81
Done.

And here is a sample follow-up run:

Here are the current contents of the planets.dat file:
               Earth:   5932000000  9.81
Enter planet name (enter a blank line to quit):
Bill's Planet
Enter planetary population: 23020020
Enter planet's acceleration of gravity: 8.82
Enter planet name (enter a blank line to quit):

Here are the new contents of the planets.dat file:
               Earth:   5932000000  9.81
       Bill's Planet:     23020020  8.82
Done.

You've already seen the major features of the program, but let's reexamine an old point. The program uses this code (in the form of the inline eatline() function) after reading the planet's g value:

while (cin.get() != '\n') continue;

This reads and discards input up through the newline character. Consider the next input statement in the loop:

cin.get(pl.name, 20);

If the newline had been left in place, this statement would read the newline as an empty line, terminating the loop.
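You can watch this happen without typing at the keyboard. The sketch below uses a hypothetical function, number_then_name(), that works with any istream, so an istringstream can play the role of cin. The flag controls whether the rest of the number's line is discarded before the name is read.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Read a number and then a name, mimicking the end of one loop cycle
// and the start of the next in binary.cpp.
std::string number_then_name(std::istream & in, bool discard_rest)
{
    double g = 0.0;
    in >> g;                     // leaves the newline in the stream
    if (discard_rest)
        while (in.get() != '\n' && in)
            continue;            // same idea as eatline()
    char name[20];
    in.get(name, 20);            // stops at '\n' without removing it
    return name;                 // empty if get() found no characters
}
```

Given the input "9.81\nMars\n", the function returns "Mars" when discard_rest is true. When it is false, get() meets the leftover newline at once, stores an empty string, and sets failbit.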

Random Access

For our last example, let's look at random access. This means moving directly to any location in the file instead of moving through it sequentially. The random access approach is often used with database files. A program will maintain a separate index file giving the location of data in the main data file. Then it can jump directly to that location, read the data there, and perhaps modify it. This approach is done most simply if the file consists of a collection of equal-sized records. Each record represents a related collection of data. For example, in the preceding example, each file record would represent all the data about a particular planet. A file record corresponds rather naturally to a program structure or class.

We'll base the example on the binary file program in Listing 17.19, taking advantage of the fact that the planet structure provides a pattern for a file record. To add to the creative tension of programming, the example will open the file in a read-and-write mode so that it can both read and modify a record. You can do this by creating an fstream object. The fstream class derives from the iostream class, which, in turn, is based on both istream and ostream classes, so it inherits the methods of both. It also inherits two buffers, one for input and one for output, and synchronizes the handling of the two buffers. That is, as the program reads the file or writes to it, it moves both an input pointer in the input buffer and an output pointer in the output buffer in tandem.

The example will do the following:

Display the current contents of the planets.dat file.
Ask which record you want to modify.
Modify that record.
Show the revised file.

A more ambitious program would use a menu and a loop to let you select from this list of actions indefinitely, but this version will perform each action just once. This simplified approach allows you to examine several aspects of read-write files without getting bogged down in matters of program design.

Caution

This program assumes that the planets.dat file already exists and was created by the binary.cpp program.

The first question to answer is what file mode to use. In order to read the file, you need the ios_base::in mode. For binary I/O, you need the ios_base::binary mode. (Again, on some non-standard systems you can omit, indeed, you may have to omit, this mode.) In order to write to the file, you need the ios_base::out or the ios_base::app mode. However, the append mode allows a program to add data to the end of the file only. The rest of the file is read-only; that is, you can read the original data, but not modify it, so you have to use ios_base::out. As Table 17.9 indicates, using the in and out modes simultaneously provides a read/write mode, so you just have to add the binary element. As mentioned earlier, you use the | operator to combine modes. Thus, you need the following statement to set up business:

finout.open(file,ios_base::in | ios_base::out | ios_base::binary);

Next, you need a way to move through a file. The fstream class inherits two methods for this: seekg() moves the input pointer to a given file location, and seekp() moves the output pointer to a given file location. (Actually, because the fstream class uses buffers for intermediate storage of data, the pointers point to locations in the buffers, not in the actual file.) You also can use seekg() with an ifstream object and seekp() with an ofstream object. Here are the seekg() prototypes:

basic_istream<charT,traits>& seekg(off_type, ios_base::seekdir);
basic_istream<charT,traits>& seekg(pos_type);

As you can see, they are templates. This chapter will use a template specialization for the char type. For the char specialization, the two prototypes are equivalent to the following:

istream & seekg(streamoff, ios_base::seekdir);
istream & seekg(streampos);

The first prototype represents locating a file position measured, in bytes, as an offset from a file location specified by the second argument. The second prototype represents locating a file position measured in bytes from the beginning of a file.

Type Escalation

When C++ was young, life was simpler for the seekg() methods. The streamoff and streampos types were typedefs for some standard integer type, such as long. However, the quest for creating a portable standard had to deal with the realization that an integer argument might not provide enough information for some file systems, so streamoff and streampos were allowed to be structure or class types so long as they allowed some basic operations, such as using an integer value as an initialization value. Next, the old istream class was replaced with the basic_istream template, and streampos and streamoff were replaced with template-based types pos_type and off_type. However, streampos and streamoff continue to exist as char specializations of pos_type and off_type. Similarly, you can use wstreampos and wstreamoff types if you use seekg() with a wistream object.

Let's take a look at the arguments to the first prototype of seekg(). The streamoff argument measures an offset, in bytes, from one of three locations in a file. (The type may be defined as an integral type or as a class.) The seekdir argument has a type defined, along with three possible values, in the ios_base class. The constant ios_base::beg means measure the offset from the beginning of the file. The constant ios_base::cur means measure the offset from the current position. The constant ios_base::end means measure the offset from the end of the file. Here are some sample calls, assuming fin is an ifstream object:

fin.seekg(30, ios_base::beg);    // 30 bytes beyond the beginning
fin.seekg(-1, ios_base::cur);    // back up one byte
fin.seekg(0, ios_base::end);     // go to the end of the file
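To see where each kind of seek lands, you can try the calls on a stream whose contents you know. This sketch uses a hypothetical helper, byte_at(), and an istringstream in place of an ifstream; the seekg() interface is the same for both.

```cpp
#include <iostream>
#include <sstream>

// Move the input pointer by off bytes relative to from, then return the
// byte found there. If the seek or the read fails, get() returns the
// end-of-file value instead of a character.
int byte_at(std::istream & in, std::streamoff off,
            std::ios_base::seekdir from)
{
    in.seekg(off, from);
    return in.get();        // get() advances the pointer by one byte
}
```

For a stream holding "abcdefghij", byte_at(in, 3, ios_base::beg) yields 'd' and leaves the pointer at byte 4. A following byte_at(in, -1, ios_base::cur) backs up one byte and yields 'd' again, and byte_at(in, -1, ios_base::end) yields 'j', the last byte.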

Now let's look at the second prototype. Values of the streampos type locate a position in a file. It can be a class, but, if so, the class includes a constructor with a streamoff argument and a constructor with an integer argument, providing a path to convert both types to streampos values. A streampos value represents an absolute location in a file measured from the beginning of the file. You can treat a streampos position as if it measures a file location in bytes from the beginning of a file, with the first byte being byte 0. So the statement

fin.seekg(112);

locates the file pointer at byte 112, which would be the 113th byte in the file. If you want to check the current position of a file pointer, you can use the tellg() method for input streams and the tellp() method for output streams. Each returns a streampos value representing the current position, in bytes, measured from the beginning of the file. When you create an fstream object, the input and output pointers move in tandem, so tellg() and tellp() return the same value. But if you use an ifstream object to manage the input stream and an ofstream object to manage the output stream to the same file, the input and output pointers move independently of one another, and tellg() and tellp() can return different values.
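A common use of tellg() is to find out how many fixed-size records a file holds: seek to the end, ask for the position, and divide by the record size. The following is a sketch under that assumption; count_records() is a hypothetical name, and the function assumes the stream is in a good state when called.

```cpp
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>

// Return the number of complete records of record_size bytes in a
// stream, leaving the input pointer where it was found.
long count_records(std::istream & in, std::size_t record_size)
{
    std::streampos here = in.tellg();       // remember current position
    in.seekg(0, std::ios_base::end);        // jump to the end
    std::streamoff bytes = in.tellg();      // position at end == size
    in.seekg(here);                         // restore the old position
    return static_cast<long>(
        bytes / static_cast<std::streamoff>(record_size));
}
```

With a binary file of planet records, count_records(fin, sizeof pl) would give the value that the later listing obtains by counting records as it reads them.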

You can then use seekg() to go to the file beginning. Here is a section of code that opens the file, goes to the beginning, and displays the file contents:

fstream finout;     // read and write streams
finout.open(file,ios::in | ios::out |ios::binary);
//NOTE: Some UNIX systems require omitting | ios::binary
int ct = 0;
if (finout.is_open())
{
    finout.seekg(0);    // go to beginning
    cout << "Here are the current contents of the "
          << file << " file:\n";
    while (finout.read((char *) &pl, sizeof pl))
    {
        cout << ct++ << ": " << setw(20) << pl.name << ": "
        << setprecision(0) << setw(12) << pl.population
        << setprecision(2) << setw(6) << pl.g << "\n";
    }
    if (finout.eof())
        finout.clear(); // clear eof flag
    else
    {
        cerr << "Error in reading " << file << ".\n";
        exit(1);
    }
}
else
{
    cerr << file << " could not be opened -- bye.\n";
    exit(2);
}

This is similar to the start of Listing 17.19, but there are some changes and additions. First, as just described, the program uses an fstream object with a read-write mode, and it uses seekg() to position the file pointer at the start of the file. (This isn't really needed for this example, but it shows how seekg() is used.) Next, the program makes the minor change of numbering the records as they are displayed. Then it makes the following important addition:

if (finout.eof())
    finout.clear(); // clear eof flag
else
{
    cerr << "Error in reading " << file << ".\n";
    exit(1);
}

The problem is that once the program reads and displays the entire file, it sets the eofbit element. This convinces the program that it's finished with the file and disables any further reading of or writing to the file. Using the clear() method resets the stream state, turning off eofbit. Now the program can once again access the file. The else part handles the possibility that the program quit reading the file for some reason other than reaching the end-of-file, such as a hardware failure.
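The effect of omitting clear() is easy to reproduce. In this sketch, the hypothetical function can_reread() reads a stream to its end, optionally clears the state, rewinds, and reports whether another read succeeds. An istringstream stands in for the file stream; the state flags behave the same way for both.

```cpp
#include <iostream>
#include <sstream>

// Read every byte, optionally reset the stream state, rewind, and try
// to read the first byte again. Returns true if that last read works.
bool can_reread(std::istream & in, bool clear_first)
{
    char ch;
    while (in.get(ch))      // loop ends with eofbit and failbit set
        continue;
    if (clear_first)
        in.clear();         // turn the error flags off again
    in.seekg(0);            // has no effect while failbit is set
    in.get(ch);
    return !in.fail();
}
```

Without the call to clear(), the seek is ignored and the final read fails; with it, the stream returns to the beginning and reads normally.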

The next step is to identify the record to be changed and then change it. To do this, the program asks the user to enter a record number. Multiplying the number by the number of bytes in a record yields the byte number for the beginning of the record. If rec is the record number, the desired byte number is rec * sizeof pl:

cout << "Enter the record number you wish to change: ";
long rec;
cin >> rec;
eatline();              // get rid of newline
if (rec < 0 || rec >= ct)
{
    cerr << "Invalid record number -- bye\n";
    exit(3);
}
streampos place = rec * sizeof pl;  // convert to streampos type
finout.seekg(place);    // random access

The variable ct represents the number of records; the program exits if you try to go beyond the limits of the file.

Next, the program displays the current record:

finout.read((char *) &pl, sizeof pl);
cout << "Your selection:\n";
cout << rec << ": " << setw(20) << pl.name << ": "
<< setprecision(0) << setw(12) << pl.population
<< setprecision(2) << setw(6) << pl.g << "\n";
if (finout.eof())
    finout.clear();     // clear eof flag

After displaying the record, the program lets you change the record:

cout << "Enter planet name: ";
cin.get(pl.name, 20);
eatline();
cout << "Enter planetary population: ";
cin >> pl.population;
cout << "Enter planet's acceleration of gravity: ";
cin >> pl.g;
finout.seekp(place);    // go back
finout.write((char *) &pl, sizeof pl) << flush;

if (finout.fail())
{
    cerr << "Error on attempted write\n";
    exit(5);
}

The program flushes the output to guarantee that the file is updated before proceeding to the next stage.
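The seek, write, and flush steps fit naturally into small functions. The sketch below is one possible packaging, not the form used in the listing; record_place(), update_record(), and fetch_record() are hypothetical names. They accept any iostream, so you can try them on a stringstream before pointing them at an fstream.

```cpp
#include <iostream>
#include <sstream>

struct planet
{
    char name[20];       // name of planet
    double population;   // its population
    double g;            // its acceleration of gravity
};

// Byte position of record number rec, counting from 0.
std::streampos record_place(long rec)
{
    return std::streampos(static_cast<std::streamoff>(rec)
                          * static_cast<std::streamoff>(sizeof(planet)));
}

// Overwrite record rec in place and flush the change.
bool update_record(std::iostream & io, long rec, const planet & pl)
{
    io.seekp(record_place(rec));        // position the output pointer
    io.write(reinterpret_cast<const char *>(&pl), sizeof pl);
    io.flush();
    return !io.fail();
}

// Read record rec into pl.
bool fetch_record(std::iostream & io, long rec, planet & pl)
{
    io.seekg(record_place(rec));        // position the input pointer
    io.read(reinterpret_cast<char *>(&pl), sizeof pl);
    return !io.fail();
}
```

Because every record has the same size, updating record 1 leaves records 0 and 2 untouched. Neither function checks rec against the number of records, so the caller still has to do that, as the program does with ct.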

Finally, to display the revised file, the program uses seekg() to reset the file pointer to the beginning. Listing 17.20 shows the complete program. Don't forget that it assumes that a planets.dat file created using the binary.cpp program is available.

Compatibility Note

The older the implementation, the more likely it is to run afoul of the standard. Some systems don't recognize the binary flag, the fixed and right manipulators, and ios_base. Symantec C++ appends the new input instead of replacing the indicated record. Also, Symantec C++ requires replacing (twice)

while (fin.read((char *) &pl, sizeof pl))

with the following:

while (fin.read((char *) &pl, sizeof pl) && !fin.eof())

Listing 17.20 `random.cpp`

// random.cpp -- random access to a binary file
#include <iostream>     // not required by most systems
using namespace std;
#include <fstream>
#include <iomanip>
#include <cstdlib>      // (or stdlib.h) for exit()

struct planet
{
    char name[20];      // name of planet
    double population;  // its population
    double g;           // its acceleration of gravity
};

const char * file = "planets.dat";  // ASSUMED TO EXIST (binary.cpp example)
inline void eatline() { while (cin.get() != '\n') continue; }

int main()
{
    planet pl;
    cout << fixed;

// show initial contents
    fstream finout;     // read and write streams
    finout.open(file,ios::in | ios::out |ios::binary);
    //NOTE: Some UNIX systems require omitting | ios::binary
    int ct = 0;
    if (finout.is_open())
    {
        finout.seekg(0);    // go to beginning
        cout << "Here are the current contents of the "
              << file << " file:\n";
        while (finout.read((char *) &pl, sizeof pl))
        {
            cout << ct++ << ": " << setw(20) << pl.name << ": "
            << setprecision(0) << setw(12) << pl.population
            << setprecision(2) << setw(6) << pl.g << "\n";
        }
        if (finout.eof())
            finout.clear(); // clear eof flag
        else
        {
            cerr << "Error in reading " << file << ".\n";
            exit(1);
        }
    }
    else
    {
        cerr << file << " could not be opened -- bye.\n";
        exit(2);
    }

// change a record
    cout << "Enter the record number you wish to change: ";
    long rec;
    cin >> rec;
    eatline();              // get rid of newline
    if (rec < 0 || rec >= ct)
    {
        cerr << "Invalid record number -- bye\n";
        exit(3);
    }
    streampos place = rec * sizeof pl;  // convert to streampos type
    finout.seekg(place);    // random access
    if (finout.fail())
    {
        cerr << "Error on attempted seek\n";
        exit(4);
    }

    finout.read((char *) &pl, sizeof pl);
    cout << "Your selection:\n";
    cout << rec << ": " << setw(20) << pl.name << ": "
    << setprecision(0) << setw(12) << pl.population
    << setprecision(2) << setw(6) << pl.g << "\n";
    if (finout.eof())
        finout.clear();     // clear eof flag

    cout << "Enter planet name: ";
    cin.get(pl.name, 20);
    eatline();
    cout << "Enter planetary population: ";
    cin >> pl.population;
    cout << "Enter planet's acceleration of gravity: ";
    cin >> pl.g;
    finout.seekp(place);    // go back
    finout.write((char *) &pl, sizeof pl) << flush;
    if (finout.fail())
    {
        cerr << "Error on attempted write\n";
        exit(5);
    }

// show revised file
    ct = 0;
    finout.seekg(0);            // go to beginning of file
    cout << "Here are the new contents of the " << file
         << " file:\n";
    while (finout.read((char *) &pl, sizeof pl))
    {
        cout << ct++ << ": " << setw(20) << pl.name << ": "
              << setprecision(0) << setw(12) << pl.population
               << setprecision(2) << setw(6) << pl.g << "\n";
    }
    finout.close();
    cout << "Done.\n";
    return 0;
}

Here's a sample run based on a planets.dat file that has had a few more entries added since you last saw it:

Here are the current contents of the planets.dat file:
0:                Earth:   5333000000  9.81
1:        Bill's Planet:     23020020  8.82
2:              Trantor:  58000000000 15.03
3:              Trellan:      4256000  9.62
4:            Freestone:   3845120000  8.68
5:            Taanagoot:    350000002 10.23
6:                Marin:       232000  9.79
Enter the record number you wish to change: 2
Your selection:
2:              Trantor:  58000000000 15.03
Enter planet name: Trantor
Enter planetary population: 59500000000
Enter planet's acceleration of gravity: 10.53
Here are the new contents of the planets.dat file:
0:                Earth:   5333000000  9.81
1:        Bill's Planet:     23020020  8.82
2:              Trantor:  59500000000 10.53
3:              Trellan:      4256000  9.62
4:            Freestone:   3845120000  8.68
5:            Taanagoot:    350000002 10.23
6:                Marin:       232000  9.79
Done.

Using the techniques in this program, you can extend it to allow you to add new material and delete records. If you were to expand the program, it would be a good idea to reorganize it by using classes and functions. For example, you could convert the planet structure to a class definition; then overload the << insertion operator so that cout << pl displays the class data members formatted as in the example.
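As a starting point for that reorganization, here is a sketch of an overloaded << operator. It keeps planet as a plain structure rather than converting it to a class, and it restores the caller's formatting flags and precision afterward, a courtesy the listing doesn't need because it formats everything the same way.

```cpp
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

struct planet
{
    char name[20];       // name of planet
    double population;   // its population
    double g;            // its acceleration of gravity
};

// Display one planet using the same field widths as the listing.
std::ostream & operator<<(std::ostream & os, const planet & pl)
{
    // Save the caller's settings so the operator has no side effects.
    std::ios_base::fmtflags old_flags = os.flags();
    std::streamsize old_prec = os.precision();

    os << std::fixed << std::right
       << std::setw(20) << pl.name << ": "
       << std::setprecision(0) << std::setw(12) << pl.population
       << std::setprecision(2) << std::setw(6) << pl.g;

    os.flags(old_flags);
    os.precision(old_prec);
    return os;
}
```

The three display loops in the program could then shrink to a statement of the form cout << ct++ << ": " << pl << "\n";.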

Real World Note: Working with Temporary Files

Applications often need temporary files whose lifetimes are transient and must be controlled by the program. In C++ it is easy to create a temporary file, copy the contents of another file into it, and delete it when you are done. First, you need a naming scheme for your temporary files. How can you ensure that each file is assigned a unique name? The tmpnam() standard function declared in cstdio has you covered.

char* tmpnam( char* pszName );

The tmpnam() function creates a temporary name and places it in the C-style string pointed to by pszName. The constants L_tmpnam and TMP_MAX, both defined in cstdio, limit, respectively, the number of characters in the filename and the maximum number of times tmpnam() can be called without generating a duplicate filename in the current directory. The following sample generates ten temporary names.

#include <cstdio>
#include <iostream>
using namespace std;

int main()
{
  cout << "This system can generate up to " << TMP_MAX
       << " temporary names of up to " << L_tmpnam
       << " characters.\n";
  char pszName[ L_tmpnam ] = {'\0'};
  cout << "Here are ten names:\n";
  for( int i=0; 10 > i; i++ )
  {
    tmpnam( pszName );
    cout << pszName << endl;
  }
  return 0;
}

More generally, using tmpnam(), we can now generate TMP_MAX unique filenames with up to L_tmpnam characters per name.

Incore Formatting

The iostream family supports I/O between the program and a terminal. The fstream family uses the same interface to provide I/O between a program and a file. The C++ library also provides an sstream family that uses the same interface to provide I/O between a program and a string object. That is, you can use the same ostream methods you've used with cout to write formatted information into a string object, and you can use istream methods such as getline() to read information from a string object. The process of reading formatted information from a string object or of writing formatted information to a string object is termed incore formatting. Let's take a brief look at these facilities. (The sstream family of string support supersedes a strstream.h family of char-array support.)

The sstream header file defines an ostringstream class derived from the ostream class. (There also is a wostringstream class based on wostream for wide character sets.) If you create an ostringstream object, you can write information to it, which it stores. You can use the same methods with an ostringstream object that you can with cout. That is, you can do something like the following:

ostringstream outstr;
double price = 55.00;
const char * ps = " for a copy of the draft C++ standard!";
outstr.precision(2);
outstr << fixed;
outstr << "Pay only $" << price << ps << endl;

The formatted text goes into a buffer, and the object uses dynamic memory allocation to expand the buffer size as needed. The ostringstream class has a member function, called str(), which returns a string object initialized to the buffer's contents:

string mesg = outstr.str();    // returns string with formatted information

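Unlike the older strstream classes, in which str() "froze" the buffer, the ostringstream version of str() returns a copy of the buffer's contents, so you can keep writing to the stream after calling it.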
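Incore formatting is handy when you want a formatted string as a function result instead of as screen output. The sketch below wraps the preceding fragment in a hypothetical helper, price_message(), that creates its own ostringstream on each call and hands back the string it builds.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Format a price with a fixed number of decimal places and return the
// resulting message as a string.
std::string price_message(double price, int decimals)
{
    std::ostringstream outstr;
    outstr << std::fixed << std::setprecision(decimals)
           << "Pay only $" << price;
    return outstr.str();    // the formatted text accumulated so far
}
```

For example, price_message(55.00, 2) returns the string "Pay only $55.00". Because the stream is local to the function, its formatting settings never leak into cout.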

Listing 17.21 provides a short example.

Listing 17.21 `strout.cpp`

// strout.cpp -- incore formatting (output)
#include <iostream>
using namespace std;
#include <sstream>
#include <string>
int main()
{
    ostringstream outstr;   // manages a string stream

    string hdisk;
    cout << "What's the name of your hard disk? ";
    getline(cin, hdisk);
    int cap;
    cout << "What's its capacity in GB? ";
    cin >> cap;
    // write formatted information to string stream
    outstr << "The hard disk " << hdisk << " has a capacity of "
            << cap << " gigabytes.\n";
    string result = outstr.str();   // save result
    cout << result;                 // show contents

    return 0;
}

Here's a sample run:

What's the name of your hard disk? Spinar
What's its capacity in GB? 72
The hard disk Spinar has a capacity of 72 gigabytes.

The istringstream class lets you use the istream family of methods to read data from an istringstream object, which can be initialized from a string object. Suppose facts is a string object. To create an istringstream object associated with this string, do the following:

istringstream instr(facts);     // use facts to initialize stream

Then you use istream methods to read data from instr. For example, if instr contained a bunch of integers in character format, you could read them as follows:

int n;
int sum = 0;
while (instr >> n)
    sum += n;
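The loop stops either when the data runs out or when it meets something that isn't an integer, and it can be useful to know which. Here is a sketch of a complete function built on the same loop; sum_ints() and its clean parameter are hypothetical names, and the end-of-data test is a rough check rather than a full validator.

```cpp
#include <sstream>
#include <string>

// Add up the whitespace-separated integers in facts. Sets clean to true
// if the loop ended by reaching the end of the data, false if it
// stopped at input that could not be read as an integer.
int sum_ints(const std::string & facts, bool & clean)
{
    std::istringstream instr(facts);
    int n;
    int sum = 0;
    while (instr >> n)
        sum += n;
    clean = instr.eof();
    return sum;
}
```

For the string "1 2 3", the function returns 6 and sets clean to true. For "1 2 x 4", it returns 3 and sets clean to false, because extraction stops at the x.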

Listing 17.22 uses the overloaded >> operator to read the contents of a string one word at a time.

Listing 17.22 `strin.cpp`

// strin.cpp -- formatted reading from a string
#include <iostream>
using namespace std;
#include <sstream>
#include <string>
int main()
{
    string lit = "It was a dark and stormy day, and "
                 " the full moon glowed brilliantly. ";
    istringstream instr(lit);   // use lit for input
    string word;
    while (instr >> word)       // read a word at a time
        cout << word << endl;
    return 0;
}

Here is the program output:

It
was
a
dark
and
stormy
day,
and
the
full
moon
glowed
brilliantly.

In short, istringstream and ostringstream classes give you the power of the istream and ostream class methods to manage character data stored in strings.

What Now?

If you have worked your way through this book, you should have a good grasp of the rules of C++. However, that's just the beginning in learning this language. The second stage is learning to use the language effectively, and that is the longer journey. The best situation to be in is a work or learning environment that brings you into contact with good C++ code and programmers. Also, now that you know C++, you can read books that concentrate on more advanced topics and upon object-oriented programming. Appendix H, "Selected Readings," lists some of these resources.

One promise of OOP is to facilitate the development and enhance the reliability of large projects. One of the essential activities of the OOP approach is to invent the classes that represent the situation (called the problem domain) that you are modeling. Because real problems often are complex, finding a suitable set of classes can be challenging. Creating a complex system from scratch usually doesn't work; instead, it's best to take an iterative, evolutionary approach. Toward this end, practitioners in the field have developed several techniques and strategies. In particular, it's important to do as much of the iteration and evolution in the analysis and design stages as possible instead of writing and rewriting actual code.

Two common techniques are use-case analysis and CRC cards. In use-case analysis, the development team lists the common ways, or scenarios, in which they expect the final system to be used, identifying elements, actions, and responsibilities that suggest possible classes and class features. CRC (short for Class/Responsibilities/Collaborators) cards are a simple way to analyze such scenarios. The development team creates an index card for each class. On the card are the class name, class responsibilities, such as data represented and actions performed, and class collaborators, such as other classes with which the class must interact. Then the team can walk through a scenario, using the interface provided by the CRC cards. This can lead to suggesting new classes, shifts of responsibility, and so on.

On a larger scale are the systematic methods for working on entire projects. The most recent of these is the Unified Modeling Language, or UML. This is not a programming language; rather, it is a language for representing the analysis and design of a programming project. It was developed by Grady Booch, Jim Rumbaugh, and Ivar Jacobson, who had been the primary developers of three earlier modeling languages: the Booch Method, OMT (Object Modeling Technique), and OOSE (Object-Oriented Software Engineering), respectively. UML is the evolutionary successor of these three.

In addition to increasing your understanding of C++ in general, you might want to learn about specific class libraries. Microsoft, Borland, and Metrowerks, for example, offer extensive class libraries to facilitate programming for the Windows environment, and Metrowerks offers similar facilities for Macintosh programming.

Summary

A stream is a flow of bytes into or out of a program. A buffer is a temporary holding area in memory that acts as an intermediary between a program and a file or other I/O device. Information can be transferred between a buffer and a file in large chunks of the size most efficiently handled by devices like disk drives. Information can also be transferred between a buffer and a program in a byte-by-byte flow that often is more convenient for the processing done in a program. C++ handles input by connecting a buffered stream to a program and to its source of input. Similarly, C++ handles output by connecting a buffered stream to a program and to its output target. The iostream and fstream files constitute an I/O class library that defines a rich set of classes for managing streams. C++ programs that include the iostream file automatically open eight streams, managing them with eight objects. The cin object manages the standard input stream, which, by default, connects to the standard input device, typically a keyboard. The cout object manages the standard output stream, which, by default, connects to the standard output device, typically a monitor. The cerr and clog objects manage unbuffered and buffered streams, respectively, connected to the standard error device, typically a monitor. These four objects have four wide-character counterparts named wcin, wcout, wcerr, and wclog.

The I/O class library provides a variety of useful methods. The istream class defines versions of the extraction operator (>>) that recognize all the basic C++ types and that convert character input to those types. The get() family of methods and the getline() method provide further support for single-character input and for string input. Similarly, the ostream class defines versions of the insertion operator (<<) that recognize all the basic C++ types and that convert them to suitable character output. The put() method provides further support for single-character output. The wistream and wostream classes provide similar support for wide characters.

You can control how a program formats output by using ios_base class methods and by using manipulators (functions that can be concatenated with insertion) defined in the iostream and iomanip files. These methods and manipulators let you control the number base, the field width, the number of decimal places displayed, the system used to display floating-point values, and other elements.

The fstream file provides class definitions that extend the iostream methods to file I/O. The ifstream class derives from the istream class. By associating an ifstream object with a file, you can use all the istream methods for reading the file. Similarly, associating an ofstream object with a file lets you use the ostream methods to write to a file. And associating an fstream object with a file lets you employ both input and output methods with the file.

To associate a file with a stream, you can provide the filename when initializing a file stream object or you can first create a file stream object and then use the open() method to associate the stream with a file. The close() method terminates the connection between a stream and a file. The class constructors and the open() method take an optional second argument that provides the file mode. The file mode determines such things as whether the file is to be read and/or written to, whether opening a file for writing truncates it or not, whether attempting to open a nonexistent file is an error or not, and whether to use the binary or text mode.

A text file stores all information in character form. For example, numeric values are converted to character representations. The usual insertion and extraction operators, along with get() and getline(), support this mode. A binary file stores all information using the same binary representation the computer uses internally. Binary files store data, particularly floating-point values, more accurately and compactly than text files, but they are less portable. The read() and write() methods support binary input and output.

The seekg() and seekp() functions provide C++ random access for files. These class methods let you position a file pointer relative to the beginning of a file, relative to the end, or relative to the current position. The tellg() and tellp() methods report the current file position.

The sstream header file defines istringstream and ostringstream classes that let you use istream and ostream methods to extract information from a string and to format information placed into a string.

Review Questions

1:	What role does the `iostream` file play in C++ I/O?
2:	Why does typing a number such as 121 as input require a program to make a conversion?
3:	What's the difference between the standard output and the standard error?
4:	Why is cout able to display various C++ types without being provided explicit instructions for each type?
5:	What feature of the output method definitions allows you to concatenate output?
6:	Write a program that requests an integer and then displays it in decimal, octal, and hexadecimal form. Display each form on the same line in fields that are 15 characters wide, and use the C++ number base prefixes.
7:	Write a program that requests the information shown below and that formats it as shown:

Enter your name: Billy Gruff
Enter your hourly wages: 12
Enter number of hours worked: 7.5
First format:
Billy Gruff: $ 12.00: 7.5
Second format:
Billy Gruff : $12.00 :7.5

8:	Consider the following program:

//rq17-8.cpp
#include <iostream>
using namespace std;
int main()
{
    char ch;
    int ct1 = 0;
    cin >> ch;
    while (ch != 'q')
    {
        ct1++;
        cin >> ch;
    }
    int ct2 = 0;
    cin.get(ch);
    while (ch != 'q')
    {
        ct2++;
        cin.get(ch);
    }
    cout << "ct1 = " << ct1 << "; ct2 = " << ct2 << "\n";
    return 0;
}

What does it print, given the following input:

I see a q<Enter>
I see a q<Enter>

Here `<Enter>` signifies pressing the Enter key.

9:	Both of the following statements read and discard characters up to and including the end of a line. In what way does the behavior of one differ from that of the other?

while (cin.get() != '\n')
    continue;
cin.ignore(80, '\n');

Programming Exercises

Write a program that counts the number of characters up to the first $ in input and that leaves the $ in the input stream.

Write a program that copies your keyboard input (up to simulated end-of-file) to a file named on the command line.

Write a program that copies one file to another. Have the program take the filenames from the command line. Have the program report if it cannot open a file.

Write a program that opens two text files for input and one for output. The program concatenates the corresponding lines of the input files, using a space as a separator, and writing the results to the output file. If one file is shorter than the other, the remaining lines in the longer file are also copied to the output file. For example, suppose the first input file has these contents:

eggs kites donuts
balloons hammers
stones

And suppose the second input file has these contents:

zero lassitude
finance drama

Then the resulting file would have these contents:

eggs kites donuts zero lassitude
balloons hammers finance drama
stones

Mat and Pat want to invite their friends to a party, much as they did in Chapter 15, "Friends, Exceptions, and More," Programming Exercise 5, except now they want a program that uses files. They ask you to write a program that does the following:

Reads a list of Mat's friends' names from a text file called mat.dat, which lists one friend per line. The names are stored in a container and then displayed in sorted order.
Reads a list of Pat's friends' names from a text file called pat.dat, which lists one friend per line. The names are stored in a container and then displayed in sorted order.
Merges the two lists, eliminating duplicates, and stores the result in the file matnpat.dat, one friend per line.

Consider the class definitions of Programming Exercise 13.5. If you haven't yet done that exercise, do so now. Then do the following:

Write a program that uses standard C++ I/O and file I/O in conjunction with data of types employee, manager, fink, and highfink, as defined in Programming Exercise 13.5. The program should be along the general lines of Listing 17.17 in that it should let you add new data to a file. The first time through, the program should solicit data from the user, then show all the entries, then save the information in a file. On subsequent uses, the program should first read and display the file data, then let the user add data, then show all the data. One difference is that data should be handled by an array of pointers to type employee. That way, a pointer can point to an employee object or to objects of any of the three derived types. Keep the array small to facilitate checking the program:

const int MAX = 10;     // no more than 10 objects
...
employee * pc[MAX];

For keyboard entry, the program should use a menu to offer the user the choice of which type of object to create. The menu should use a switch statement that uses new to create an object of the desired type and assigns the object's address to a pointer in the pc array. Then that object can use the virtual setall() function to elicit the appropriate data from the user:

pc[i]->setall();  // invokes function corresponding to type of object

To save the data to a file, devise a virtual writeall() function for that purpose:

for (i = 0; i < index; i++)
    pc[i]->writeall(fout);// fout ofstream connected to output file

Note

Use text I/O, not binary I/O, for this exercise. (Unfortunately, objects of classes that have virtual functions contain pointers to tables of pointers to virtual functions, and write() copies this information to a file. An object filled by using read() from the file gets meaningless values for those pointers, which corrupts the behavior of virtual functions.) Use a newline to separate each data field from the next; this makes it easier to identify fields on input. Alternatively, you could still use binary I/O but not write objects as a whole. Instead, you could provide class methods that apply the write() and read() functions to each class member individually. That way, the program can save just the intended data to a file.

The tricky part is recovering the data from the file. The problem is, how can the program know whether the next item to be recovered is an employee object, a manager object, a fink type, or a highfink type? One approach is, when writing the data for an object to a file, precede the data with an integer indicating the type of object to follow. Then, on file input, the program can read the integer, then use a switch to create the appropriate object to receive the data:

enum classkind{Employee, Manager, Fink, Highfink}; // in class header
...
int classtype;
while((fin >> classtype).get(ch)){ // newline separates int from data
    switch(classtype) {
        case Employee  : pc[i] = new employee;
                         break;
        ...

Then you can use the pointer to invoke a virtual getall() function to read the information:

pc[i++]->getall();

CONTENTS

Chapter 17. INPUT, OUTPUT, AND FILES

An Overview of C++ Input and Output

Streams and Buffers

Figure 17.1. C++ input and output.

Figure 17.2. A stream with a buffer.

Streams, Buffers, and the iostream File

Figure 17.3. Some I/O classes.

Redirection

Output with cout

The Overloaded << Operator

Output and Pointers

Output Concatenation

Figure 17.4. Output concatenation.

The Other ostream Methods

Listing 17.1 write.cpp

Flushing the Output Buffer

Formatting with cout

Listing 17.2 defaults.cpp

Changing the Number Base Used for Display

Listing 17.3 manip.cpp

Adjusting Field Widths

Listing 17.4 width.cpp

Fill Characters

Listing 17.5 fill.cpp

Setting Floating-Point Display Precision

Listing 17.6 precise.cpp

Printing Trailing Zeros and Decimal Points

Listing 17.7 showpt.cpp

More About setf()

Table 17.1. Formatting Constants

Listing 17.8 setf.cpp

Table 17.2. Arguments for setf(long, long)

Listing 17.9 setf2.cpp

Standard Manipulators

Table 17.3. Some Standard Manipulators

The iomanip Header File

Listing 17.10 iomanip.cpp

Table 17.4. Formatting Changes

Input with cin

How cin >> Views Input

Figure 17.5. cin >> skips over whitespace.

Listing 17.11 check_it.cpp

Stream States

Table 17.5. Stream States

Setting States

I/O and Exceptions

Listing 17.12 cinexcp.cpp

Stream State Effects

Other istream Class Methods

Single-Character Input

Table 17.6. cin.get(ch) Versus cin.get()

Which Form of Single-Character Input?

String Input: getline(), get(), and ignore()

Listing 17.13 get_fun.cpp

Unexpected String Input

Table 17.7. Changes in Input Behavior

Other istream Methods

Listing 17.14 peeker.cpp

Program Notes

Listing 17.15 truncate.cpp

File Input and Output

Simple File I/O

Listing 17.16 file.cpp

Opening Multiple Files

Command-Line Processing

Listing 17.17 count.cpp

Stream Checking and is_open()

File Modes

Table 17.8. File Mode Constants

Figure 17.6. Some file-opening modes.

Table 17.9. C++ and C File-Opening Modes

Appending to a File

Listing 17.18 append.cpp

Binary Files

Figure 17.7. Binary and text representation of a floating-point number.

Listing 17.19 binary.cpp

Random Access

Listing 17.20 random.cpp

Incore Formatting

Listing 17.21 strout.cpp

Listing 17.22 `strin.cpp`