A *data object* (or just *object*) is a sequence of bytes to be manipulated as a unit. For now, "object" is just a technical name for a variable. When we discuss pointers we will see that there are other objects besides variables.

We start our consideration of reliable data structures with the low-level objects, the individual scalars. In this book, a *scalar* object is one having floating, integer, or pointer type.

Our search for non-surprising programs begins with the math library and the floating-point numbers that it supports. These are some of the more common math functions:

double ceil(x) - smallest integer not less than x

double cos(x) - cosine of x

double exp(x) - exponential function of x

double floor(x) - largest integer not greater than x

double log(x) - natural log of x

double log10(x) - base-10 log of x

double pow(x, y) - raise x to the power y

double sin(x) - sine of x

double sqrt(x) - square root of x

double tan(x) - tangent of x

All of these functions deal with floating-point numbers (of the size specified by double), which are only binary approximations of a "true" real number. Also, each library function that produces a fractional result may introduce some small error in the result. For example, in "real" numbers, multiplying the square root of x by itself should produce exactly the value of x. The actual results typically show a small error in the low-order bits, something like this [2-1]:

sqrtx.c:

/* sqrttst - test accuracy of sqrt
 */
#include "local.h"

main()
    {
    double x, y;

    for (x = 1.; x <= 25.; x = x + 1.)
        {
        y = sqrt(x) * sqrt(x);
        if (y != x)
            printf("%5.1f %21.17f %10.2e\n", x, y, x-y);
        }
    }

sqrtx.out:

 2.0   2.00000000000000010  -5.55e-17
 3.0   3.00000000000000010  -5.55e-17
 7.0   6.99999999999999990   2.22e-16
 8.0   8.00000000000000020  -2.22e-16
10.0  10.00000000000000000  -2.22e-16
12.0  12.00000000000000000  -2.22e-16
15.0  15.00000000000000000  -2.22e-16
17.0  17.00000000000000100  -4.44e-16
18.0  18.00000000000000000   4.44e-16
21.0  21.00000000000000000   4.44e-16
22.0  22.00000000000000000   4.44e-16
23.0  22.99999999999999900   8.88e-16

Another aspect of floating representation concerns conversion from floating-point to integer. When a positive floating-point value is converted to an integer value, the fractional digits are lost, i.e., *truncated* [2-2]. When a negative floating-point value is converted, the truncation may be toward or away from zero, depending on implementation.

**Rule 2-1:** When exactness counts in converting floating-point to integer, be sure the value being converted is non-negative.

Another type of rounding behavior is needed when using floating-point numbers to represent decimal fractions such as currency. It is possible to use floating representations for small amounts of currency, such as six figures with two decimal places; errors in the low-order bits will not produce a noticeable error in addition or subtraction. However, when arbitrary fractions are computed, they must be rounded to the proper number of decimal places. For example, consider the computation of monthly payment on a mortgage [Plum, 1983, 3-45]:

pmt = bal * (intmo / (1. - pow(1. + intmo, (double)-npmts)));

With balance (bal) equal to $82,500.00, monthly interest (intmo) equal to 16.25% / 12, and number of payments (npmts) equal to 25 times 12 (i.e., 300), the C computation produces a result (pmt) equal to 1137.29654996 (showing just eight places). Since this result is about 35/100 of a cent away from an exact number of pennies, it will not take many additions or subtractions before the intermediate results start to differ noticeably from true figures.

round:

/* round - adjust floating dollars to even pennies */
#include "local.h"

double round(dollars)
double dollars;
    {
    double pennies;

    pennies = floor(dollars * 100 + .5);
    return (pennies / 100.);
    }

roundup:

/* roundup - adjust floating dollars to even pennies
 * round all fractional pennies upward
 */
#include "local.h"

double roundup(dollars)
double dollars;
    {
    double pennies;

    pennies = ceil(dollars * 100.);
    return (pennies / 100.);
    }

Using roundup(1137.2925) produces 1137.30.

These observations on representation and rounding are a few of the simpler things that can be said about avoiding surprises with floating-point data. If you have uses for floating-point arithmetic that are more complex than these simple examples, consult a text on numerical analysis such as Hamming [1973].

We turn now from errors of representation to errors of computation. Most of the math functions can fail, if they are given arguments which do not produce a proper result. For example, the square root of a negative number, such as sqrt(-1.0), has no meaning in "real" arithmetic. This is known as a *domain error*; the argument is outside the domain over which the function is defined. For another example, ten raised to the one-millionth power, pow(10., 1e6), cannot be represented in any (current) floating representation. This is known as a *range error*; the result cannot be represented as a proper double number. In all such error cases, the function will return some value, but the value returned will not be the "correct" value [2-3]. The sqrt function returns zero, when given a negative argument. The pow function returns a very large number (appropriately positive or negative) on an "overflow" range error. This "very large number" is defined in <math.h> as HUGE_VAL (or in some versions, HUGE).

I would advise you against testing for errors by comparing the returned value against HUGE_VAL (or -HUGE_VAL, or zero), for several reasons. First of all, these are in general valid (albeit unlikely) data values. Secondly, making such tests requires detailed knowledge of the various error returns for each math function - knowledge which is otherwise useless. Also, there are three different possibilities - -HUGE_VAL, 0, and HUGE_VAL - and you must know which are possible in each case. Finally, different versions of the library have differed in their error-return behavior. The portable approach is to clear errno before the computation and test it afterward:

errno = 0;
perform computation
if (errno != 0)
    handle the error situation: report, compute differently, etc.
else
    computation was error-free

**Rule 2-2:** Test errno before using results from the math functions.

(See the appendix for a list of math functions in the library of recent C compilers [2-4].)

One important issue in "no surprises" programming is how to handle errors encountered during computation. To be general about it, we define "error" as any indication that the program cannot produce the expected result. Associated with each possible error is a "restart point," the place at which some action is taken so that the process can continue. The method of handling errors often depends upon whether the program is intended for use in a "stand-alone" environment, where execution must continue no matter what, or for use under an operating system, where a user is available to re-execute the command.

Throughout the rest of the book, we will refer to these categories of error-handling:

*Topics assumed from previous book:* <ctype.h> [3.5].

In Chapter 1, a *safe* macro function was defined as one which evaluates each argument exactly once, and we observed that safe macro functions can reliably be used as replacements for function calls. This approach to the macro-versus-function tradeoff is most simply illustrated by the "character-tests" in <ctype.h>. These have usually been implemented as safe macros. ANSI C requires that they also have true function versions in the library. As mentioned in Chapter 1, the true function version of, for example, isdigit can be obtained by using

#undef isdigit

(See the appendix for a list of the character-test functions from the library of recent C compilers [2-5].)

These functions are the most portable way of testing and converting characters, so their use is encouraged.

**Rule 2-3:** Use the <ctype.h> facilities for character tests and upper-lower conversions.

*Topics assumed from previous book:* Semi-Boolean tests [3.2], bool [3.21], tbool [3.21].

C programmers often write semi-Boolean tests such as

if (c) instead of if (c != '\0') and

if (p) instead of if (p != NULL)

There is no problem with portability here; nul character and NULL pointer are always guaranteed to be equal to zero in comparisons. However, the C syntax does have a potential for one common bug:

if (i = j)    /* either wrong or misleading */

is syntactically valid; it means the same as

if ((i = j) != 0)

From the human reader's point of view, the program will be easier to understand if Boolean variables are clearly indicated as such. In the type scheme of *C Programming Guidelines* [Plum, 1984], two defined-types are provided for Boolean variables:

bool designates an int-sized Boolean variable, and

tbool designates a char-sized Boolean variable.

With an eye toward future automated assistance, we propose the following reliability rules:

**Rule 2-4:** Make sure that Boolean variables are assigned the values zero and one. This means that the type tbool is always adequate, and if this rule is part of local standards, the types bool and tbool could be made synonymous.

**Rule 2-5:** Make sure that each test condition is Boolean, involving only Boolean types or relational and logical operators.

In most modern compilers, there is no efficiency penalty for saying

while (*s1++ != '\0')

instead of

while (*s1++)    /* not strictly Boolean */

Recent C compilers provide *enumeration types*, which allow declarations like this:

enum stoplight {red, yellow, green};

This establishes enum stoplight as a type which can be used in further declarations, such as

enum stoplight main_st, front_st;

Specific values can be given to the enumeration constants:

enum stoplight {red = 'r', yellow = 'y', green = 'g'};

and a variable of the type can be assigned any of its constants:

main_st = red;

Failure to attend to proper ranges of variables can lead to interesting reliability problems. One example from the field of computerized aircraft navigation: a heading variable was specified to be measured in degrees. But it was not discovered until airborne system test that one part of the system assumed that heading ranged from 0 to 360, while another part assumed that the range was -180 to 180.

For a concise representation of ranges in comments, we can use {low:high}. (Using a "two-dot" notation like "low..high" might seem more natural, but it becomes confusing when the range values are floating-point constants with decimal points.) The specification of a range comment could look like this:

double heading; /* degrees: {0:360} */

asserts(0 <= heading && heading <= 360, "heading range ok");

Tests of this form are common enough to deserve a macro:

#define IN_RANGE(n, lo, hi) ((lo) <= (n) && (n) <= (hi))

which would allow the assertion to be given in this form:

asserts(IN_RANGE(heading, 0, 360), "heading range ok");

In C, it is often appropriate to consider the "one-too-far" value as part of the range of a variable. Consider, for example, a variable month_no which is a subscript into an array of monthly information. We might consider that its range would be from 0 through 11, but such variables are often used as loop indexes in contexts like this:

for (month_no = 0; month_no < 12; ++month_no)
    process array[month_no];

The discussion of ranges would not be complete without considering the sizes of objects and the ranges of subscripts. C is moving toward the ability to handle objects whose size cannot be represented in an unsigned int; ANSI C envisions that some environments will need to use unsigned long to represent the sizeof anything. Several of the standard headers will define a type named size_t, which will be either unsigned int or unsigned long, according to the environment. Hence, portable code should use size_t for the sizes of objects and for the subscripts that index them.

For an example of the use of size_t, consider the reverse function, which reverses the characters in a string. To be applicable to any string in any environment, the function needs to use a subscript of type size_t:

reverse:

/* reverse - reverse the order of a string */
#include "local.h"

void reverse(s)
char s[];
    {
    char t;
    size_t i, j;

    if ((j = strlen(s)) == 0)
        return;
    for (i = 0, j = j - 1; i < j; ++i, --j)
        SWAP(s[i], s[j], t);
    }

A related question concerns how a subscript variable should be declared. The lowest subscript of an array is always zero in C, but the highest subscript raises new complications in recent versions of C. In K&R [Kernighan and Ritchie, 1978], subscripts were always representable as int values, but some recent implementations allow declaration of arrays with more elements than the largest (signed) int value. For one example, the "large memory model" of the 8086 family of machines allows for objects larger than 32K elements, but the int is still 16 bits. Thus, in general, portable code (especially library functions callable in any context) must assume that a subscript could require an unsigned representation. Fortunately for portability, there is always a valid "one-too-far" value greater than the largest possible subscript, because in zero-origin indexing, the largest subscript is always one less than the largest dimension value for the array. However, a variable capable of holding the largest subscript must be an unsigned variety of integer, which eliminates the possibility of a "one-too-far" value at the low end.

*Topics assumed from previous book:* ones complement [2.1], twos complement [2.1], overflow [3.20].

In this section, all our examples will be easier to describe if we assume that int has 16 bits, but the principles are the same for any size of int. Most C machines use twos complement arithmetic, in which the smallest int has the value -32768, and the largest int has the value 32767. (These values are named INT_MIN and INT_MAX, respectively, in <limits.h>, as described in Section 1.1.) On a hypothetical 16-bit ones complement machine, INT_MIN would be -32767 and INT_MAX would be 32767. Notice that the very smallest twos-complement integer has no counterpart in ones complement, nor does it have a corresponding positive integer in either system.

While twos complement is the most popular usage in the C world, programmers aiming for the widest portability must consider various problems regarding ones complement, even if their current machine does not use it. More precisely, these are reliability problems for portable code, in that they can cause nasty surprises if not attended to. We shall see several such problems in this section.

The int values can be pictured as a number line:

    +- ... -+-+-+-+- ... -+
    INT_MIN  0 1 2   INT_MAX

A signed difference, such as a difference of temperatures, can fall anywhere on this line:

int temp0, temp1, delta_temp;

delta_temp = temp1 - temp0;

A free-running timer, by contrast, is naturally handled with unsigned arithmetic:

unsigned int delta_time, timer0, timer1;

delta_time = timer1 - timer0;

Assuming that our integers are 16-bit values, the largest unsigned int value (named UINT_MAX in <limits.h>) has the value 65535. If the value of timer0 is 12, say, and the value of timer1 (sampled some time later) is 10, then delta_time will have the value 65534; i.e., timer1 was sampled 65534 ticks later than timer0. By the definition of unsigned arithmetic, no overflow is possible in the computation. The "wrap-around" that can occur is not considered an error, whereas overflow is an error. From the formal point of view, any result of the subtraction is a well-defined result. From the practical point of view, the result corresponds to the actual delta_time only if the true value is less than or equal to UINT_MAX (65535, in this environment). This "timer" model corresponds closely to the underlying semantics of C, in that the difference between a later time and an earlier time is always a positive number.

Notice that it would be an error to assign the unsigned int value of delta_time to an int variable; all the values greater than INT_MAX would exceed the range of the int. Such an assignment is not a syntax error, however; the compiler would map a very large value into some negative integer value. The details of the mapping will be discussed shortly.

Programmers often use unsigned arithmetic as if it modeled the "positive number line," like the measurement of weight:

unsigned int weight0, weight1;

int delta_weight;

delta_weight = weight1 - weight0;

In this sort of usage, the difference between the two unsigned numbers is conceptually a signed number, according to whether one weight is greater or lesser than the other. In twos complement arithmetic, this computation is quite well-behaved and unobjectionable. The unsigned difference between weight1 and weight0 when converted into a signed number produces just the right result, as long as the "real" result is between INT_MIN and INT_MAX, inclusive.

In ones complement, however, there is an uncertainty about the conversion. Assigning the unsigned int value to the (signed) int variable can be done in any way that the compiler writer has chosen. The choice may have been to model the twos-complement behavior, in which case one possible result (-32768) is unrepresentable. (We will call this the "twos-complement mimic" treatment.)

Alternatively, the compiler writer may have chosen to simply assign the same bit representation, in which case all negative results will be "one greater" than expected. (We call this the "raw-bits" treatment.) A UI_TO_I macro can make the conversion explicit; on a ones-complement machine (with the 16-bit int assumed in this section, whose sign bit is 0x8000), it must produce the twos-complement-mimic result itself:

#define UI_TO_I(n) (((n) & 0x8000) == 0 ? (n) : -(n) - 1)

In a twos-complement environment, all that UI_TO_I has to do is to cast the result to int:

#define UI_TO_I(n) (int)(n)

The problems arising from subtraction of unsigned int values are not restricted to obscure portability problems regarding ones-complement systems. If they were, most programmers could safely ignore them. But as these examples show, the difference between unsigned values has a built-in duality of interpretation. The situation is as if C had a special type available for unsigned difference; let us call it "questionably signed," or qsigned, to invent a C-like notation. A qsigned number is a value on a number line with two different, coexisting value labelings:

          UINT_MAX  0 1 2      INT_MAX    [unsigned interpretation]
    +- ... --+------+-+-+- ... ---+
    INT_MIN? ...-1  0 1 2      INT_MAX    [signed (2s-comp) interpretation]

So far, this is just an explanation of the underlying meaning of well-behaved computations. However, if a qsigned int value is assigned to a long int variable (in an environment where int is smaller than long), and no explicit conversion is used, there is an intrinsic confusion. In the examples above, one could be misled into thinking that

long delta_timer;

delta_timer = timer1 - timer0;

produces a non-negative value, and that

long delta_weight;

delta_weight = weight1 - weight0;

produces a properly signed value. In fact, neither assignment says which interpretation is intended; to be explicit, convert the difference before assigning it:

delta_timer = (unsigned)(timer1 - timer0);

delta_weight = UI_TO_I(weight1 - weight0);

This can all be summarized by a reliability rule: when assigning the difference of unsigned values to a wider or signed variable, convert it explicitly to show which interpretation is intended.

One final observation about integer signs: the sign of the remainder (%) operator is implementation-defined, when the operands are of different sign. This sometimes constitutes a portability problem, when the programmer has assumed that i % j is always positive. To provide a true (never negative) modulo operation, we can use an IMOD ("integer modulo") macro from portdefs.h:

/* modulo macro giving non-negative result */
#define IMOD(i, j) (((i) % (j)) < 0 ? ((i) % (j)) + (j) : ((i) % (j)))

/* if i % j is never negative, replace with the following line: */
/* #define IMOD(i, j) ((i) % (j)) */

Strictly speaking, there are two different types of overflow. When an object is assigned a value that is too large for the object, we have a "range" overflow. And when an operation produces a result that is too large for the intermediate result, we have an "intermediate" overflow. The consequences are the same in both cases: the end result is incorrect. Range overflows have been discussed earlier; now we will attend to intermediate overflows.

atopi(#1):

/* atopi - convert string to positive integer */
int atopi(s)
char s[];
    {
    int n = 0;
    int i;

    for (i = 0; isdigit(s[i]); ++i)
        n = 10 * n + s[i] - '0';
    return (n);
    }

We will refer to a "fussy" overflow as one which would not prevent the computation of the correct answer if overflow-checking were not performed. For example, during the computation of atopi("32767"), a "fussy" overflow takes place when the final '7' is added:

    3276 * 10     32760
    + '7'       +    55
                -------
                  32815    momentarily greater than INT_MAX
    - '0'       -    48
                -------
                  32767    now within proper range

On a machine without overflow checking, "fussy" overflows are often ignored by the programmer. The overflow causes a "wraparound" in the temporary value, but the subsequent subtraction brings the result back into range. The programmer is often unaware of the overflow until the program is ported to a machine which checks for overflow in the machine hardware. One simple way to prevent fussy overflows is to perform the operation in unsigned arithmetic, since overflow checking is explicitly disabled for unsigned operations. A more precise way, in the case of atopi, would be to parenthesize the problematic expression like this:

n = 10 * n + (s[i] - '0');

If the operator inside the parentheses were a "plus," the technique would not work, because the compiler is free to re-arrange commutative-associative operators, even across parentheses. Since, however, it is a "minus," the subtraction is guaranteed to be performed before the addition (unless the compiler has a bug in this delicate area, that is).

atopi(#2):

/* atopi - convert string to positive integer {0:INT_MAX} */
#include <limits.h>
int atopi(s)
char s[];
    {
    unsigned n = 0; /* the converted number: {0:INT_MAX} */
    int i;

    for (i = 0; isdigit(s[i]) && n <= INT_MAX / 10; ++i)
        n = 10 * n + (s[i] - '0');
    if (n > INT_MAX || isdigit(s[i]))
        return (-1);
    else
        return (n);
    }

By "type," we refer both to the way the data is represented in memory, and to the syntactic type on the declaration. Early C compilers were rather loose about the agreement of data types; code like this was allowable:

char *p;

int i;

i = p + 2;    /* loose and dangerous */

Recent C compilers will flag this free interchange between pointers and integers as a syntax error. One reason for the tightening-up is that programmers have been found to make mistakes in mixing up pointers and integers, which leads to obscure bugs. Another reason is that, in some environments, an ordinary int is not large enough to hold a pointer. In the so-called "large model" of the Intel 8086 and 8088, a pointer takes 32 bits, while the int is only 16 bits.

Mixing the sizes of integers, by contrast, remains legitimate; in a fragment like this, the char value is converted to int before the addition:

int i, j;

char c;

i = j + c;

The most basic distinction when talking about data properties is the distinction between an *undefined* value (e.g., "garbage," or uninitialized), and a *defined* value. One common example of an "undefined" error comes from forgetting to initialize a counter:

long sum;
int i;

for (i = 0; i < N; ++i)
    sum += a[i];    /* error - sum not initialized */

An invalid operation can also leave its target undefined; for example,

n = m / 0;

causes n to become undefined, possibly terminating execution as well.

Besides this fundamental distinction between defined and undefined values, a variable may have other properties ascribed to it by the programmer. In Section 2.6, we discussed programmer-defined ranges for variables, documented like this:

double heading; /* degrees: {0:360} */

By this comment, the programmer indicates that the property of being between 0 and 360 is an important property for this variable. If heading receives the value -1, the variable is "defined" from the point of view of C, but it is not properly defined according to the programmer's intention.

In previous sections, we saw other properties that the programmer can ascribe to a variable, by using an informative defined-type: metachar (a valid char-sized number or EOF), bool and tbool (zero or one), and bits (an unsigned number for bit manipulation).

We will use the term *well-defined* to refer to a value that conforms to the programmer's stated defining property. If, for example, a metachar variable were to receive the value 5000, the variable would not be "well-defined," because it exceeds the range of proper char-sized numbers.

Besides stating desired properties on the declaration of variables, the programmer can also document properties of function returned values. A function declared like this

bool is_empty(); /* is node empty? */

or like this

int is_empty(); /* is node empty?: bool */

is indicated to return only Boolean values. This documentation is important for determining that the function is producing a value that is appropriate for the context in which it is used.

We will be following a convention that a comment with a colon followed by a property name means "must have this property."

**Rule 2-13:** Document the defining properties of declared names in a comment on the declaration, using a convention such as "colon followed by property."

The type and properties of an expression determine the operations that are allowable on that expression. For example, a bool variable should not be the operand of arithmetic operations like multiply or divide; the operations just do not make sense for the intended properties of the variable. Sometimes the type itself restricts the operations. For example, bitwise operations are not allowed upon any floating-point values.

In later chapters, we will see more complicated objects such as arrays, structures, stacks, and trees, each of which has its own repertoire of allowable operations.