By "exact division," we mean division in which it is known beforehand, somehow, that the remainder is 0. Although this situation is not common, it does arise, for example, when subtracting two pointers in the C language. In C, the result of p - q, where p and q are pointers, is well defined and portable only if p and q point to objects in the same array [H&S, sec. 7.6.2]. If the array element size is s, the object code for the difference p - q computes (p - q)/s.

The material in this section was motivated by [GM, sec. 9].

The method to be given applies to both signed and unsigned exact division, and is based on the following theorem.

Theorem MI. If a and m are relatively prime integers, then there exists an integer a? 1 a?/span> < m, such that

Thus a?/span> is a multiplicative inverse of a, modulo m. There are several ways to prove this theorem; three proofs are given in [NZM, 52]. The proof below requires only a very basic familiarity with congruences.

Proof. We will prove something a little more general than the theorem. If a and m are relatively prime (and hence nonzero), then as x ranges over all m distinct values modulo m, ax takes on all m distinct values modulo m. For example, if a = 3 and m = 8, then as x ranges from 0 to 7, ax = 0, 3, 6, 9, 12, 15, 18, 21 or, reduced modulo 8, ax = 0, 3, 6, 1, 4, 7, 2, 5. Observe that all values from 0 to 7 are present in the last sequence.

To see this in general, assume that it is not true. Then there exist distinct integers that map to the same value when multiplied by a; that is, there exist x and y, with x y (mod m), such that

But then there exists an integer k such that

Because a has no factor in common with m, it must be that x - y is a multiple of m; that is,

This contradicts the hypothesis.

Now, because ax takes on all m distinct values modulo m, as x ranges over the m values, it must take on the value 1 for some x.

The proof shows that there is only one value (modulo m) of x such that ax 1 (mod m)梩hat is, the multiplicative inverse is unique, apart from additive multiples of m. It also shows that there is a unique (modulo m) integer x such that ax b (mod m)where b is any integer.

As an example, consider the case m = 16. Then because 3?1 = 33 1 (mod 16). We could just as well take because 3?-5) = -15 1 (mod 16). Similarly, because (-3)? = -15 1 (mod 16).

These observations are important because they show that the concepts apply to both signed and unsigned numbers. If we are working in the domain of unsigned integers on a 4-bit machine, we take In the domain of signed integers, we take But 11 and -5 have the same representation in two's-complement (because they differ by 16), so the same computer word contents can serve in both domains as the multiplicative inverse.

The theorem applies directly to the problem
of division (signed and unsigned) by an odd integer d
on a W-bit computer. Because any odd integer is
relatively prime to 2^{W}, the theorem
says that if d is odd, there exists an integer d?/span> (unique in the range 0 to 2^{W} - 1 or in the range -2^{W} ^{- 1} to 2^{W} ^{- 1} - 1) such that

Hence for any integer n that is a multiple of d,

In other words, n/d can be calculated by multiplying n by d?/span>, and retaining only the rightmost W bits of the product.

If the divisor d
is even, let d = d_{o}
?2^{k}, where d_{o} is odd and k
1. Then, simply shift n right k positions (shifting out 0's), and then multiply by (the shift could be done after the
multiplication as well).

Below is the code for division of n by 7, where n is a multiple of 7. This code gives the correct result whether it is considered to be signed or unsigned division.

li牋?M,0xB6DB6DB7?Mult. inverse, (5*2**32 + 1)/7.

mul牋 q,M,n牋牋牋牋 q = n/7.

How can we compute the multiplicative inverse? The standard method is by means of the "extended Euclidean algorithm." This is briefly discussed below as it applies to our problem, and the interested reader is referred to [NZM, 13] and to [Knu2, sec. 4.5.2] for a more complete discussion.

Given an odd divisor d, we wish to solve for x

where, in our application, m =
2^{W} and W
is the word size of the machine. This will be accomplished if we can solve for
integers x and y
(positive, negative, or 0) the equation

Toward this end, first make d positive by adding a sufficient number of multiples of m to it. (d and d + km have the same multiplicative inverse.) Second, write the following equations (in which d, m > 0):

If d = 1, we are done, because (ii) shows that x = 1. Otherwise, compute

Third, multiply Equation (ii) by q and subtract it from (i). This gives

This equation holds because we have simply multiplied one equation by a constant and subtracted it from another. If rem(m - d, d) = 1, we are done; this last equation is the solution and x = - 1 - q.

Repeat this process on the last two equations, obtaining a fourth, and continue until the right-hand side of the equation is 1. The multiplier of d, reduced modulo m, is then the desired inverse of d.

Incidentally, if m - d < d, so that the first quotient is 0, then the third row will be a copy of the first, so that the second quotient will be nonzero. Furthermore, most texts start with the first row being

but in our application m = 2^{W} is not representable in the machine.

The process is best illustrated by an example. Let m = 256 and d = 7. Then the calculation proceeds as follows. To get the third row, note that

7(-1) + 256( 1) = 249

7( 1) + 256( 0) = 7

7(-36) + 256( 1) = 4

7( 37) + 256(-1) = 3

7(-73) + 256( 2) = 1

Thus, the multiplicative inverse of 7, modulo 256, is -73 or, expressed in the range 0 to 255, is 183. Check: 7?83 = 1281 1 (mod 256).

From the third row on, the integers in the right-hand column
are all remainders with respect to the number above it as a divisor (d being the dividend), so they form a sequence of
strictly decreasing nonnegative integers. Therefore, the sequence must end in 0
(as the above would if carried one more step). Furthermore, the value just
before the 0 must be 1, for the following reason. Suppose the sequence ends in b followed by 0, with b
1. Then, the
integer preceding the b must be a multiple of b, let's say k_{1}b, for the next remainder to be 0. The integer
preceding k_{1}b must be of the form k_{1}k_{2}b + b, for the next remainder to be b. Continuing up the sequence, every number must be a
multiple of b, including the first two (in the
positions of the 249 and the 7 in the above example). But this is impossible,
because the first two integers are m - d and d, which are
relatively prime.

This constitutes an informal proof that the above process terminates, with a value of 1 in the right-hand column, and hence it finds the multiplicative inverse of d.

To carry this out on a computer, first note that if d < 0 we should add 2^{W}
to it. But with two's-complement arithmetic it is not necessary to actually do
anything here; simply interpret d as an
unsigned number regardless of how the application interprets it.

The computation of q must use unsigned division.

Observe that the calculations can be done modulo m, because this does not change the right-hand column
(these values are in the range 0 to m - 1
anyway). This is important, because it enables the calculations to be done in
"single precision," using the computer's modulo-2^{W} unsigned arithmetic.

Most of the quantities in the table need not be represented. The column of multiples of 256 need not be represented, because in solving dx + my = 1, we do not need the value of y. There is no need to represent d in the first column. Reduced to its bare essentials, then, the calculation of the above example is carried out as follows:

255?249

牋1牋?7

220牋?4

?7牋?3

183牋?1

A C program for performing this computation is shown in Figure 10-4.

unsigned mulinv(unsigned d) {牋牋牋牋牋 // d must be odd.

牋爑nsigned x1, v1, x2, v2, x3, v3, q;

牋 x1 = 0xFFFFFFFF;牋牋 v1 = -d;

牋爔2 = 1;牋牋牋牋牋牋?v2 = d;

牋爓hile (v2 > 1) {

牋 牋爍 = v1/v2;

牋牋牋x3 = x1 - q*x2;牋 v3 = v1 - q*v2;

牋牋牋x1 = x2;牋牋牋牋?v1 = v2;

牋牋牋x2 = x3;牋牋牋牋?v2 = v3;

牋爙

牋爎eturn(x2);

}

The reason the loop continuation condition is `(v2` `>`
`1)` rather than the more natural `(v2` `!=`
`1)` is that if the latter
condition were used, the loop would never terminate if the program were invoked
with an even argument. It is best that programs not loop forever even if
misused. (If the argument `d` is
even, `v2` never takes on the
value 1, but it does become 0.)

What does the program compute if given an even argument? As
written, it computes a number x such that dx 0 (mod 2^{32}), which is probably not
useful. However, with the minor modification of changing the loop continuation
condition to `(v2` `!=` `0)`
and returning `x1` rather than `x2`, it computes a number x such that dx g (mod 2^{32}) where g is the greatest common divisor of d and 2^{32}梩hat is, the greatest power of 2
that divides d. The modified program still
computes the multiplicative inverse of d for d odd, but it requires one more iteration than the
unmodified program.

As for the number of iterations (divisions) required by the above program, for d odd and less than 20, it requires a maximum of 3 and an average of 1.7. For d in the neighborhood of 1000, it requires a maximum of 11 and an average of about 6.

It is well known that, over the real numbers, 1/d, for d 0, can be calculated to ever increasing accuracy by iteratively evaluating

provided the initial estimate x_{0}
is sufficiently close to 1/d. The number of
digits of accuracy approximately doubles with each iteration.

It is not so well known that this same formula can be used to
find the multiplicative inverse in the domain of modular arithmetic on
integers! For example, to find the multiplicative inverse of 3, modulo 256,
start with x_{0} = 1 (any odd number
will do). Then,

The iteration has reached a fixed point modulo 256, so -85, or 171, is the multiplicative inverse of 3 (modulo 256). All calculations can be done modulo 256.

Why does this work? Because if x_{n}
satisfies

and if x_{n} _{+ 1}
is defined by (31), then

To see this, let dx_{n}
= 1 + km. Then

In our application, m is a
power of 2, say 2^{N}. In this case, if

In a sense, if x_{n}
is regarded as a sort of approximation to d?/span>,
then each iteration of (31) doubles the number of bits of "accuracy"
of the approximation.

It happens that, modulo 8, the multiplicative inverse of any
(odd) number d is d
itself. Thus, taking x_{0} = d is a reasonable and simple initial guess at d?/span>. Then, (31) will give values of x_{1}, x_{2},
? such that

Thus, four iterations suffice to find the multiplicative
inverse modulo 2^{32} (if x 1 (mod 2^{48})
then x 1 (mod 2^{n})
for n 48). This leads to the C program in Figure 10-5, in which all computations are
done modulo 2^{32}.

unsigned mulinv(unsigned d) {牋牋牋 // d must be odd.

牋爑nsigned xn, t;

牋 xn = d;

loop: t = d*xn;

牋牋牋if (t == 1) return xn;

牋牋牋xn = xn*(2 - t);

牋牋牋goto loop;

}

For about half the values of d,
this program takes 4 1/2 iterations, or nine multiplications. For the other
half (those for which the initial value of `xn`
is "correct to 4 bits"梩hat is, d^{2}
1 (mod 16)), it
takes seven or fewer, usually seven, multiplications. Thus, it takes about
eight multiplications on average.

A variation is to simply execute the loop four times,
regardless of d, perhaps "strung out"
to eliminate the loop control (eight multiplications). Another variation is to
somehow make the initial estimate x_{0}
"correct to 4 bits" (that is, find x_{0}
that satisfies dx_{0} 1 (mod 16)). Then, only three
loop iterations are required. Some ways to set the initial estimate are

Here, the multiplication by 2 is a left shift, and the
computations are done modulo 2^{32} (ignoring overflow). Because the
second formula uses a multiplication, it saves only one.

This concern about execution time is of course totally unimportant for the compiler application. For that application, the routine would be so seldom used that it should be coded for minimum space. But there may be applications in which it is desirable to compute the multiplicative inverse quickly.

We conclude this section with a listing of some multiplicative inverses in Table 10-3.

You may notice that in several cases (d = 3, 5, 9, 11), the multiplicative inverse of d is the same as the magic number for unsigned
division by d (see Section 10-14,
"Sample Magic Numbers," on page 189). This is more or less a
coincidence. It happens that for these numbers, the magic number M is equal to the multiplier m, and these are of the form (2^{p} + 1)/d, with p 32. In
this case, notice that

so that M d?/span> (mod 2^{32}).