At this point you may wonder if other
divisors present other problems. We see in this section that they do not; the
three examples given illustrate the only cases that arise (for d 2).
Some of the proofs are a bit complicated, so to be cautious, the work is done in terms of a general word size W.
Given a word size W
3 and a divisor d, 2
d < 2W - 1, we wish to find the
least integer m and integer p such that
with 0 m < 2W
and p
W.
The reason we want the least integer m is
that a smaller multiplier may give a smaller shift amount (possibly zero) or
may yield code similar to the "divide by 5" example, rather than the "divide by 7" example. We must have m 2W - 1 so the code has no more
instructions than that of the "divide by 7" example (that is, we can handle a multiplier in the range 2W - 1 to 2W - 1 by means of the add that was
inserted in the "divide by 7" example, but we would rather not deal with larger
multipliers). We must have p
W because
the generated code extracts the left half of the product mn, which is equivalent to shifting right W positions. Thus, the total right shift is W or more positions.
There is a distinction between the multiplier m and the "magic number," denoted M. The magic number is the value used in the multiply instruction. It is given by
Because (1b) must hold for which implies
Let nc
be the largest (positive) value of n such that
(nc, d)
= d - 1. nc
exists because one possibility is nc
= d - 1. It can be calculated from nc is one of the highest d admissible
values of n, so
and clearly
Because (1a) must hold for n = nc,
or
Combining this with (2) gives
Because m is to be the least integer satisfying (4), it is the next integer greater than 2p/d; that is,
Combining this with the right half of (4) and simplifying gives
Thus, the algorithm to find the magic number M and the shift amount s
from d is to first compute nc, and then solve (6) for p by trying successively larger values. If p < W, set p = W (the theorem
below shows that this value of p also satisfies
(6)). When the smallest p W satisfying (6) is found, m is calculated from (5). This is the smallest
possible value of m, because we found the
smallest acceptable p, and from (4) clearly
smaller values of p yield smaller values of m. Finally, s = p - W and M is simply a reinterpretation of m as a signed integer (which is how the mulhs
instruction interprets it).
Forcing p to be at least W is justified by the following:
Theorem DC1. If (6) is true for some value of p, then it is true for all larger values of p.
Proof. Suppose (6) is true for p = p0. Multiplying (6) by 2 gives
From Theorem D5, Combining gives
Therefore, (6) is true for p = p0 + 1, and hence for all larger values.
Thus, one could solve (6) by a binary search, although a simple linear search (starting with p = W) is probably preferable, because usually d is small, and small values of d give small values of p.
We must show that (6) always has a solution
and that 0 m < 2W. (It is not necessary to show that p
W, because that is forced.)
We show that (6) always has a solution by getting an upper bound on p. As a matter of general interest, we also derive a lower bound under the assumption that p is not forced to be at least W. To get these bounds on p, observe that for any positive integer x, there is a power of 2 greater than x and less than or equal to 2x. Hence from (6),
Because 0 rem(2p, d)
d - 1,
From (3a) and (3b), nc
max(2W - 1 - d, d - 1). The lines f1(d) = 2W - 1 - d and f2(d) = d - 1 cross at d = (2W -
1 + 1)/2. Hence nc
(2W - 1 - 1)/2. Because nc is an integer, nc
2W - 2. Because nc, d
2W - 1 - 1, (7) becomes
or
The lower bound p = W - 1 can occur (e.g., for W = 32, d = 3), but in that case we set p = W.
If p is not forced to equal W, then from (4) and (7),
Using (3b) gives
Because nc 2W - 1 - 1(3a),
If p is forced to equal W, then from (4),
Because 2 d
2W - 1 - 1 and nc
2W - 2,
Hence in either case m is within limits for the code schema illustrated by the "divide by 7" example.
We must show that if p and m are calculated from (6) and (5), then equations (1a) and (1b) are satisfied.
Equation (5) and inequality (6) are easily seen to imply (4). (In the case that p is forced to be equal to W, (6) still holds, as shown by Theorem DC1.) In what follows, we consider separately the following five ranges of values of n:
From (4), because m is an integer,
Multiplying by n/2p, for n 0
this becomes
For 0 n
nc, 0
(2p - 1)n/(2pdnc) < 1/d, so by Theorem D4,
Hence (1a) is satisfied in this case (0 n
nc).
For n > nc, n is limited to the range
because n nc
+ d contradicts the choice of nc as the largest value of n such that rem(nc,
d) = d - 1
(alternatively, from (3a), n
nc
+ d implies n
2W - 1). From (4), for n
0,
By elementary algebra, this can be written
From (9), 1 n - nc
d - 1, so
Because nc d - 1 (by (3b)) and (nc
+ 1)/nc has its maximum when nc has its minimum,
In (10), the term (nc + 1)/d is an integer. The term (n - nc)(nc + 1)/dnc is less than or equal to 1. Therefore, (10) becomes
For all n in the range (9), Hence (1a) is satisfied in this case (nc + 1
n
nc
+ d - 1).
For n < 0, from (4) we have, because m is an integer,
Multiplying by n/2p for n < 0 this becomes
or
Using Theorem D2 gives
Because n + 1 0, the right inequality can be
weakened, giving
For -nc n
- 1,
Hence by Theorem D4,
so that (1b) is satisfied in this case (-nc n
-1).
For n < -nc, n is limited to the range
(From (3a), n < - nc - d implies that n < -2W - 1, which is impossible.) Performing elementary algebraic manipulation of the left comparand of (11) gives
For -nc - d n
- nc
- 1,
The ratio (nc + 1)/nc is a maximum when nc is a minimum; that is, nc = d - 1. Therefore,
From (13), because (- nc - 1)/d is an integer and the quantity added to it is between 0 and -1,
For n in the range - nc - d + 1
n
- nc -
1,
Hence 梩hat is, (1b) is
satisfied.
The last case, n
= - nc - d,
can occur only for certain values of d. From
(3a), - nc - d - 2W
- 1, so if n takes on this value, we
must have n = -nc
- d = -2W
- 1, and hence nc = 2W - 1 - d. Therefore, rem(2W
- 1, d)= rem(nc + d, d) = d - 1 (that is, d
divides 2W - 1 + 1).
For this case (n = - nc - d), (6) has the solution p = W - 1(the smallest possible value of p), because for p = W - 1,
Then from (5),
Therefore,
so that (1b) is satisfied.
This completes the proof that if m and p are calculated from (5) and (6), then Equations (1a) and (1b) hold for all admissible values of n.