2-20 Alternating among Two or More Values

Suppose a variable x can have only two possible values a and b, and you wish to assign to x the value other than its current one, and you wish your code to be independent of the values of a and b. For example, in a compiler x might be an opcode that is known to be either branch true or branch false, and whichever it is, you want to switch it to the other. The values of the opcodes branch true and branch false are arbitrary, probably defined by a C #define or enum declaration in a header file.

The straightforward code to do the switch is

if (x == a) x = b; 
else x = a; 

or, as is often seen in C programs,

x = x == a ? b : a; 

A far better (or at least more efficient) way to code it is either

graphics/02icon125.gif

 

If a and b are constants, these require only one or two basic RISC instructions. Of course, overflow in calculating a + b can be ignored.

This raises the question: Is there some particularly efficient way to cycle among three or more values? That is, given three arbitrary but distinct constants a, b, and c, we seek an easy-to-evaluate function f that satisfies

graphics/02icon126.gif

 

It is perhaps interesting to note that there is always a polynomial for such a function. For the case of three constants,

Equation 2

graphics/02icon127.gif

 

(The idea is that if x = a, the first and last terms vanish, and the middle term simplifies to b, and so on.) This requires 14 arithmetic operations to evaluate, and, for arbitrary a, b, and c, the intermediate results exceed the computer's word size. But it is just a quadratic; if written in the usual form for a polynomial and evaluated using Horner's rule, [3] it would require only five arithmetic operations (four for a quadratic with integer coefficients, plus one for a final division). Rearranging Equation (2) accordingly gives

[3] Horner's rule simply factors out x. For example, it evaluates the fourth-degree polynomial ax4 + bx3 + cx2 + dx + e as x(x(x(ax + b) + c) + d) + e. For a polynomial of degree n it takes n multiplications and n additions, and it is very suitable for the multiply-add instruction.

graphics/02icon128.gif

 

This is getting too complicated to be interesting (or practical).

Another method, similar to Equation (2) in that just one of the three terms survives, is

graphics/02icon129.gif

 

This takes 11 instructions if the machine has the equal predicate, not counting loads of constants. Because the two addition operations are combining two 0 values with a nonzero, they can be replaced with or or exclusive or operations.

The formula can be simplified by precalculating a - c and b - c, and then using [GLS1]:

graphics/02icon130.gif

 

Each of these operations takes eight instructions. But on most machines these are probably no better than the straightforward C code shown below, which executes in four to six instructions for small a, b, and c.

if (x == a) x = b; 
else if (x == b) x = c; 
else x = a; 

Pursuing this matter, there is an ingenious branch-free method of cycling among three values on machines that do not have comparison predicate instructions [GLS1]. It executes in eight instructions on most machines.

Because a, b, and c are distinct, there are two bit positions, n1 and n2, where the bits of a, b, and c are not all the same, and where the "odd one out" (the one whose bit differs in that position from the other two) is different in positions n1 and n2. This is illustrated below for the values 21, 31, and 20, shown in binary.

graphics/02icon131.gif

 

Without loss of generality, rename a, b, and c so that a has the odd one out in position n1 and b has the odd one out in position n2, as shown above. Then there are two possibilities for the values of the bits at position n1, namely graphics/02icon132.gifSimilarly, there are two possibilities for the bits at position n2, namely graphics/02icon133.gifThis makes four cases in all, and formulas for each of these cases are shown below:

graphics/02icon134.gif

 

In these formulas, the left operand of each multiplication is a single bit. A multiplication by 0 or 1 may be converted into an and with a value of 0 or all 1's. Thus, the formulas can be rewritten as illustrated below for the first formula:

graphics/02icon135.gif

 

Because all variables except x are constants, this can be evaluated in eight instructions on the basic RISC. Here again, the additions and subtractions can be replaced with exclusive or.

This idea can be extended to cycling among four or more constants. The essence of the idea is to find bit positions n1, n2, ? at which the bits uniquely identify the constants. For four constants, three bit positions always suffice. Then (for four constants) solve the following equation for s, t, u, and v (that is, solve the system of four linear equations in which f(x) is a, b, c, or d, and the coefficients graphics/02icon136.gifare 0 or 1):

graphics/02icon137.gif

 

If the four constants are uniquely identified by only two bit positions, the equation to solve is

graphics/02icon138.gif