To find the (x, y) coordinates of a point located at a distance s along the order n Hilbert curve, observe that the most significant two bits of the 2nbit integer s determine which major quadrant the point is in. This is because the Hilbert curve of any order follows the overall pattern of the order 1 curve. If the most significant two bits of s are 00, the point is somewhere in the lower left quadrant, if 01 it is in the upper left quadrant, if 10 it is in the upper right quadrant, and if 11 it is in the lower right quadrant. Thus, the most significant two bits of s determine the most significant bits of the nbit integers x and y, as follows:
Most significant two bits of s 
Most significant bits of (x, y) 
00 
(0, 0) 
01 
(0, 1) 
10 
(1, 1) 
11 
(1, 0) 
In any Hilbert curve, only four of the eight possible Ushapes occur. These are shown in Table 141 as graphics and as maps from two bits of s to a single bit of each of x and y.
Observe from Figure 142 that in all cases, the Ushape represented by map becomes, at the next level of detail, a Ushape represented by maps B, A, A, or D, depending on whether the length traversed in the firstmentioned map A is 0, 1, 2, or 3, respectively. Similarly, a Ushape represented by map becomes, at the next level of detail, a Ushape represented by maps A, B, B, or C, depending on whether the length traversed in the firstmentioned map B is 0, 1, 2, or 3, respectively.
These observations lead to the state transition table shown in Table 142, in which the states correspond to the mappings shown in Table 141.
Table 142. State transition table for computing (X, Y) from S 

If the current state is 
and the next (to right) two bits of s are 
then append to (x, y) 
and enter state 
A 
00 
(0, 0) 
B 
A 
01 
(0, 1) 
A 
A 
10 
(1, 1) 
A 
A 
11 
(1, 0) 
D 
B 
00 
(0, 0) 
A 
B 
01 
(1, 0) 
B 
B 
10 
(1, 1) 
B 
B 
11 
(0, 1) 
C 
C 
00 
(1, 1) 
D 
C 
01 
(1, 0) 
C 
C 
10 
(0, 0) 
C 
C 
11 
(0, 1) 
B 
D 
00 
(1, 1) 
C 
D 
01 
(0, 1) 
D 
D 
10 
(0, 0) 
D 
D 
11 
(1, 0) 
A 
To use the table, start in state A. The integer s should be padded with leading zeros so that its length is 2n, where n is the order of the Hilbert curve. Scan the bits of s in pairs from left to right. The first row of Table 142 means that if the current state is A and the currently scanned bits of s are 00, then output (0, 0) and enter state B. Then, advance to the next two bits of s. Similarly, the second row means that if the current state is A and the scanned bits are 01, then output (0, 1) and stay in state A.
The output bits are accumulated in lefttoright order. When the end of s is reached, the nbit output quantities x and y are defined.
As an example, suppose n = 3 and
Because the process starts in state A and the initial bits scanned are 11, the process outputs (1, 0) and enters state D (fourth row). Then, in state D and scanning 01, the process outputs (0, 1) and stays in state D. Lastly, the process outputs (1, 1) and enters state C, although the state is now immaterial.
Thus, the output is (101, 011)梩hat is, x = 5 and y = 3.
A C program implementing these steps is shown in Figure 145. In this program, the current state is represented by an integer from 0 to 3 for states A through D, respectively. In the assignment to variable row, the current state is concatenated with the next two bits of s, giving an integer from 0 to 15, which is the applicable row number in Table 142. Variable row is used to access integers (expressed in hexadecimal) that are used as bit strings to represent the rightmost two columns of Table 142; that is, these accesses are inregister table lookups. Lefttoright in the hexadecimal values corresponds to bottomtotop in Table 142.
void hil_xy_from_s(unsigned s, int n, unsigned *xp,
牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋unsigned *yp) {
牋 int i;
牋爑nsigned state, x, y, row;
牋 state = 0;牋牋牋牋牋牋牋牋牋牋牋牋牋 ?/ Initialize.
牋爔 = y = 0;
牋 for (i = 2*n  2; i >= 0; i = 2) {牋 // Do n times.
牋牋牋row = 4*state  (s >> i) & 3;牋牋?// Row in table.
牋牋牋x = (x << 1)  (0x936C >> row) & 1;
牋牋牋y = (y << 1)  (0x39C6 >> row) & 1;
牋牋牋state = (0x3E6B94C1 >> 2*row) & 3; // New state.
牋爙
牋?xp = x;牋牋牋牋牋牋牋牋牋牋牋牋牋牋?// Pass back
牋?yp = y;牋牋牋牋牋牋牋牋牋牋牋牋牋牋?// results.
}
[L&S] give a quite different algorithm. Unlike the algorithm of Figure 145, it scans the bits of s from right to left. It is based on the observation that one can map the least significant two bits of s to (x, y) based on the order 1 Hilbert curve, and then test the next two bits of s to the left. If they are 00, the values of x and y just computed should be interchanged, which corresponds to reflecting the order 1 Hilbert curve about the line x = y. (Refer to the curves of orders 1 and 2 shown in Figure 141 on page 241.) If these two bits are 01 or 10, the values of x and y are not changed. If they are 11, the values of x and y are interchanged and complemented. These same rules apply as one progresses leftward along the bits of s. They are embodied in Table 143 and the code of Figure 146. It is somewhat curious that the bits can be prepended to x and y first, and then the swap and complement operations can be done, including these newly prepended bits; the results are the same.
void hil_xy_from_s(unsigned s, int n, unsigned *xp,
牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋unsigned *yp) {
牋 int i, sa, sb;
牋爑nsigned x, y, temp;
牋 for (i = 0; i < 2*n; i += 2) {
牋牋?sa = (s >> (i+1)) & 1;牋牋?// Get bit i+1 of s.
牋牋牋sb = (s >> i) & 1;牋牋牋牋?// Get bit i of s.
牋牋?if ((sa ^ sb) == 0) {牋牋牋 // If sa,sb = 00 or 11,
牋牋牋牋爐emp = x;牋牋牋牋牋牋牋?// swap x and y,
牋牋牋牋爔 = y^(sa);牋牋牋牋牋牋 // and if sa = 1,
牋牋牋牋爕 = temp^(sa);牋牋牋牋?// complement them.
牋牋牋}
牋牋牋x = (x >> 1)  (sa << 31);?// Prepend sa to x and
牋牋牋y = (y >> 1)  ((sa ^ sb) << 31); // (sa^sb) to y.
牋爙
牋?xp = x >> (32  n);牋牋牋牋牋 // Rightadjust x and y
牋?yp = y >> (32  n);牋牋牋牋牋 // and return them to
}牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋 // the caller.
In Figure 146, variables x and y are uninitialized, which might cause an error message from some compilers. But the code functions correctly for whatever values x and y have initially.
The branch in the loop of Figure 146 can be avoided by doing the swap operation with the "three exclusive or" trick given in Section 219 on page 38.
The if block can be replaced by the following code, where swap and cmpl are unsigned integers:
swap = (sa ^ sb)  1;?// 1 if should swap, else 0.
cmpl = (sa & sb);牋牋 // 1 if should compl't, else 0.
x = x ^ y;
y = y ^ (x & swap) ^ cmpl;
x = x ^ y;
However, this is nine instructions, versus about two or six for the if block, so the branch cost would have to be quite high for this to be a good choice.
The "swap and complement" idea of [L&S] suggests a logic circuit for generating the Hilbert curve. The idea behind the circuit described below is that as you trace along the path of an order n curve, you basically map pairs of bits of s to (x, y) according to map A of Table 141. As the trace enters various regions, however, the mapping output gets swapped, complemented, or both. The circuit of Figure 147 keeps track of the swap and complement requirements of each stage, uses the appropriate mapping to map two bits of s to (x_{i}, y_{i}), and generates the swap and complement signals for the next stage.
Assume there is a register containing the path length s and circuits for incrementing it. Then, to find the next point on the Hilbert curve, first increment s and then transform it as described in Table 144. This is a lefttoright process, which is a bit of a problem because incrementing s is a righttoleft process. Thus, the time to generate a new point on an order n Hilbert curve is proportional to 2n (for incrementing s) plus n (for transforming s to (x, y)).
Figure 147 shows this computation as a logic circuit. In this figure, S denotes the swap signal and C denotes the complement signal.
Table 144. Logic for computing (X, Y) from S 

If the next (to right) two bits of s are 
then append to (x, y) 
and set 
00 
(0, 0)[*] 

01 
(0, 1)[*] 
No change 
10 
(1, 1)[*] 
No change 
11 
(1, 0)[*] 

^{[*]} 
^{[*]} Possibly swapped and/or complemented
The logic circuit of Figure 147 suggests another way to compute (x, y) from s. Notice how the swap and complement signals propagate from left to right through the n stages. This suggests that it may be possible to use the parallel prefix operation to quickly (in log_{2}n steps rather than n  1) propagate the swap and complement information to each stage, and then do some wordparallel logical operations to compute x and y, using the equations in Figure 147. The values of x and y are intermingled in the even and odd bit positions of a word, so they have to be separated by the unshuffle operation (see page 107). This might seem a bit complicated, and likely to pay off only for rather large values of n, but let us see how it goes.
A procedure for this operation is shown in Figure 148 [GLS1]. The procedure operates on fullword quantities, so it first pads the input s on the left with '01' bits. This bit combination does not affect the swap and complement quantities. Next, a quantity cs (complementswap) is computed. This word is of the form cscs...cs, where each c (a single bit), if 1, means that the corresponding pair of bits is to be complemented, and each s means that the corresponding pair of bits is to be swapped, following Table 143. In other words, these two statements map each pair of bits of s as follows:
s_{2i} _{+ 1} 
s_{2i} 
cs 
0 
0 
01 
0 
1 
00 
1 
0 
00 
1 
1 
11 
void hil_xy_from_s(unsigned s, int n, unsigned *xp,
牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋unsigned *yp) {
牋 unsigned comp, swap, cs, t, sr;
牋 s = s  (0x55555555 << 2*n); // Pad s on left with 01
牋爏r = (s >> 1) & 0x55555555;?// (no change) groups.
牋燾s = ((s & 0x55555555) + sr) // Compute complement &
牋牋牋牋^ 0x55555555;牋牋牋牋牋 // swap info in twobit
牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋// groups.
牋?/ Parallel prefix xor op to propagate both complement
牋?/ and swap info together from left to right (there is
牋?/ no step "cs ^= cs >> 1", so in effect it computes
牋?/ two independent parallel prefix operations on two
牋?/ interleaved sets of sixteen bits).
牋 cs = cs ^ (cs >> 2);
牋燾s = cs ^ (cs >> 4);
牋燾s = cs ^ (cs >> 8);
牋燾s = cs ^ (cs >> 16);
牋爏wap = cs & 0x55555555;牋牋?// Separate the swap and
牋燾omp = (cs >> 1) & 0x55555555;?// complement bits.
牋 t = (s & swap) ^ comp;牋牋牋 // Calculate x and y in
牋爏 = s ^ sr ^ t ^ (t << 1);牋 // the odd & even bit
牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋// positions, resp.
牋爏 = s & ((1 << 2*n)  1);牋?// Clear out any junk
牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋牋// on the left (unpad).
牋 // Now "unshuffle" to separate the x and y bits.
牋 t = (s ^ (s >> 1)) & 0x22222222; s = s ^ t ^ (t << 1);
牋爐 = (s ^ (s >> 2)) & 0x0C0C0C0C; s = s ^ t ^ (t << 2);
牋爐 = (s ^ (s >> 4)) & 0x00F000F0; s = s ^ t ^ (t << 4);
牋爐 = (s ^ (s >> 8)) & 0x0000FF00; s = s ^ t ^ (t << 8);
牋 *xp = s >> 16;牋牋牋牋牋牋牋 // Assign the two halves
牋?yp = s & 0xFFFF;牋牋牋牋牋?// of t to x and y.
}
This is the quantity to which we want to apply the parallel prefix operation. PPXOR is the one to use, going from left to right, because successive 1bits meaning to complement or to swap have the same logical properties as exclusive or: Two successive 1bits cancel each other.
Both signals (complement and swap) are propagated in the same PPXOR operation, each working with every other bit of cs.
The next four assignment statements have the effect of translating each pair of bits of s into (x, y) values, with x being in the odd (leftmost) bit positions, and y being in the even bit positions. Although the logic may seem obscure, it is not difficult to verify that each pair of bits of s is transformed by the logic of the first two Boolean equations in Figure 147. (Suggestion: Consider separately how the even and odd bit positions are transformed, using the fact that t and sr are 0 in the odd positions.)
The rest of the procedure is selfexplanatory. It executes in 66 basic RISC instructions (constant, branchfree), versus about 19n + 10 (average) for the code of Figure 146 (based on compiled code; includes prologs and epilogs, which are essentially nil). Thus, the parallel prefix method is faster for n 3.