*Gene is president of St. George Technologies and the developer of software such as Managing Your Money and the HyperWriter Autolinker. He can be contacted at ecallah@ibm.net.*

- Often, solving a general problem is easier than solving a specific instance of that problem. The algorithm employed here is easier to understand than a hardwired one, and shouldn't take any longer to get right.
- The key to solving many problems is getting the right data structures--here, given the data structures, the code follows naturally.

To convert from native numeric representation back to a string representation in base-X, a string laying out the ordering of characters in base-X is the natural mapping mechanism. The base-X digits of the native number itself, "peeled off" modulus the powers of the radix X, can serve as indexes into the mapping string. This takes advantage of the relationship between a base and the modulus and division operations--for any base X and number N, repeated modulus and division operations on N by X will produce the value of each base-X digit of N from right to left. The structures I came up with, along with a representative instance, appear in Listing One.

The base representation in the listing has long runs of consecutive values, but this need not be the case. If the application involved something like encryption, the values assigned to different characters could just as well be randomly generated.

To map from string to number, *BaseXToLong()* (see Listing Two) moves through the string representing the number from right to left, indexing by the character at each position into the structure *NumValues*. That character's value in the current base is stored in *NumValues* and then multiplied by the power of the radix equivalent to that position in the string. In other words, the numeric value of the last character in the string is multiplied by the 0th power of the radix; the next to last, by the first; the next, by the square of the radix; and so on.

*LongToBaseX()* maps in the other direction, using the structure *CharValues*. Again, the function operates from right to left, filling the last character of the string first. To determine the character in the last place, I take the long argument (named "number") modulus the base, and index *CharValues* with the result. For the next character, I divide number by the radix, essentially dropping off the last digit, and repeat the modulus step; see Listing Three.

Finally, I had to implement the increment function itself. I wrote it as a single statement using the primitives discussed earlier. The code has now gained a level of abstraction. The caller of *LongToBaseX()* has to worry about details like the length of the return value and where to store it. But at the level of increment abstraction, I knew the answers to these questions--the length is the length of the original key, and it is stored back in the original buffer. I left the low-level routines flexible about the answer to these questions, and built another layer of abstraction to hide these implementation details from the layers above it; see Example 1.

I wrote this function as a single statement because it concisely states my algorithm. The calls to *BaseXToLong()* and *strlen()* are made only for their return values, not for any side effect such as I/O or assignment; hence their position inside the call to *LongXToBase()*. Since I won't otherwise use any intermediate results, storing them would be deceptive. Also, side effects complicate the proof of a function's correctness and impede operations such as shipping the calculation of each of a function's arguments off to separate processors--one function may depend on the side effect of another. By making explicit the precedence of calls to *BaseXToLong()*, *strlen()*, and *LongXToBase()*, this form suggests a macro (or template, in C++) implementation of a family of *BaseXIncr* functions. These functions add one to a macro argument, rather than the return of *BaseXToLong()*. Example 2 is a function that allows key comparison, even of keys stored in different bases.

This basic idea can be implemented in several different ways. For instance, the radix could be passed as an argument instead of being in the *BaseRep* structure; thus, one *BaseRep* structure could work for many different radices. The base representation could be read in from a file for further flexibility. C++ users should easily be able to turn this into a class, packaging together the *BaseRep* pointer and the functions operating on it. You might write the class so that the radix could be reset, and perhaps *BaseXToLong()* and *LongToBaseX()* would be protected virtual members, so that you could override them in descendant classes.

**Example 1:** Hiding details from the above layers.

char* BaseXIncr(char* number, BaseRep* br) { return LongToBaseX(BaseXToLong (number, br) + 1, number, strlen(number), br); }

int LessThan(char* num1, BaseRep* br1, char* num2, BaseRep* br2) { return BaseXToLong(num1, br1) < BaseXToLong(num2, br2); }

/* given an ASCII character, what is its value in the base? */ #define MAX_BASE 256 typedef struct baserep { int Radix; int NumValues[MAX_BASE]; char CharValues[MAX_BASE + 1]; } BaseRep; BaseRep Base36DigitsLower = { { 36 }, /* 0 - 47 */ {-1, -1, -1, /* ...and so on, until 48 negative ones in all*/ /* 48 ('0') - 57 ('9') */ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, /* 58 - 96: another run of -1 */ /* 97 ('a') - 122 ('z') */ 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, /* 123 - 255: whole bunch o' negative ones go here */}, {"0123456789abcdefghijklmnopqrstuvwxyz"} };

long BaseXToLong(char* number, BaseRep* br) { long PowerOfRadix; long ret = 0; int i; int digits = strlen(number); for(i = digits - 1, PowerOfRadix = 1; i >= 0; i--, PowerOfRadix *= br->Radix) ret += (br->NumValues[(int)number[i]] * PowerOfRadix); return ret; }

char* LongToBaseX(long number, char* buf, int buf_len, BaseRep* br) { int i; for(i = buf_len - 1; i >= 0; i--) { buf[i] = br->CharValues[number % br->Radix]; number /= br->Radix; } if(number) return 0; /* null: error return */ return buf; /* so function will generate rvalue */ }

Copyright © 1995, *Dr. Dobb's Journal*