As a cryptographer, I sometimes get to see some really exciting technology. Last year, for example, I was introduced to "GOST," my first encryption algorithm from the former Soviet Union. The algorithm, whose formal name is "Cryptographic Transformation Algorithm--GOST 28147-89," was published in 1989 by the National Soviet Bureau of Standards (also known as "GOST"). The algorithm was not classified, which implies it was for protection of civilian, not military, data. Don't ask me how it leaked out of the country.

L=_{i}R-1_{i}R=_{i}L-1 XOR f(_{i}R-1,_{i}K)_{i}

Figure 1 is a single round of GOST. First, the right half and the *i*th subkey are added modulo 2^{32} (that's just 32-bit addition, ignoring any carry). The result is then broken into eight 4-bit chunks, each of which becomes the input to a different S-box. There are eight different S-boxes in GOST; the first four bits go in the first S-box, the second four bits go in the second S-box, and so on. Each S-box is a permutation of the numbers 0 through 15. For example, an S-box might be: *7, 10, 2, 4, 15, 9, 0, 3, 6, 12, 5, 13, 1, 8, 11*. In this case, if the input to the S-box is 0, the output is 7. If the input is 1, the output is 10, and so on. All eight S-boxes are different, and they are considered additional key material. The S-boxes are to be kept secret.

The outputs of the eight S-boxes are recombined into a 32-bit word, and then the entire word undergoes an 11-bit circular shift left, towards the higher-order bits. Finally, the result is added modulo 2^{32} to the left half to become the new right half, and the right half becomes the new left half. Do this 32 times and you're done.

The subkeys are generated simply. The 256-bit key is divided into eight 32-bit blocks. The first block is used in the first round, the second block is used in the second round, and so on. At the ninth round and again at the 17th round, the cycle starts again: The first block is used in the ninth and 17th rounds, the second block is used in the 10th and 18th rounds, and so on. For the 25th through 32nd rounds, the order is reversed: The eighth block is used in the 25th round, the seventh block is used in the 26th round, the sixth block is used in the 26th round, and on and on.

For comparison, Figure 2 is a single round of DES. (DES also has an initial and final permutation; these add nothing to the algorithm's security, and are omitted for this discussion.) First, the right half is expanded from 32 to 48 bits by a fixed permutation. The result is XORed with the *i*th subkey, and then broken into eight 6-bit chunks. Each chunk becomes the input to a different S-box. DES has eight different S-boxes; the first six bits go in the first S-box, the second six bits go in the second S-box, and so on. Each S-box is four permutations of the numbers 0 through 15. In DES, the S-boxes are fixed and public; they are part of the standard and are known by everyone.

The outputs of the eight S-boxes are recombined into a 32-bit word, and then the entire word is permuted by another fixed permutation. Finally, the result is added modulo 2^{32} to the left half to become the new right half, and the right half becomes the new left half. This is repeated 16 times.

DES's subkey-generation process is complicated. The 56-bit key is first divided in half. Then, each 28-bit half is circularly shifted to the left by either one or two digits, depending on the round. After the shift, 48 bits are selected by a fixed permutation.

Here again are the major differences between DES and GOST:

- DES has a complicated procedure for generating the subkeys from the keys. GOST has a very simple procedure.
- DES has a 56-bit key; GOST has a 256-bit key. If you add in the secret S-box permutations, GOST has a total of about 610 bits of secret information.
- The S-boxes in DES have 6-bit inputs and 4-bit outputs; the S-boxes in GOST have 4-bit inputs and outputs. Both algorithms have eight S-boxes, but an S-box in GOST is one fourth the size of an S-box in DES.
- DES has an irregular permutation, called a "P-box;" GOST uses an 11-bit left circular shift.
- DES uses XOR to add the key to the right half and to add the right half to the left half; GOST uses addition modulo 232 for both of those operations.
- DES has 16 rounds; GOST has 32.

Is GOST secure? The short answer is that no one knows. Here is a longer answer:

GOST's designers tried to achieve a balance between efficiency and security. They followed the same basic design of DES, but made some modifications to the algorithm. As Listing One shows, the result is an algorithm that is better suited for software implementation (no irritating bit twiddling). They seem to have been less sure of their algorithm's security and have tried to compensate by making the key length very large, keeping the S-boxes secret, and doubling the number of iterations. Whether or not their efforts have resulted in an algorithm more secure than DES remains to be seen.

Figure 1 One round of GOST (O+=addition modulo 2^{32}).

Figure 2 One round of DES (_=exclusive-OR).

/* The GOST 28147-89 cipher by Colin Plumb */ /* If you read the standard, it belabors the point of copying corresponding * bits from point A to point B quite a bit. It helps to understand that * the standard is uniformly little-endian, although it numbers bits from * 1 rather than 0, so bit n has value 2^(n-1). The least significant bit * of the 32-bit words that are manipulated in the algorithm is the first, * lowest-numbered, in the bit string. */ /* A 32-bit data type */ #ifdef __alpha /* Any other 64-bit machines? */ typedef unsigned int word32; #else typedef unsigned long word32; #endif static unsigned char const k8[16] = { 1, 15, 13, 0, 5, 7, 10, 4, 9, 2, 3, 14, 6, 11, 8, 12 }; static unsigned char const k7[16] = { 13, 11, 4, 1, 3, 15, 5, 9, 0, 10, 14, 7, 6, 8, 2, 12 }; static unsigned char const k6[16] = { 4, 11, 10, 0, 7, 2, 1, 13, 3, 6, 8, 5, 9, 12, 15, 14 }; static unsigned char const k5[16] = { 6, 12, 7, 1, 5, 15, 13, 8, 4, 10, 9, 14, 0, 3, 11, 2 }; static unsigned char const k4[16] = { 7, 13, 10, 1, 0, 8, 9, 15, 14, 4, 6, 12, 11, 2, 5, 3 }; static unsigned char const k3[16] = { 5, 8, 1, 13, 10, 3, 4, 2, 14, 15, 12, 7, 6, 0, 9, 11 }; static unsigned char const k2[16] = { 14, 11, 4, 12, 6, 13, 15, 10, 2, 3, 8, 1, 0, 7, 5, 9 }; static unsigned char const k1[16] = { 4, 10, 9, 2, 13, 8, 0, 14, 6, 11, 1, 12, 7, 15, 5, 3 }; /* Byte-at-a-time substitution boxes */ static unsigned char k87[256]; static unsigned char k65[256]; static unsigned char k43[256]; static unsigned char k21[256]; /* Build byte-at-a-time subtitution tables. * This must be called once for global setup. */ void kboxinit(void) { int i; for (i = 0; i < 256; i++) { k87[i] = k8[i >> 4] << 4 | k7[i & 15]; k65[i] = k6[i >> 4] << 4 | k5[i & 15]; k43[i] = k4[i >> 4] << 4 | k3[i & 15]; k21[i] = k2[i >> 4] << 4 | k1[i & 15]; } } /* Do the substitution and rotation that are the core of the operation, like * the expansion, substitution and permutation of the DES. It would be possible * to perform DES-like optimisations and store the table entries as 32-bit * words, already rotated, but the efficiency gain is questionable. * This should be inlined for maximum speed */ #if __GNUC__ __inline__ #endif static word32 f(word32 x) { /* Do substitutions */ #if 0 /* This is annoyingly slow */ x = k8[x>>28 & 15] << 28 | k7[x>>24 & 15] << 24 | k6[x>>20 & 15] << 20 | k5[x>>16 & 15] << 16 | k4[x>>12 & 15] << 12 | k3[x>> 8 & 15] << 8 | k2[x>> 4 & 15] << 4 | k1[x & 15]; #else /* This is faster */ x = k87[x>>24 & 255] << 24 | k65[x>>16 & 255] << 16 | k43[x>> 8 & 255] << 8 | k21[x & 255]; #endif /* Rotate left 11 bits */ return x<<11 | x>>(32-11); } /* The GOST standard defines the input in terms of bits 1..64, with * bit 1 being the lsb of in[0] and bit 64 being the msb of in[1]. * The keys are defined similarly, with bit 256 being the msb of key[7]. */ void gostcrypt(word32 const in[2], word32 out[2], word32 const key[8]) { register word32 n1, n2; /* As named in the GOST */ n1 = in[0]; n2 = in[1]; /* Instead of swapping halves, swap names each round */ n2 ^= f(n1+key[0]); n1 ^= f(n2+key[1]); n2 ^= f(n1+key[2]); n1 ^= f(n2+key[3]); n2 ^= f(n1+key[4]); n1 ^= f(n2+key[5]); n2 ^= f(n1+key[6]); n1 ^= f(n2+key[7]); n2 ^= f(n1+key[0]); n1 ^= f(n2+key[1]); n2 ^= f(n1+key[2]); n1 ^= f(n2+key[3]); n2 ^= f(n1+key[4]); n1 ^= f(n2+key[5]); n2 ^= f(n1+key[6]); n1 ^= f(n2+key[7]); n2 ^= f(n1+key[0]); n1 ^= f(n2+key[1]); n2 ^= f(n1+key[2]); n1 ^= f(n2+key[3]); n2 ^= f(n1+key[4]); n1 ^= f(n2+key[5]); n2 ^= f(n1+key[6]); n1 ^= f(n2+key[7]); n2 ^= f(n1+key[7]); n1 ^= f(n2+key[6]); n2 ^= f(n1+key[5]); n1 ^= f(n2+key[4]); n2 ^= f(n1+key[3]); n1 ^= f(n2+key[2]); n2 ^= f(n1+key[1]); n1 ^= f(n2+key[0]); /* There is no swap after the last round */ out[0] = n2; out[1] = n1; } /* The key schedule is somewhat different for decryption. (The key table is * used once forward and three times backward.) You could define an expanded * key, or just write the code twice, as done here. */ void gostdecrypt(word32 const in[2], word32 out[2], word32 const key[8]) { register word32 n1, n2; /* As named in the GOST */ n1 = in[0]; n2 = in[1]; n2 ^= f(n1+key[0]); n1 ^= f(n2+key[1]); n2 ^= f(n1+key[2]); n1 ^= f(n2+key[3]); n2 ^= f(n1+key[4]); n1 ^= f(n2+key[5]); n2 ^= f(n1+key[6]); n1 ^= f(n2+key[7]); n2 ^= f(n1+key[7]); n1 ^= f(n2+key[6]); n2 ^= f(n1+key[5]); n1 ^= f(n2+key[4]); n2 ^= f(n1+key[3]); n1 ^= f(n2+key[2]); n2 ^= f(n1+key[1]); n1 ^= f(n2+key[0]); n2 ^= f(n1+key[7]); n1 ^= f(n2+key[6]); n2 ^= f(n1+key[5]); n1 ^= f(n2+key[4]); n2 ^= f(n1+key[3]); n1 ^= f(n2+key[2]); n2 ^= f(n1+key[1]); n1 ^= f(n2+key[0]); n2 ^= f(n1+key[7]); n1 ^= f(n2+key[6]); n2 ^= f(n1+key[5]); n1 ^= f(n2+key[4]); n2 ^= f(n1+key[3]); n1 ^= f(n2+key[2]); n2 ^= f(n1+key[1]); n1 ^= f(n2+key[0]); out[0] = n2; out[1] = n1; } /* The GOST "Output feedback" standard. It seems closer morally to the counter * feedback mode some people have proposed for DES. * * The IV is encrypted with the key to produce the initial counter value. * Then, for each output block, a constant is added, modulo 2^32-1 (0 is * represented as all-ones, not all-zeros), to each half of the counter, and * the counter is encrypted to produce the value to XOR with the output. * * Len is the number of blocks. Sub-block encryption is left as an exercise * for the user. Remember that the standard defines everything in a * little-endian manner, so you want to use the low bit of gamma[0] first. * * OFB is, of course, self-inverse, so there is only one function. */ /* The constants for addition */ #define C1 0x01010104 #define C2 0x01010101 void gostofb(word32 const *in, word32 *out, int len, word32 const iv[2], word32 const key[8]) { word32 temp[2]; /* Counter */ word32 gamma[2]; /* Output XOR value */ /* Compute starting value for counter */ gostcrypt(iv, temp, key); while (len--) { temp[0] += C2; if (temp[0] < C2) /* Wrap modulo 2^32? */ temp[0]++; /* Make it modulo 2^32-1 */ temp[1] += C1; if (temp[1] < C1) /* Wrap modulo 2^32? */ temp[1]++; /* Make it modulo 2^32-1 */ gostcrypt(temp, gamma, key); *out++ = *in++ ^ gamma[0]; *out++ = *in++ ^ gamma[1]; } } /* * The CFB mode is just what you'd expect. Each block of ciphertext y[] is * derived from the input x[] by the following pseudocode: * y[i] = x[i] ^ gostcrypt(y[i-1]) * x[i] = y[i] ^ gostcrypt(y[i-1]) * Where y[-1] is the IV. * * The IV is modified in place. Again, len is in *blocks*. */ void gostcfbencrypt(word32 const *in, word32 *out, int len, word32 iv[2], word32 const key[8]) { while (len--) { gostcrypt(iv, iv, key); iv[0] = *out++ ^= iv[0]; iv[1] = *out++ ^= iv[1]; } } void gostcfbdecrypt(word32 const *in, word32 *out, int len, word32 iv[2], word32 const key[8]) { word32 t; while (len--) { gostcrypt(iv, iv, key); t = *out; *out++ ^= iv[0]; iv[0] = t; t = *out; *out++ ^= iv[1]; iv[1] = t; } } /* The message suthetication code uses only 16 of the 32 rounds. There *is* * a swap after the 16th round. The last block should be padded to 64 bits * with zeros. len is the number of *blocks* in the input. */ void gostmac(word32 const *in, int len, word32 out[2], word32 const key[8]) { register word32 n1, n2; /* As named in the GOST */ n1 = 0; n2 = 0; while (len--) { n1 ^= *in++; n2 = *in++; /* Instead of swapping halves, swap names each round */ n2 ^= f(n1+key[0]); n1 ^= f(n2+key[1]); n2 ^= f(n1+key[2]); n1 ^= f(n2+key[3]); n2 ^= f(n1+key[4]); n1 ^= f(n2+key[5]); n2 ^= f(n1+key[6]); n1 ^= f(n2+key[7]); n2 ^= f(n1+key[0]); n1 ^= f(n2+key[1]); n2 ^= f(n1+key[2]); n1 ^= f(n2+key[3]); n2 ^= f(n1+key[4]); n1 ^= f(n2+key[5]); n2 ^= f(n1+key[6]); n1 ^= f(n2+key[7]); } out[0] = n1; out[1] = n2; } #ifdef TEST #include <stdio.h> #include <stdlib.h> /* Designed to cope with 15-bit rand() implementations */ #define RAND32 ((word32)rand() << 17 ^ (word32)rand() << 9 ^ rand()) int main(void) { word32 key[8]; word32 plain[2]; word32 cipher[2]; int i, j; kboxinit(); printf("GOST 21847-89 test driver.\n"); for (i = 0; i < 1000; i++) { for (j = 0; j < 8; j++) key[j] = RAND32; plain[0] = RAND32; plain[1] = RAND32; printf("%3d\r", i); fflush(stdout); gostcrypt(plain, cipher, key); for (j = 0; j < 99; j++) gostcrypt(cipher, cipher, key); for (j = 0; j < 100; j++) gostdecrypt(cipher, cipher, key); if (plain[0] != cipher[0] || plain[1] != cipher[1]) { fprintf(stderr, "\nError! i = %d\n", i); return 1; } } printf("All tests passed.\n"); return 0; } #endif /* TEST */

Copyright © 1995, *Dr. Dobb's Journal*