# ALGORITHM ALLEY

## The GOST Encryption Algorithm

### Bruce Schneier

In this month's column, I'll describe GOST and its relation to DES, and examine its security. GOST is still new to Western cryptographers, and more-detailed information about its security is sure to emerge as more people get the chance to study it.

### A Description of GOST

GOST is a secret-key algorithm, similar in construction to DES. I believe it was invented by Soviet cryptographers who had examined DES. Its publication in 1989 as a Soviet standard occurred well after DES became public, but there is no way of knowing when it was developed and for how long it was used within the Soviet government.

Also like DES, GOST is a Feistel network; the algorithm iterates a simple encryption algorithm for multiple rounds. The text is first broken up into a left half, L, and a right half, R. K is the subkey for that particular round. A round, i, of either algorithm looks like this:

```Li=Ri-1
Ri=Li-1 XOR f(Ri-1,Ki)```

DES has 16 rounds; GOST has 32. Both algorithms can be reversed easily: Decryption is the same as encryption with the order of the K reversed.

Figure 1 is a single round of GOST. First, the right half and the ith subkey are added modulo 232 (that's just 32-bit addition, ignoring any carry). The result is then broken into eight 4-bit chunks, each of which becomes the input to a different S-box. There are eight different S-boxes in GOST; the first four bits go in the first S-box, the second four bits go in the second S-box, and so on. Each S-box is a permutation of the numbers 0 through 15. For example, an S-box might be: 7, 10, 2, 4, 15, 9, 0, 3, 6, 12, 5, 13, 1, 8, 11. In this case, if the input to the S-box is 0, the output is 7. If the input is 1, the output is 10, and so on. All eight S-boxes are different, and they are considered additional key material. The S-boxes are to be kept secret.

The outputs of the eight S-boxes are recombined into a 32-bit word, and then the entire word undergoes an 11-bit circular shift left, towards the higher-order bits. Finally, the result is added modulo 232 to the left half to become the new right half, and the right half becomes the new left half. Do this 32 times and you're done.

The subkeys are generated simply. The 256-bit key is divided into eight 32-bit blocks. The first block is used in the first round, the second block is used in the second round, and so on. At the ninth round and again at the 17th round, the cycle starts again: The first block is used in the ninth and 17th rounds, the second block is used in the 10th and 18th rounds, and so on. For the 25th through 32nd rounds, the order is reversed: The eighth block is used in the 25th round, the seventh block is used in the 26th round, the sixth block is used in the 26th round, and on and on.

### Comparison with DES

For comparison, Figure 2 is a single round of DES. (DES also has an initial and final permutation; these add nothing to the algorithm's security, and are omitted for this discussion.) First, the right half is expanded from 32 to 48 bits by a fixed permutation. The result is XORed with the ith subkey, and then broken into eight 6-bit chunks. Each chunk becomes the input to a different S-box. DES has eight different S-boxes; the first six bits go in the first S-box, the second six bits go in the second S-box, and so on. Each S-box is four permutations of the numbers 0 through 15. In DES, the S-boxes are fixed and public; they are part of the standard and are known by everyone.

The outputs of the eight S-boxes are recombined into a 32-bit word, and then the entire word is permuted by another fixed permutation. Finally, the result is added modulo 232 to the left half to become the new right half, and the right half becomes the new left half. This is repeated 16 times.

DES's subkey-generation process is complicated. The 56-bit key is first divided in half. Then, each 28-bit half is circularly shifted to the left by either one or two digits, depending on the round. After the shift, 48 bits are selected by a fixed permutation.

Here again are the major differences between DES and GOST:

• DES has a complicated procedure for generating the subkeys from the keys. GOST has a very simple procedure.
• DES has a 56-bit key; GOST has a 256-bit key. If you add in the secret S-box permutations, GOST has a total of about 610 bits of secret information.
• The S-boxes in DES have 6-bit inputs and 4-bit outputs; the S-boxes in GOST have 4-bit inputs and outputs. Both algorithms have eight S-boxes, but an S-box in GOST is one fourth the size of an S-box in DES.
• DES has an irregular permutation, called a "P-box;" GOST uses an 11-bit left circular shift.
• DES uses XOR to add the key to the right half and to add the right half to the left half; GOST uses addition modulo 232 for both of those operations.
• DES has 16 rounds; GOST has 32.

### Generating GOST's S-Boxes

The GOST standard does not discuss how to generate the S-boxes, only that they are somehow supplied. Recent discoveries of differential and linear cryptanalysis show that the choice of S-boxes greatly affects security; there are good S-boxes and bad S-boxes. The same holds true for GOST, leading some to speculate that the erstwhile Soviets would supply good S-boxes to those organizations it liked and bad S-boxes to those organizations it wished to eavesdrop on. This may very well be true, but further conversations with a GOST chip manufacturer within Russia offered another alternative. He generated the S-box permutations himself, using a random-number generator. The S-boxes in the accompanying source code were used in a Soviet banking application.

### Is it Secure?

Is GOST secure? The short answer is that no one knows. Here is a longer answer:

Assuming there is no better way to break GOST than brute force, it is certainly much more secure than DES. GOST has a 256-bit key--longer if you count the secret S-boxes. Against differential and linear cryptanalysis, GOST is probably stronger. Although the random S-boxes in GOST are probably weaker than the fixed S-boxes in DES, they are part of the key. The secrecy of GOST's S-boxes adds to its resistance against differential and linear attacks. Also, both of these attack are very much dependent on the number of rounds in the algorithm: The more rounds, the harder the attack. GOST has twice as many rounds as DES, and this probably makes both differential and linear cryptanalysis infeasible.

The other building blocks are either on par or worse. GOST doesn't have the same expansion permutation that DES has. It is known that deleting this permutation weakens DES by reducing the avalanche effect; it is reasonable to believe that GOST is weaker for not having it. GOST's use of addition instead of exclusive-OR is probably no different.

The most serious change seems to be after the S-boxes: GOST's cyclic shift instead of a permutation. The DES permutation is designed to increase the avalanche effect: the rate in which a single-bit change in the input affects bits in the output. In GOST a change in one input bit affects one S-box in one round, which then affects two S-boxes in the next round, three the round after that, and so on. GOST requires eight rounds before a single change in an input affects every output bit; DES only requires five rounds. This is certainly a weakness. But remember that GOST has 32 rounds to DES's 16.

### Conclusions

GOST's designers tried to achieve a balance between efficiency and security. They followed the same basic design of DES, but made some modifications to the algorithm. As Listing One shows, the result is an algorithm that is better suited for software implementation (no irritating bit twiddling). They seem to have been less sure of their algorithm's security and have tried to compensate by making the key length very large, keeping the S-boxes secret, and doubling the number of iterations. Whether or not their efforts have resulted in an algorithm more secure than DES remains to be seen.

Figure 1 One round of GOST (O+=addition modulo 232).

Figure 2 One round of DES (_=exclusive-OR).

#### Listing One

```
/* The GOST 28147-89 cipher  by Colin Plumb */

/* If you read the standard, it belabors the point of copying corresponding
* bits from point A to point B quite a bit.  It helps to understand that
* the standard is uniformly little-endian, although it numbers bits from
* 1 rather than 0, so bit n has value 2^(n-1).  The least significant bit
* of the 32-bit words that are manipulated in the algorithm is the first,
* lowest-numbered, in the bit string.
*/

/* A 32-bit data type */
#ifdef __alpha  /* Any other 64-bit machines? */
typedef unsigned int word32;
#else
typedef unsigned long word32;
#endif

static unsigned char const k8[16] = {
1, 15, 13, 0,  5, 7, 10,  4,  9, 2,  3, 14,  6,  11,  8,  12 };
static unsigned char const k7[16] = {
13, 11,  4, 1,  3, 15,  5,  9,  0,  10,  14, 7, 6,  8,  2, 12 };
static unsigned char const k6[16] = {
4,  11,  10, 0,  7,  2, 1,  13,  3, 6, 8,  5, 9,  12,  15,  14 };
static unsigned char const k5[16] = {
6, 12, 7,  1,  5,  15,  13, 8,  4,  10,  9,  14, 0, 3,  11, 2 };
static unsigned char const k4[16] = {
7, 13,  10,  1,  0, 8, 9,  15,  14,  4,  6, 12, 11,  2, 5,  3 };
static unsigned char const k3[16] = {
5,  8, 1, 13,  10,  3,  4,  2,  14, 15,  12,  7, 6,  0,  9, 11 };
static unsigned char const k2[16] = {
14, 11,  4, 12, 6,  13,  15, 10,  2, 3,  8,  1,  0, 7,  5,  9 };
static unsigned char const k1[16] = {
4,  10,  9,  2,  13, 8, 0,  14,  6,  11,  1, 12,  7, 15, 5,  3 };

/* Byte-at-a-time substitution boxes */
static unsigned char k87[256];
static unsigned char k65[256];
static unsigned char k43[256];
static unsigned char k21[256];

/* Build byte-at-a-time subtitution tables.
* This must be called once for global setup.
*/
void
kboxinit(void)
{
int i;
for (i = 0; i < 256; i++) {
k87[i] = k8[i >> 4] << 4 | k7[i & 15];
k65[i] = k6[i >> 4] << 4 | k5[i & 15];
k43[i] = k4[i >> 4] << 4 | k3[i & 15];
k21[i] = k2[i >> 4] << 4 | k1[i & 15];
}
}

/* Do the substitution and rotation that are the core of the operation, like
* the expansion, substitution and permutation of the DES. It would be possible
* to perform DES-like optimisations and store the table entries as 32-bit
* words, already rotated, but the efficiency gain is questionable.
* This should be inlined for maximum speed
*/
#if __GNUC__
__inline__
#endif
static word32
f(word32 x)
{
/* Do substitutions */
#if 0
/* This is annoyingly slow */
x = k8[x>>28 & 15] << 28 | k7[x>>24 & 15] << 24 |
k6[x>>20 & 15] << 20 | k5[x>>16 & 15] << 16 |
k4[x>>12 & 15] << 12 | k3[x>> 8 & 15] <<  8 |
k2[x>> 4 & 15] <<  4 | k1[x     & 15];
#else
/* This is faster */
x = k87[x>>24 & 255] << 24 | k65[x>>16 & 255] << 16 |
k43[x>> 8 & 255] <<  8 | k21[x & 255];
#endif

/* Rotate left 11 bits */
return x<<11 | x>>(32-11);
}

/* The GOST standard defines the input in terms of bits 1..64, with
* bit 1 being the lsb of in[0] and bit 64 being the msb of in[1].
* The keys are defined similarly, with bit 256 being the msb of key[7].
*/
void
gostcrypt(word32 const in[2], word32 out[2], word32 const key[8])
{
register word32 n1, n2; /* As named in the GOST */

n1 = in[0];
n2 = in[1];

/* Instead of swapping halves, swap names each round */
n2 ^= f(n1+key[0]);
n1 ^= f(n2+key[1]);
n2 ^= f(n1+key[2]);
n1 ^= f(n2+key[3]);
n2 ^= f(n1+key[4]);
n1 ^= f(n2+key[5]);
n2 ^= f(n1+key[6]);
n1 ^= f(n2+key[7]);

n2 ^= f(n1+key[0]);
n1 ^= f(n2+key[1]);
n2 ^= f(n1+key[2]);
n1 ^= f(n2+key[3]);
n2 ^= f(n1+key[4]);
n1 ^= f(n2+key[5]);
n2 ^= f(n1+key[6]);
n1 ^= f(n2+key[7]);

n2 ^= f(n1+key[0]);
n1 ^= f(n2+key[1]);
n2 ^= f(n1+key[2]);
n1 ^= f(n2+key[3]);
n2 ^= f(n1+key[4]);
n1 ^= f(n2+key[5]);
n2 ^= f(n1+key[6]);
n1 ^= f(n2+key[7]);

n2 ^= f(n1+key[7]);
n1 ^= f(n2+key[6]);
n2 ^= f(n1+key[5]);
n1 ^= f(n2+key[4]);
n2 ^= f(n1+key[3]);
n1 ^= f(n2+key[2]);
n2 ^= f(n1+key[1]);
n1 ^= f(n2+key[0]);

/* There is no swap after the last round */
out[0] = n2;
out[1] = n1;
}

/* The key schedule is somewhat different for decryption. (The key table is
* used once forward and three times backward.) You could define an expanded
* key, or just write the code twice, as done here.
*/
void
gostdecrypt(word32 const in[2], word32 out[2], word32 const key[8])
{
register word32 n1, n2; /* As named in the GOST */

n1 = in[0];
n2 = in[1];

n2 ^= f(n1+key[0]);
n1 ^= f(n2+key[1]);
n2 ^= f(n1+key[2]);
n1 ^= f(n2+key[3]);
n2 ^= f(n1+key[4]);
n1 ^= f(n2+key[5]);
n2 ^= f(n1+key[6]);
n1 ^= f(n2+key[7]);

n2 ^= f(n1+key[7]);
n1 ^= f(n2+key[6]);
n2 ^= f(n1+key[5]);
n1 ^= f(n2+key[4]);
n2 ^= f(n1+key[3]);
n1 ^= f(n2+key[2]);
n2 ^= f(n1+key[1]);
n1 ^= f(n2+key[0]);

n2 ^= f(n1+key[7]);
n1 ^= f(n2+key[6]);
n2 ^= f(n1+key[5]);
n1 ^= f(n2+key[4]);
n2 ^= f(n1+key[3]);
n1 ^= f(n2+key[2]);
n2 ^= f(n1+key[1]);
n1 ^= f(n2+key[0]);

n2 ^= f(n1+key[7]);
n1 ^= f(n2+key[6]);
n2 ^= f(n1+key[5]);
n1 ^= f(n2+key[4]);
n2 ^= f(n1+key[3]);
n1 ^= f(n2+key[2]);
n2 ^= f(n1+key[1]);
n1 ^= f(n2+key[0]);

out[0] = n2;
out[1] = n1;
}

/* The GOST "Output feedback" standard. It seems closer morally to the counter
* feedback mode some people have proposed for DES.
*
* The IV is encrypted with the key to produce the initial counter value.
* Then, for each output block, a constant is added, modulo 2^32-1 (0 is
* represented as all-ones, not all-zeros), to each half of the counter, and
* the counter is encrypted to produce the value to XOR with the output.
*
* Len is the number of blocks. Sub-block encryption is left as an exercise
* for the user. Remember that the standard defines everything in a
* little-endian manner, so you want to use the low bit of gamma[0] first.
*
* OFB is, of course, self-inverse, so there is only one function.
*/

/* The constants for addition */
#define C1 0x01010104
#define C2 0x01010101

void
gostofb(word32 const *in, word32 *out, int len,
word32 const iv[2], word32 const key[8])
{
word32 temp[2];         /* Counter */
word32 gamma[2];        /* Output XOR value */

/* Compute starting value for counter */
gostcrypt(iv, temp, key);

while (len--) {
temp[0] += C2;
if (temp[0] < C2)       /* Wrap modulo 2^32? */
temp[0]++;      /* Make it modulo 2^32-1 */
temp[1] += C1;
if (temp[1] < C1)       /* Wrap modulo 2^32? */
temp[1]++;      /* Make it modulo 2^32-1 */

gostcrypt(temp, gamma, key);

*out++ = *in++ ^ gamma[0];
*out++ = *in++ ^ gamma[1];
}
}

/*
* The CFB mode is just what you'd expect.  Each block of ciphertext y[] is
* derived from the input x[] by the following pseudocode:
* y[i] = x[i] ^ gostcrypt(y[i-1])
* x[i] = y[i] ^ gostcrypt(y[i-1])
* Where y[-1] is the IV.
*
* The IV is modified in place.  Again, len is in *blocks*.
*/

void
gostcfbencrypt(word32 const *in, word32 *out, int len,
word32 iv[2], word32 const key[8])
{
while (len--) {
gostcrypt(iv, iv, key);
iv[0] = *out++ ^= iv[0];
iv[1] = *out++ ^= iv[1];
}
}
void
gostcfbdecrypt(word32 const *in, word32 *out, int len,
word32 iv[2], word32 const key[8])
{
word32 t;
while (len--) {
gostcrypt(iv, iv, key);
t = *out;
*out++ ^= iv[0];
iv[0] = t;
t = *out;
*out++ ^= iv[1];
iv[1] = t;
}
}

/* The message suthetication code uses only 16 of the 32 rounds. There *is*
* a swap after the 16th round. The last block should be padded to 64 bits
* with zeros. len is the number of *blocks* in the input.
*/
void
gostmac(word32 const *in, int len, word32 out[2], word32 const key[8])
{
register word32 n1, n2; /* As named in the GOST */

n1 = 0;
n2 = 0;

while (len--) {
n1 ^= *in++;
n2 = *in++;

/* Instead of swapping halves, swap names each round */
n2 ^= f(n1+key[0]);
n1 ^= f(n2+key[1]);
n2 ^= f(n1+key[2]);
n1 ^= f(n2+key[3]);
n2 ^= f(n1+key[4]);
n1 ^= f(n2+key[5]);
n2 ^= f(n1+key[6]);
n1 ^= f(n2+key[7]);

n2 ^= f(n1+key[0]);
n1 ^= f(n2+key[1]);
n2 ^= f(n1+key[2]);
n1 ^= f(n2+key[3]);
n2 ^= f(n1+key[4]);
n1 ^= f(n2+key[5]);
n2 ^= f(n1+key[6]);
n1 ^= f(n2+key[7]);
}
out[0] = n1;
out[1] = n2;
}

#ifdef TEST

#include <stdio.h>
#include <stdlib.h>

/* Designed to cope with 15-bit rand() implementations */
#define RAND32 ((word32)rand() << 17 ^ (word32)rand() << 9 ^ rand())

int
main(void)
{
word32 key[8];
word32 plain[2];
word32 cipher[2];
int i, j;

kboxinit();

printf("GOST 21847-89 test driver.\n");

for (i = 0; i < 1000; i++) {
for (j = 0; j < 8; j++)
key[j] = RAND32;
plain[0] = RAND32;
plain[1] = RAND32;

printf("%3d\r", i);
fflush(stdout);

gostcrypt(plain, cipher, key);
for (j = 0; j < 99; j++)
gostcrypt(cipher, cipher, key);
for (j = 0; j < 100; j++)
gostdecrypt(cipher, cipher, key);

if (plain[0] != cipher[0] || plain[1] != cipher[1]) {
fprintf(stderr, "\nError! i = %d\n", i);
return 1;
}
}
printf("All tests passed.\n");
return 0;
}
#endif /* TEST */

```