# SIMULATED ANNEALING

## This algorithm may be one of the best solutions to the problem of combinatorial optimization

### Michael P. McLaughlin

In science and engineering, it is a truism that, in any real system, everything is connected. Indeed, you could do far worse than to define science as the search for such connections. Scientists accorded the highest esteem, those names nearly everyone knows, are generally individuals whose insight allowed them either to forge links between a large number of separate concepts or to find a connection that no one had suspected.

Occasionally, something similar happens on a smaller scale. This article concerns the application of statistical mechanics to the solution of a difficult problem in computer science. The problem is combinatorial optimization; the technique is called "simulated annealing."

Combinatorial optimization is the task of taking a finite number of indivisible "objects" and arranging them in a configuration that is "best" according to some stipulated criteria. For instance, how should 15 swimmers be selected from a team of 30, for an upcoming meet, if no one is allowed to enter more than one event? Or how should aircraft, preparing to land, be sequenced in order to minimize delays? What is the best routing for a telephone call from Boston to San Francisco at any given moment? What is the best parse for a given English-language sentence? What these and similar problems have in common are 1. an extremely large number of discrete choices, 2. a complicated set of optimizing criteria and 3. numerous constraints. In such circumstances, simulated annealing has proven to be an excellent tool for seeking out the best solution.

Simulated annealing is a relatively new form of stochastic search, familiar primarily to VLSI chip designers. As you can imagine, determining the best geometrical arrangement for thousands of circuits is a formidable task. To take an oversimplified example, suppose you have to partition 1000 circuits into five regions on a microchip. Further suppose that permutations within a region do not matter and that every chip configuration can be given a numerical score ased upon an appropriate set of (possibly conflicting) criteria. Prior to assessing constraints on placement, you have five choices for each circuit for a total of 51000 (about 10699) configurations.

With numbers such as these, it really doesn't matter what kind of computer is available; checking each possibility is out of the question. Nevertheless, the task remains and neither guessing nor giving up are viable options. An answer is usually found with the help of a little computer assistance, often using simulated annealing. Before looking at the usefulness of this simulated annealing, however, let's examine its main competitor.

### The Greedy Algorithm

Perhaps the simplest general method for trying to locate a global optimum (maximum or minimum) in a discrete space is a technique often referred to as the "greedy" algorithm. Described in English, it goes something like this:

Until you try a few experiments with this algorithm on large problems with known answers, there is a strong temptation to believe that it might actually work. However, unless the size of the problem (the number of elements to be configured) is small or the configuration space exceptionally smooth, the chances of finding a global optimum in this fashion are remote.

The deficiencies of the greedy algorithm are not obscure. Imagine a mountain goat traipsing about the Rocky Mountains looking for the deepest valley (where the best grass is). This goat is very hungry (greedy?) and, consequently, never takes an upward step. Unless the goat starts out on a smooth slope of the deepest valley, it will never reach that valley but will, instead, come to a dead end in some higher valley, that is, in a local minimum, which may be much higher than the global minimum. Moreover, a space as rough as the Rockies contains thousands of valleys, some only a few feet deep, and repeating the exercise (step 5, above) may yield improvement but probably not the best result.

### A Great Deal of Difficulty

The difficulties of combinatorial optimization can be illustrated in the game of poker solitaire where the object is to rearrange the 25 cards of the tableau, like that in Figure 1, so that the 12 hands of straight poker formed from the 5 rows, 5 columns, and 2 diagonals give the highest total score. Various scoring schemes are popular, but I shall use the one listed in Table 1 in which each hand of a pair or better has a value inversely proportional to its probability in a standard deck, normalized to that of a single pair. All hands of a given type (for example, full house) are considered equal, and an ace can be high or low.

#### Table 1: Scoring table

```  Hand                  Score
---------------------------

Straight flush       27,456

Four of a kind        1,760

Full house              293

Flush                   215

Straight                108

Three of a kind          20

Two pair                  9

Pair                      1```

Compared to the task of optimizing circuit placement on a microchip, this is a small problem indeed. There are only 25! (about 1025) configurations and even fewer if symmetry (rotations, reflections, and so on) is taken into account. Nevertheless, this is such a huge number that, even if a computer could evaluate a million configurations per second without repetition, the execution time required to test all of them would be comparable to the present age of the universe. To accommodate this situation, both simulated annealing and the greedy algorithm employ stochastic techniques, evaluating only a fraction of the possible configurations. Consequently, these algorithms provide no absolute guarantee that they will find the global optimum every time. For large problems, it is usually sufficient that a technique provides a reasonable chance of coming close to the global optimum nearly every time. Then, a few runs will yield as good an answer as you are likely to require.

The tableau in Figure 1 contains 7 pairs for a score of 7, although there are obviously 4 aces and 4 kings, so it is reasonable to expect that a good score would be at least 4000. Finding a really high score is not easy, however. The 12 poker hands are tightly interlocked, and even a small change in configuration can result in a drastic change in score. This is another way of saying that the configuration space is very rough. Informal experiments have shown it to be more than a match for a human player, and it provides a good test for any combinatorial optimization technique.

### Simulated Annealing

The term annealing originally referred to a process employed in the fabrication of objects constructed of metal or glass. When these materials are shaped, small, often microscopic, regions of stress develop in response to deformations at the atomic level and cause the object to be prone to fracture. Annealing corrects this defect.

From a chemical standpoint, regions of stress have relatively high energy and thus are "almost fractured" already. An object would be more stable (of lower energy) if the stress were absent. However, atoms and molecules in a solid at room temperature do not have enough energy, on average, to move about and relieve the stress. (If they did, the object wouldn't be very solid.) When an object is annealed, it is first heated enough to provide the constituent atoms with sufficient energy to relax any stress but not enough to initiate melting. It is then cooled very slowly. During the cooling phase, the atoms are gradually "frozen" in place. If the annealing is done properly, the resulting object will be without stress. If the cooling is too rapid, there will be insufficient time for the atomic structure to relax completely, and the object will end up with excess energy and fracture too easily.

In simulated annealing, score is associated with energy such that minimizing the energy means optimizing the score. In the poker solitaire example, high score signifies low energy. The tableau in Figure 1 can be thought of as a collection of "atoms" frozen into a configuration that has too much energy. Proceeding with this analogy, the tableau may be subjected to simulated annealing by raising its "temperature" until it is almost "melted," then cooling it slowly until it is again "frozen." Although the connection between real and simulated annealing appears to be little more than a metaphor, this system (25 playing cards) responds in the same way as does a physical system -- and with the same results.

In any physical system, the average </entry> energy, <E>, of the constituent atoms is a function of the absolute temperature, T. As the temperature of a system decreases, the average energy of its atoms also decreases. A typical cooling curve is shown in Figure 2. The slope of this curve, d<E>/dT, is the heat capacity, C, of the system. In regions where the curve is steep, the heat capacity is high. This usually indicates a phase change -- water into ice, for example. Quite often, these regions also display large values of C/T. This quantity is equal to dS/dT, the rate of change of entropy, S, with temperature, which is related to a change in the orderliness of the atoms. Therefore, regions of a cooling curve that are steep usually imply a significant reconfiguration of the atoms (or molecules) at the corresponding temperatures. In an annealing process where temperature is being regulated, you would have to slow down the cooling in such regions in order to allow the atoms time to rearrange themselves and to permit their average energy to decrease gradually. Annealing works best when this is done.

Simulated annealing also produces cooling curves much like that of Figure 2. In fact, the literature of simulated annealing demonstrates that these analogies run very deep and it is not at all easy to decide just where to draw the line between metaphor and methodology.

### The Connection

The logical link between combinatorial optimization and annealing is the Boltzmann distribution, one of the basic laws of statistical mechanics. Up to now, I've referred to the average energy of a collection of atoms. In most large populations, there is significant variation from individual to individual. So it is with atoms and molecules. They are constantly moving, and their mutual collisions result in continual transfers of energy from one atom (or molecule) to another. Much of the science of statistical mechanics involves techniques for extracting macroscopic properties of a system from the collective behavior of enormous numbers of atoms and molecules. This information may be obtained, to a large extent, by determining how energy is partitioned among the atoms and molecules making up the sample.

At any temperature, there are generally more atoms at lower energies than at higher energies. The Boltzmann distribution, (see Example 1, equation 1) describes the relative number of atoms (or molecules) populating any two energy states, Ehi > Elo, at some temperature, T, where Nx is the number of atoms with energy Ex, T is the absolute temperature (degrees Kelvin), and k is Boltzmann's constant. When equation 1 is adapted to the problem of combinatorial optimization, the constant is ignored, energy is equated to score (cost), and the left-hand side of equation 1 is interpreted as a probability. Simulated annealing uses such probabilities to direct a stochastic search of the configuration space by employing equation 2, Example 1 where Sx is a score and T is a scaling factor.

#### Example 1: The Boltzmann distribution in chemistry and computer science

```  Equation 1: N hi/N lo = exp ((E lo-E hi)/kT)
Equation 2: Prob (accept worse score) = exp ((S lo-S hi)/T)```

In other words, you construct a new configuration at random and determine its score. If this score is at least as good as that of the current configuration, then the old configuration is forgotten and the new configuration becomes the current configuration, as in the greedy algorithm. If the new score is worse (here, lower), however, then equation 2 is used to generate the probability of accepting the new configuration anyhow. A uniform random number in the interval (0,1) is then generated, and if it is less than the probability computed with equation 2, the new configuration is made the current configuration. The net effect is to allow the system to extricate itself from local optima (that is, to let the goat climb uphill sometimes).

As equation 2 suggests, choosing a worse configuration is a function of temperature. The higher the temperature, the higher the "energy" of the system. The higher the energy of the system, the more readily it will do the work of moving in an unfavorable direction -- just like real atoms and molecules. As the temperature decreases, however, any given expenditure of energy becomes less likely. Consequently, simulated annealing automatically adapts to the magnitude of the features of a good configuration. The most profitable features are found first and the smaller, more subtle, features later. Starting with the tableau in Figure 1, simulated annealing first finds the four kings and four aces. If the temperature is very high, these features may be forgotten from time to time, but as the temperature decreases, eventually they are frozen into every acceptable tableau. As the temperature continues to fall, hands of lower and lower scores are successively frozen until, finally, the entire tableau is frozen. Simulated annealing is terminated at this point.

It is clear that the annealing schedule -- the program for reducing the temperature -- is critical; a large drop in temperature at the wrong time may freeze in undesirable features. (Ideally, all reconfigurations should be reversible.) The annealing schedule answers the following four questions:

1. What should be the value of the starting temperature?

2. When should the temperature be changed?

3. How should the temperature be changed?

4. When should the search be stopped?

A quick-and-dirty procedure is to pick a starting temperature high enough so that virtually all scores are deemed acceptable, hold that and each successive temperature for (100*SIZE) reconfigurations or (10*SIZE) successful reconfigurations, whichever comes first, then decrease the temperature by a constant factor -- for example, T(new) = 0.9*T(old) -- stopping when T reaches some very low value (say, 0.1). Such a schedule is often employed for an exploratory run on a new problem in order to determine the shape of the cooling curve. Given the cooling curve, you can then create a proper annealing schedule by arranging for the temperature to decrease most slowly where the curve is steepest.

### Implementation Details

The ANNEAL.C program (see Listing One) addresses both the poker solitaire problem and computes the average and standard deviation of the scores accepted at any temperature. These statistics help in answering the questions posed above.

To choose a starting temperature, begin by picking one that seems high relative to the scores expected and run the program for that temperature only. There should be very few reconfigurations rejected. After equilibrium is reached at this temperature (see below), statistics will be output, one of which is SIGMA, the standard deviation of the acceptable scores. As the temperature approaches infinity, SIGMA approaches SIGMA(inf), the standard deviation of a large set of random configurations. Set the starting temperature equal to 20*SIGMA(inf).

There is no point in trying further reconfigurations at a given temperature once a system has reached thermal equilibrium. (The method used here for recognizing equilibrium is adapted from a procedure published by Huang, Romeo, and Sangiovanni-Vincentelli.)

ANNEAL.C keeps track of the number of reconfigurations accepted (nsuccesses) and rejected (nfailures) at a given temperature as well as the number of acceptable scores less than or more than SIGMA/2 from the average acceptable score (incount and outcount, respectively). The code in Listing Two is then used to decide whether or not equilibrium has been reached. (Uppercase variables are parameters.)

Once equilibrium has been established, the temperature is decreased, provided that the current configuration has a score no worse than SIGMA/2 from the average score. The temperature is not changed until this is true. The idea is to avoid transmitting a poor outlier to a new, lower temperature. As shown, there is no explanation for delaying the transition to a lower temperature.

The simplest and most flexible method for computing a new temperature is to use a table, matched to the cooling curve, listing the desired sequence of temperature ratios, T(new)/T(old), and the temperatures at which they take effect. Typically, these ratios fall in the range (0.5, 1.0). ANNEAL.C permits up to ten values for these ratios in addition to the initial value. Table 2 gives the actual ratios used in this example.</entry>

#### Table 2: Temperature ratios

```  Temperature       t_ratio
Less Than
-------------------------

--             0.7
360             0.8
215             0.7
85             0.8
60             0.9
30             0.95
15             0.9
7             0.8
3             0.7
0             0.7```

The configuration is considered frozen if, at equilibrium, the largest single change in score observed at the current temperature is equal to the difference between the best and worst scores at this temperature. This is taken to indicate that all accessible scores are of comparable value and there is no further need for simulated annealing. In ANNEAL.C, this test is carried out only after the temperature reaches a preset limit, t_low. As a failsafe, another parameter, t_min, is stipulated as the temperature at which the configuration is always assumed to be frozen. Once the configuration is frozen, it is "quenched" by setting the temperature equal to zero (greedy algorithm).

In the interest of expediency, the best configuration found is always remembered separately from the current configuration. Ideally, the final configuration should be the best. This is often not the case, however. At the end of the program, this best-found configuration is also quenched.

### Giant Steps and Baby Steps

Although a proper annealing schedule is crucial to the success of simulated annealing, there is another consideration that is nearly as important --namely, the manner in which you make a random change to the current configuration. This is true for simulated annealing and, to a lesser extent, for the greedy algorithm. To find a global optimum, any search technique must be able to access the entire configuration space. To do this effectively requires both "giant steps" and "baby steps," the former for moving quickly to a distant part of the space and the latter for fine-tuning.

In this example, baby steps are easy -- just interchange two cards. This is the smallest reconfiguration possible. Finding good giant steps is much more difficult. Of course, you could simply execute several baby steps at once but experience shows that this strategy yields poor results. One reason is that, in making a giant step, it is essential to retain as much of the current score as possible. At moderate to low temperatures, any arbitrary giant step would almost certainly cause a substantial worsening of the score and the new configuration would simply be rejected. All rejected moves are wasted and an efficient algorithm must try to keep such moves to a minimum.

In ANNEAL.C, a giant step consists of interchanging 2 of the 12 hands of the tableau, selected at random from a uniform distribution. A baby step, on the other hand, is implemented using a "rotation." A random integer in the range [2,5] is chosen from an exponential distribution (see Table 3). This number of cards is then selected from the tableau using a uniform distribution. The first card selected is temporarily set aside and the resulting hole filled with card number 2, and so on. The last hole is filled with card number 1. The ratio of giant steps to baby steps is a parameter, exg_cutoff, here set equal to 0.2. Using only giant steps or baby steps generally gives poor performance.</entry>

#### Table 3: Rotation distribution

```    # of Cards      Probability
to Rotate
-----------------------------

2             0.33333
3             0.27018
4             0.21899
5             0.17750```

### Experimental Results

Simulated annealing and the greedy algorithm are both stochastic techniques. Therefore, they do not give the same results every time. The proper way to evaluate a stochastic technique is to apply it several times and determine its average behavior (actually, its behavior distribution). One of the reasons such a small example was chosen for this article was precisely to allow such a test to be done in a reasonable amount of time.

The simulated annealing algorithm was carried out 30 times, starting with the tableau in Figure 1 each time, using a random-number generator with a period much longer than that required by the experiment. The greedy algorithm was then applied to the same tableau for the same amount of CPU time. Apart from the rule for accepting or rejecting a new configuration, the two programs are essentially identical. In particular, all the relevant parameter values are the same (see A.DAT, Listing Three). The best result from either technique is a score of 4516 (see Figure 3). This apparent optimum, found once by the greedy algorithm and twice by simulated annealing, is degenerate (that is, there are several tableaux with this score, which are not symmetry-related). The distribution of results for simulated annealing is shown in Figure 4 and that for the greedy algorithm in Figure 5.

The number of iterations used to generate Figure 4 and Figure 5 should be enough to give an approximate indication of the average performance of the two techniques. No serious attempt was made to find the combination of parameters that would optimize the simulated annealing results because this would have constituted a bias against the greedy algorithm and such a combination would have applied too specifically to this particular initial tableau. The parameter values in A.DAT appear to be adequate for the poker solitaire problem in general although the annealing schedule will have to be altered when the cards are changed, especially if the new initial tableau can produce a straight flush (see accompanying box). For other combinatorial optimization problems, both the parameter values and the annealing schedule will have to be changed. The big questions, of course, are, "Does simulated annealing work?" and "Is it worth the effort?" Taken together, the results shown in Figure 4 and Figure 5 suggest that simulated annealing is definitely worth the effort. Although you could argue that, given equal amounts of computation time, both techniques find what appears to be the global optimum, this would be overly simplistic. For stochastic techniques, the important criterion is their average behavior.</entry>

For this example, the probability that any given iteration of the greedy algorithm yields a score greater than or equal to the average simulated annealing result is 6/1583. Thus, 183 iterations of the greedy algorithm afford a 50 percent chance of doing at least as well as simulated annealing (607 iterations for a 90 percent chance). The greedy algorithm, however, converges only 53 times faster in this example than does simulated annealing. Clearly, there is a higher payoff with simulated annealing.

### Final Thoughts

Although I've compared simulated annealing to the well-known greedy algorithm, it would be incorrect to suggest that the latter is the only alternative. Recently, neural networks and various genetic algorithms have been applied to combinatorial optimization problems with some success. If the problems are small enough, then classical integer programming techniques (that is, branch-and-bound) may also prove fruitful. Moreover, when evaluating the material presented here, there are several points to bear in mind. First, not every application of simulated annealing will exhibit better performance than a greedy algorithm to the extent observed in this case, so you should not extrapolate too much from this single example.

Another important consideration is that simulated annealing is very time-consuming, and almost any serious application of this sort is best written in assembly language. The program ANNEAL.C was coded in C solely in the interests of pedagogy and portability.

Finally, both algorithms described in this article are stochastic in nature. Even though the tableau in Figure 1 was subjected to hundreds of simulated annealing runs, there is still no guarantee that the optimum found is, in fact, the global optimum. It's possible that there exists a tableau with a score greater than 4516 but that the path through the configuration space leading to it is so narrow that even the millions of reconfigurations carried out are insufficient to find it.

In this regard, it's interesting that all the tableaus found having scores of 4516 contained only 11 scoring hands, not 12. This being the case, there may be "room" for an even higher score. Better yet, what if a 13th hand were defined, consisting of the 4 corners plus the center card. Would there be any point in adding the extra degree of freedom? This intriguing thought is left as an exercise for DDJ readers.

### Bibliography

Huang, M. D.; Romeo, F.; and Sangiovanni-Vincentelli, S. "An Efficient Cooling Schedule for Simulated Annealing." Proc. Int. Conf. Computer-Aided Design (ICCAD86) (1986): 381 - 384.

Kirkpatrick, S.; Gelatt, C. D. Jr.; and Vecchi, M. P. "Optimization by Simulated Annealing." Science, vol. 220, no. 4598 (1983): 671 - 680.

Metropolis, N.; Rosenbluth, A;, Rosenbluth, M;, Teller, A;, and Teller, E. "Equations of State Calculations by Fast Computing Machines." J. Chemical Physics, vol. 21 (1953): 1087 - 1091.

Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; and Vetterling, W. T. Numerical Recipes, The Art of Scientific Computing. New York: Cambridge University Press, 1986: 326 - 334.

Sechen, C. and Sangiovanni-Vincentelli, A. "TimberWolf 3.2: A New Standard Cell Placement and Global Routing Package." Proc. 23rd Design Automation Conf. (1986): 432 - 439.

### Availability

All source code for articles in this issue is available on a single disk. To order, send \$14.95 (Calif. residents add sales tax) to Dr. Dobb's Journal, 501 Galveston Dr., Redwood City, CA 94063, or call 800-356-2002 (from inside Calif.) or 800-533-4372 (from outside Calif.). Please specify the issue number and format (MS-DOS, Macintosh, Kaypro).</entry>

### ANNEAL.C Data Structures, Functions, and Variables

The playing cards are represented as 2-byte integers. The card values, 2 to Ace (high), are coded as the numbers 2 to 14 and the suits {spades, hearts, diamonds, clubs} are coded as {2048, 1024, 512, 256}, respectively. The tableau is intrinsically a two-dimensional array but is implemented as a one-dimensional array to avoid the computational burden of double indexing (see HAND[12][5] below). ANNEAL.C ignores the symmetry of the tableau with respect to the scoring function because it is, computationally, more trouble than it is worth.

rand1( ) -- takes a 31-bit integer, SEED, and returns another one. This generator yields a full-cycle sequence without repeating any numbers.

uni(z) -- returns a double-precision float in the range (O,z) via rand1( ).

initialize( ) -- reads in the starting data from A.DAT and sets up the data structures, and so on.

init( ) -- resets a number of global counters and variables, chiefly to aid in the collection of the statistics necessary to detect equilibrium.

get_new_temperature( ) --computes a temperature ratio, if necessary, and a new temperature (see Table 2).

check_equilibrium( ) -- uses the code given in the article to detect the equilibrium point.

update( ) -- computes those statistics, which need not be found until equilibrium has been achieved. (see SCALE).

selectwr( ) -- is a general routine to perform N selections without replacement.

reconfigure(direction) -- makes a new configuration, or restores the tableau to its previous configuration if the last move was rejected. (Typically, more than 90 percent of the reconfigurations are rejected.) This routine uses giant steps or baby steps.

get_score(crds) -- is called by evaluate1( ) and returns the score of a single hand as defined in HAND[12][5].

evaluate1( ) -- passes each hand, one by one, to get_score(crds) and tallies up the total SCOR (which becomes the current SCORE, if the reconfiguration is accepted). Along with rand1( ) and reconfigure( ), the functions evaluate1( ) and get_score( ) are all written in MC68020 assembly language in the working code. This code runs at a speed of approximately 2035 attempted reconfigurations/sec. on an AMIGA 1000 equipped with a 68020/68881 chip set, plus 32-bit RAM (a standard Amiga, executing ANNEAL.C, does 70/sec.), and finishes an average simulated annealing run in less than 3 minutes with the results as shown in Figure 4. Without this sort of speed, the long series of experiments needed to obtain good parameter values, and to compare with the greedy algorithm, would have required more patience than this author has ever possessed.

decide( ) -- answers the question "To accept or not to accept?" and uses the formula shown Example 1, equation 2. It also accumulates statistics like INCOUNT and OUTCOUNT that must be computed at every move.

report( ) -- produces the output statistics and the final tableaux.

try_new_temperature( ) -- is a calling function that uses a newly computed temperature. It also checks that the tableau passed on to a new temperature is not a low outlier.

main( ) -- q.v. HAND[12][5] -- stipulates the 12 hands (5 cards each) in the TABLEAU and describes the various hands in terms of the appropriate indices of TABLEAU.

PARTITION[6], OVERLAP[66], HEX [12], CINDEX[25] -- are used to generate a new configuration. PARTITION implements Table 3 in long integers (pre-multiplied by MODULUS). OVERLAP is a triangular array giving the index, in TABLEAU, of the intersection of two HANDs (-1 if no overlap). HINDEX and CINDEX are arrays used to perform selection without replacement and to remember those selections should the new configuration be rejected.

SCORES[18] -- is an implementation of Table 1. After evaluate1( ) has determined the index to SCORES, SCOR is assigned the corresponding value.

TEMPERATURE, T_LOW, T_MIN --is the scaling factor. T_LOW is the temperature below, which the tableau can be tested to see if it is FROZEN or not. If the temperature ever goes below T_MIN, the tableau is automatically FROZEN.

AVG_SCORE, SIGMA -- are the average and standard deviation of the scores of all the tableaux accepted at the current temperature.

SCALE -- is a typical SCORE when the temperature is getting cold. This variable has no function in the algorithm per se, but is incorporated so that SIGMA may be computed using single-precision arithmetic. (This is especially important in the absence of a floating-point chip.) By subtracting off a good score before accumulating the square-of-the-score, the number of digits to be squared is reduced. If this is not done, then a single-precision number will lose too much precision at a time in the annealing process when the average-of-the-squares and the square-of-the-average are nearly equal. When this happens, it is possible to get SIGMA = 0 erroneously. This has a deleterious effect on the program.

T_RATIO -- T(new)/T(old). The initial value (here, 0.7) and 10 other values are listed in A.DAT.

EXG_CUTOFF -- is the ratio Prob (giant step)/Prob(baby step).

SUCC_MIN, INCOUNT_LIMIT, OUT-COUNT_LIMIT, FIRST_LIMIT, ULTIMATE_LIMIT -- are parameters used to determine the equilibrium point.

-- M.M.

### Cellular Automata: A New Way of Simulation

`Jean F. Coppola and Dr. Francis T. Marchese`

Jean Coppola is a master's degree student in computer science and Dr. Francis Marchese is an associate professor in the computer science department at Pace University. They can be reached at Pace University, 1 Pace Plaza, New York, NY 10038.

In the annealing process, glass, certain metals, and alloys are heated and cooled in a controlled way to allow the atoms to arrange themselves into a stable state. Normally, the process of annealing is simulated by means of a set of complex differential equations. Alternatively, annealing can be simulated by an intuitively simpler method known as "cellular automata."

Cellular automata are discrete dynamical systems in which the behavior of the system as a whole is defined in terms of a set of local rules. The simplest and best-known cellular automata is the game of LIFE conceived by John Conway in the early 1970s. LIFE is a computer simulation that consists of a grid of cells upon which a collection of creatures are born, live and die. The life/death process is governed by a neighborhood rule that defines the behavior of any creature at any cell located in terms of the behavior of its neighbors. This is a significant concept because it implies that for collective systems, no member of that collection of entities (or creatures) can exist independently of the rest. Moreover, the rules that govern the system as a whole are only the rules of the local neighborhoods. Hence, for example, in a two-dimensional life game consisting of a 100 x 150 grid, the rules are only defined for each local 3 x 3 matrix! Even though this is the case, the complexity of the life process produced can be quite profound.

As a result, cellular automata has been applied in areas such as physics, chemistry, and computer science. For example, particle dynamics, galactic evolution, molecular kinetics, or parallel distributed processing are all fields where local laws dictate global behavior. At Pace University, we've applied cellular automata to the real-time study of the dynamics of crystal growth. Because of the large number of local interactions that must be accommodated in these simulations, even a mainframe cannot provide the computing power required for real-time visualization. However, assisting in the simulations, is an IBM PC-compatible, high-performance hardware option called CAM-6 (Cellular Automata Machine) that runs under MS-DOS. CAM-6 is a special-purpose coprocessor designed specifically for cellular automata simulations by researchers at the MIT Laboratory for computer science. Its pipelined architecture allows software to run at speeds comparable to that of a CRAY-1, therefore allowing the CAM board to update and display a 256 x 256 array of cells 1/60 of a second, thus producing real-time animation. Each cell in CAM is represented by a pixel, and cells can assume 1 of 16 possible states signified by 16 different colors.

The CAM-6 machine is programmable in a high-level language called "CAM Forth," a special version of Forth-83. Because Forth was originally developed for real-time applications and meets the demands of cellular automata by its code efficiency.

We have used this combination of hardware and software to simulate crystal growth. Presently, we wish to discover the local rules which govern the growth of sodium chloride (common table salt). This is accomplished by programming neighborhoods that work with the four major neighbors on all the planes. The difficult part in programming "rules" is keeping in mind that one cannot change the state of a neighbor but only the cell currently in question. For example, you cannot indicate if a cell has three or more neighbors at state 1, and turn all neighbors at state 0 to 1. Therefore, programming has to be done around this constraint, which at times can be extremely difficult. We have also simulated complex solid liquid equilibria. This involved programming minor neighborhoods which allowed states to be compared from a second CAM machine that was programmed as a random noise generator.

By watching and studying these simulations, we hope to gain insight into the synthesis of crystals and fundamental patterns of molecular growth. The models of cellular automata could one day be a major breakthrough in the understanding of nature's rules, which in turn could affect every aspect of life.

--J.C., F.T.M.

```

/* ANNEAL.C -- Author: Dr. Michael P. McLaughlin
*/
#include <stdio.h>
#include <math.h>

#define ABS(x) ((x<0)?(-(x)):(x))
#define BOOLEAN int
#define TRUE 1
#define FALSE 0
#define FORWARD 1
#define BACK 0

struct cardtype {
char card[3];
int code;
} cards[25];
float ratios[10][2];
int tableau[25],best_tableau[25];
float temperature,best_temperature,t_low,t_min,t_ratio;
long seed,score,best_score,tot_successes,tot_failures,scor,exg_cutoff,
report_time,report_interval,modulus = 2147483399,step = 1;
int next;
BOOLEAN equilibrium,frozen = FALSE,quench=FALSE,final=FALSE;

/* variables needed to assess equilibrium */
float totscore,totscore2,avg_score,sigma,half_sigma,score_limit;
long bscore,wscore,max_change,nsuccesses,nfailures,scale,
incount,outcount;
int incount_limit,outcount_limit,succ_min,first_limit,ultimate_limit;

/* poker hands in tableau */
int hand[12][5] = {0,1,2,3,4,      /* rows */
5,6,7,8,9,
10,11,12,13,14,
15,16,17,18,19,
20,21,22,23,24,
0,5,10,15,20,   /* columns */
1,6,11,16,21,
2,7,12,17,22,
3,8,13,18,23,
4,9,14,19,24,
0,6,12,18,24,   /* diagonals */
4,8,12,16,20};
/* 0,4,12,20,24     = corners */

long rand1()
/* Get uniform 31-bit integer -- Ref. CACM, June 1988, pg. 742 */
{
register long k;
k = seed/52774;
seed = 40692*(seed-k*52774)-k*3791;
if (seed < 0)
seed += modulus;
return(seed);
}
double uni(z)
int z;
{
return((double) z*rand1()/modulus);
}
void initialize()
/* Set up entire simulated annealing run. */
{
char label[20];
double exgc;
register i;
FILE *fp;
fp = fopen("a.dat","r");
fscanf(fp,"%f %s\n",&temperature,label);
fscanf(fp,"%f %s\n",&t_low,label);
fscanf(fp,"%f %s\n",&t_min,label);
fscanf(fp,"%f %s\n",&avg_score,label);
fscanf(fp,"%ld %s\n",&scale,label);
fscanf(fp,"%f %s\n",&t_ratio,label);
fscanf(fp,"%f %s\n",&sigma,label);
fscanf(fp,"%lf %s\n",&exgc,label);
fscanf(fp,"%d %s\n",&succ_min,label);
fscanf(fp,"%d %s\n",&incount_limit,label);
fscanf(fp,"%d %s\n",&outcount_limit,label);
fscanf(fp,"%d %s\n",&first_limit,label);
fscanf(fp,"%d %s\n",&ultimate_limit,label);
fscanf(fp,"%ld %s\n",&report_interval,label);
fscanf(fp,"%ld %s\n",&seed,label);
half_sigma = 0.5*sigma; report_time = report_interval;
score_limit = 20; exg_cutoff = modulus*exgc;
for (i=0;i<10;i++)
fscanf(fp,"%f %f\n",&ratios[i][0],&ratios[i][1]);
for (i=0;i<25;i++) {
fscanf(fp,"%d %s\n",&cards[i].code,cards[i].card);
tableau[i] = cards[i].code;
}
fclose(fp);
printf("                   TOTAL     TOTAL  CURRENT AVERAGE\n");
printf("STEP TEMPERATURE SUCCESSES FAILURES  SCORE   SCORE ");
printf(" SIGMA\n\n");
}
void init()
/* set up for new temperature */
{
nsuccesses = 0; nfailures = 0; equilibrium = FALSE; incount = 0;
outcount = 0; bscore = 0; wscore = 100000; max_change = 0;
totscore = 0; totscore2 = 0;
if (temperature < t_min)
frozen = TRUE;
}
void get_new_temperature()
{
if (temperature < ratios[next][0]) {
t_ratio = ratios[next][1];
next++;
}
temperature = t_ratio*temperature;
}
void check_equilibrium()
/* Determine whether equilibrium has been reached. */
{
if ((nsuccesses+nfailures) > ultimate_limit)
equilibrium = TRUE;
else if (nsuccesses >= succ_min) {
if (incount > incount_limit)
equilibrium = TRUE;
else {
if (outcount > outcount_limit) {
if (nsuccesses > first_limit)
equilibrium = TRUE;
else {
incount = 0;
outcount = 0;
}
}
}
}
}
void update()
/* Compute statistics, etc. */
{
float ascore,s;
if (nsuccesses > 0) {
ascore = totscore/nsuccesses;
avg_score = ascore+scale;
s = totscore2/nsuccesses-ascore*ascore;
if (s > 0.0) {
sigma = sqrt(s); half_sigma = 0.5*sigma;
score_limit = avg_score-half_sigma;
}
}
if ((temperature < t_low)&&((bscore-wscore) == max_change))
frozen = TRUE;
}
void selectwr(array,mode,nchoices)
/* Select from array without replacement. */
int *array,mode,nchoices;
{
int i,temp,size,index;
size = mode==1 ? 12 : 25;
for (i=0;i<nchoices;i++) {
index = uni(size);
temp = array[--size];
array[size] = array[index];
array[index] = temp;
}
}
void reconfigure(direction)
/* If direction is FORWARD, make a new move; if it is BACK, restore
the tableau to its previous configuration. */
int direction;
{
static long partition[6] = {0,0,715827799,1296031795,1766307855,
2147483400};
static int hindex[12] = {0,1,2,3,4,5,6,7,8,9,10,11};
static int cindex[25] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,
14,15,16,17,18,19,20,21,22,23,24};
static int overlap[66] = {-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,0,5,10,15,
20,1,6,11,16,21,-1,2,7,12,17,22,-1,-1,3,
8,13,18,23,-1,-1,-1,4,9,14,19,24,-1,-1,-1,
-1,0,6,12,18,24,0,6,12,18,24,4,8,12,16,20,
20,16,12,8,4,12};
static int mode,nchoices,common,last;   /* remember changes */
int temp,hi,lo,i,j,k;
if (direction == FORWARD) {
if (rand1() < exg_cutoff) {   /* giant step */
mode = 1; nchoices = 2;
selectwr(hindex,mode,nchoices);
hi = hindex[11]; lo = hindex[10];   /* order hand indices */
if (hi < lo) {
hi = lo;
lo = hindex[11];
}
common = overlap[hi*(hi-1)/2+lo];   /* triangular matrix */
}
else {   /* baby step */
mode = 0; nchoices = 2;
rand1();   /* How many cards to rotate? */
while (seed > partition[nchoices])
nchoices++;
selectwr(cindex,mode,nchoices);
temp = tableau[cindex[24]];   /* rotate forward */
for (i=1,j=24;i<nchoices;i++,j--)
tableau[cindex[j]] = tableau[cindex[j-1]];
tableau[cindex[j]] = temp;
last = j;
}
}
if (mode == 1) {   /* swapping hands is symmetrical */
register first,second,c;
first = hindex[11]; second = hindex[10];
k = (common<0) ? 5 : 4;
for (c=0,i=0,j=0;c<k;c++,i++,j++) {
if (hand[first][i] == common)
i++;
if (hand[second][j] == common)
j++;
temp = tableau[hand[first][i]];
tableau[hand[first][i]] = tableau[hand[second][j]];
tableau[hand[second][j]] = temp;
}
}
else if (direction == BACK) {   /* rotation is not */
temp = tableau[cindex[last]];
for (i=1,j=last;i<nchoices;i++,j++)
tableau[cindex[j]] = tableau[cindex[j+1]];
tableau[cindex[24]] = temp;
}
}
long get_score(crds)
/* Return score of one hand. */
int *crds;
{
static long scores[18] = {0,1,1,20,1,9,20,1760,1,9,9,
293,20,293,1760,108,215,27456};
int flush,i,index = 0;
flush = crds[0]&crds[1]&crds[2]&crds[3]&crds[4]&0xff00;
for (i=0;i<5;i++)
crds[i] = crds[i]&0xff;   /* ignore suits */
{
register b,k,t;   /* sort cards from high to low */
for (b=4;b>0;b--)
for (k=0;k<b;k++)
if (crds[k] < crds[k+1]) {
t = crds[k];
crds[k] = crds[k+1];
crds[k+1] = t;
}
}
if (!flush) {   /* multiple cards ? */
if (crds[0] == crds[1]) index += 8;
if (crds[1] == crds[2]) index += 4;
if (crds[2] == crds[3]) index += 2;
if (crds[3] == crds[4]) index += 1;
}
if (!index) {   /* straight ? */
if (((crds[0]-crds[4]) == 4)||((crds[0] == 14)&&(crds[1] == 5)))
index = 15;
if (flush)
index = (index == 15) ? 17 : 16;
}
return(scores[index]);
}
long evaluate1()
{
int temp[5],h,c;
long scr = 0;
for (h=0;h<12;h++) {   /* h<13 for corners */
for (c=0;c<5;c++)
temp[c] = tableau[hand[h][c]];
scr += get_score(temp);
}
return(scr);
}
void decide()
/* Either accept new configuration or restore old configuration */
{
float pdt1,pdt2,sscor;
long s;
BOOLEAN acceptable = FALSE;
scor = evaluate1();
if (scor >= score) {
if (scor > best_score) {
register i;
best_score = scor;
best_temperature = temperature;
for (i=0;i<25;i++)
best_tableau[i] = tableau[i];
}
acceptable = TRUE;
}
else if (temperature > 0.0) {   /* uphill movement? */
if (uni(1) < exp((scor-score)/temperature))   /* Equation 2 */
acceptable = TRUE;
}
if (acceptable) {   /* statistics, etc. */
s = ABS(score-scor);
if (s > max_change)
max_change = s;
score = scor;
sscor = scor-scale;   /* to aid precision of sigma */
totscore += sscor;
totscore2 += sscor*sscor;
nsuccesses++;
if (ABS(avg_score-score) < half_sigma)
incount++;
else
outcount++;
if (score > bscore)   /* maximization */
bscore = score;
else if (score < wscore)
wscore = score;
}
else {   /* unacceptable */
reconfigure(BACK);
nfailures++;
}
}
void report()
{
tot_successes += nsuccesses; tot_failures += nfailures;
printf("%3ld %10.1f %9ld %9ld %7ld %7.0f  %5.1f\n",step,temperature,
tot_successes,tot_failures,score,avg_score,sigma);
report_time += report_interval;
if (frozen) {
int temp[25],k,kk;
register i,j;
BOOLEAN ok;
if (final)
printf("\nFINAL -- ");
else if (quench)
printf("\nQUENCH -- ");
else
printf("\nFROZEN -- ");
printf("BEST SCORE = %5ld  BEST TEMPERATURE = %5.1f\n\n",
best_score,best_temperature);
for (i=0;i<25;i++) {
k = best_tableau[i];
j = 0;
ok = FALSE;
while (!ok) {
if (cards[j].code == k) {
temp[i] = j;
ok = TRUE;
}
else j++;
}
}
for (i=0;i<25;i+=5) {
for (j=i;j<(i+5);j++) {
k = 4-strlen(cards[temp[j]].card);
for (kk=0;kk<k;kk++) printf(" ");
printf("%s",cards[temp[j]].card);
}
printf("\n");
}
printf("\n");
}
}
void try_new_temperature()
{
init();
while (!equilibrium) {
reconfigure(FORWARD);
decide();
check_equilibrium();
}
update();
if (!quench) {   /* ratchet */
while ((float) score < score_limit) {   /* maximization */
reconfigure(FORWARD);
decide();
}
}
}
main()
{
initialize();
while (!frozen) {
try_new_temperature();
update();
if ((step == report_time)||(frozen))
report();
step++;
if (temperature > 0.0)
get_new_temperature();
}
/* Quench frozen configuration. */
temperature = 0;
quench = TRUE;
try_new_temperature();
update();
report();
step++;
/* Quench best configuration (rarely useful). */
{
register i;
final = TRUE;
for (i=0;i<25;i++)
tableau[i] = best_tableau[i];
}
try_new_temperature();
update();
report();
printf("SEED = %ld\n",seed);   /* seed of next run */
}

```
[LISTING TWO]
```

if ((nsuccesses+nfailures) > ULTIMATE_LIMIT)
equilibrium = true;
else if (nsuccesses >= SUCC_MIN) {
if (incount > INCOUNT_LIMIT)
equilibrium = true;
else {
if (outcount > OUTCOUNT_LIMIT) {
if (nsuccesses > FIRST_LIMIT)
equilibrium = true;
else {
incount = 0;
outcount = 0;
}
}
}
}

```
[LISTING THREE]
```

2150 temperature
1.5 t_low
0.1 t_min
40 avg_score
4400 scale
0.7 t_ratio
150 sigma
0.2 exg_cutoff
200 succ_min
250 incount_limit
400 outcount_limit
2700 first_limit
10000 ultimate_limit
1 report_interval
1234567890 seed
360 0.8
215 0.7
85 0.8
60 0.9
30 0.95
15 0.9
7 0.8
3 0.7
0 0.7
0 0.7
1032 8H
2061 KS
270 AC
522 10D
1029 5H
265 9C
1035 JH
1037 KH
525 KD
515 3D
263 7C
2058 10S
2062 AS
518 6D
1036 QH
267 JC
259 3C
2053 5S
269 KC
520 8D
1028 4H
1038 AH
519 7D
1026 2H

[EXAMPLE 1]

Equation 1:  N /su/hi//N /su/lo/ = exp((E lo-E hi)/kT)

Equation 2:  Prob(accept worse score) = exp((S /su/lo/ -S /su/hi/)/T)

Example 1: Equations that describe the Boltzman distribution

```