A Multiple-Population Evolutionary Approach To Gate Matrix Layout

International Journal of Systems Science, volume 35, number 1, 15 January 2004, pages 13–23

A. MENDES† and A. LINHARES‡*
This paper deals with a Very-Large-Scale Integrated systems design problem that belongs to the NP (Nondeterministic Polynomial)-hard class. The Gate Matrix Layout problem has numerous applications in the chip-manufacturing industry and in other industrial settings. A memetic algorithm is employed to solve a set of benchmark instances, and numerical comparisons with a highly competitive method, a microcanonical optimization approach, are performed. Beyond the effectiveness of the method, shown by the results obtained for these instances, an additional goal of this work is to study how the performance of the algorithm is affected by the use of multiple populations and of different individual-migration policies between such populations. The results signal a strong performance improvement of multiple populations over single-population approaches. Finally, the proposed algorithm presents several refinements, like structured populations and a specially tailored local search.
Received 29 April 2002. Revised 12 September 2003. Accepted 3 December 2003.
† School of Electrical Engineering and Computer Science, University of Newcastle, Callaghan, NSW, 2308, Australia. e-mail: mendes@cs.newcastle.edu.au
‡ EBAPE/FGV, Praia de Botafogo, 190, 22.250-900 Rio de Janeiro, RJ, Brazil.
* To whom correspondence should be addressed. e-mail: linhares@fgv.br
International Journal of Systems Science ISSN 0020-7721 print/ISSN 1464-5319 online © 2004 Taylor & Francis Ltd http://www.tandf.co.uk/journals DOI: 10.1080/00207720310001657054

1. Introduction
With applications ranging from fields as distinct as fuzzy modelling (Xiong 2001), autonomous robot behaviour (Luk et al. 2001), back-propagation learning (Foo et al. 1999), and multicriteria optimization (Viennet et al. 1996), evolutionary methods have become an indispensable tool for systems scientists. In this arena, an interesting emerging issue is the use of multiple populations, which is gaining increased momentum from the conjunction of two technologies. On the hardware side, computer networks, multiprocessor computers and distributed processing systems (such as workstation clusters) are becoming increasingly widespread. On the software side, the introduction of PVM (Parallel Virtual Machine), and later MPI (Message Passing Interface), as well as web-enabled object-oriented languages (such as Java), has also played a role. As most evolutionary algorithms (EAs) are inherently parallel methods, the distribution of the tasks is relatively easy for most applications. The workload can be distributed at an individual or a population level, the final choice depending on the complexity of the computations involved. In this work, we do not use parallel computers or networks of workstations. The proposed algorithm runs sequentially on a single processor, but populations evolve separately, simulating the behaviour of a parallel environment.

Species evolve in nature grouped in populations, with boundaries defined by specific features like distance or geographical barriers. The role of closed (or nearly closed) populations in evolution is extremely important. Consider, for instance, the Galapagos Islands, which famously inspired Darwin's ideas during his voyage aboard the HMS Beagle (Darwin 1993). A set of islands separated by several kilometres of water can be colonized by a single species of birds. In the beginning of such colonization, all animals will share the same characteristics (and genetic pool), but as evolution takes place, the groups concentrated in each island will start to differentiate by adapting themselves to the particular characteristics present in each island (Weiner 1995). This independent adaptation may lead eventually to the emergence of different species, after a sufficient number of generations, given that very little or no migration exists between the islands. Even if the islands share the same characteristics, different species might arise, due to the relative isolation and the genetic drift phenomenon (Weiner 1995). Clearly, if two similar populations are separated and submitted to equal conditions, due to the random nature of the processes involved in evolution, they may still follow different evolutionary paths and become different species after a large number of generations.

Analogously, it is not unusual in EAs to see that if the same algorithm is run twice, it may generate different final solutions. Usually, this is viewed as a setback, but it can be very useful when multiple populations are used. With several populations evolving in parallel, larger portions of the search space can be sampled, and any particularly important genetic information found can be spread among them through migration of individuals. This mechanism makes the parallel search potentially more powerful than when a single population is employed (Cantu-Paz 1999, 2000).

2. A VLSI optimization problem: Gate Matrix Layout
The Gate Matrix Layout problem is an NP-hard problem (Lengauer 1990, Linhares et al. 1999, Nakatani et al. 1986) that arises in the context of the physical layout of Very-Large-Scale Integration (VLSI) systems. There are numerous stages in the design of VLSI systems, the last being the physical design stage, in which the underlying logic function has been previously transformed to a set of interconnected wires. The architecture known as gate matrices (Wing et al. 1985; Mohring 1990) is part of a set of problems in which the physical design demands a linear arrangement of gates to minimize the circuit area (and thus minimize production costs and maximize performance).

Mathematically, it can be stated as follows: suppose that there are g gates (vertical wires) and n nets (horizontal wires) on a gate matrix layout circuit. Gates can be described as vertical wires holding transistors at specific positions, with nets interconnecting all the distinct gates that share transistors at the same position. An instance can be represented as a 0–1 matrix, with g columns and n rows. A value of 1 in position (i, j) means that a transistor must be implemented at gate i and net j; 0 means that no such connection is required. Two characteristics are fundamental to the problem. First, all transistors in the same net must be interconnected. Second, the sequence of the gates does not alter the underlying circuit logic and is, thus, amenable to optimization of the design. As long as a specific wire connects all transistors in the same net, the circuit logic is stable. However, different net wires cannot overlap, and should two different nets share the same gate, they must be implemented over two separate physical tracks, adding to the overall circuit size. This superposition of interconnections defines the number of tracks needed to build the circuit. The mathematical objective is to find a permutation of the g columns so that the superposition of interconnections is minimal, thus minimizing the number of required physical tracks and the overall circuit area (the number of gates is fixed, so the circuit area is proportional to the number of tracks). Numerous nets may share the same track, if there is no wire superposition between them.

Figure 1 shows, clockwise from top left, an instance of the problem, described by the net-gate matrix. In this particular (identity) sequence of gates, four tracks are required: the reader may notice that gate 2 requires different tracks for nets 1, 4, and 5. Moreover, net 2 demands the fourth track because its wire runs through from gate 1 to gate 7. The following gate sequence changes this in a significant manner, as nets 2 and 5 and nets 3 and 4 do not overlap and thus can share tracks. The layout then requires only three tracks.
Figure 1. Translation from a given instance's solution into the real circuit.
If, at each net and for each gate sequence, we change the values between the leftmost and rightmost 1s from 0 to 1, then the column sum will give us the required number of tracks for each gate. The required number of tracks for the whole circuit will be given by the maximum column sum, and such is the function intended for minimization here. A final note: the process of mapping the nets from a particular gate sequence to a specific track assignment is computable in polynomial time by the so-called left-edge algorithm (Hashimoto and Stevens 1971). In the example, the permutation of the columns is ⟨2-4-3-1-5-7-6⟩. After the interconnection of all transistors, represented by the horizontal lines, we calculate the number of tracks needed to build each gate. This number is given by the sum of positions used in each column, and the number of tracks required to build the circuit is its maximum. In the example, this value is 3. The lower-left diagram shows the circuit layout after the grouping of the connections, and the use of only three tracks. More detailed information on this problem, including other industrial settings in which it arises, can be found in Linhares (1999), Linhares et al. (1999), Linhares and Yanasse (2002a), and Yanasse (1997). The reader should be aware that this is not just a regular NP-hard problem: it was in fact the first problem identified as being fixed-parameter tractable, and this result eventually led to the creation of a new, large class of problems falling under the label FPT, for fixed-parameter tractability (Downey and Fellows 1995, Fellows and Langston 1987). In the next section, we present a new memetic algorithm for the gate matrix layout problem.
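The objective described in Section 2 (fill each net between its leftmost and rightmost transistor, then take the maximum column sum) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `track_count`, the `matrix[net][gate]` indexing, the 0-based gate indices, and the small three-net instance are all assumptions made for the example.

```python
from typing import Sequence


def track_count(matrix: Sequence[Sequence[int]], order: Sequence[int]) -> int:
    """Tracks needed when the gates (columns) are laid out in `order`.

    matrix[net][gate] == 1 means the net needs a transistor at that gate
    (assumed layout). `order` lists 0-based gate indices from left to right.
    """
    column_sums = [0] * len(order)
    for row in matrix:
        # Layout positions at which this net has a transistor.
        used = [pos for pos, gate in enumerate(order) if row[gate] == 1]
        if not used:
            continue  # a net without transistors occupies no track
        # The net's wire covers every position between its two extremes.
        for pos in range(used[0], used[-1] + 1):
            column_sums[pos] += 1
    return max(column_sums, default=0)


# Hypothetical instance with 3 nets and 4 gates (not the one in figure 1).
example = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
]
identity_tracks = track_count(example, [0, 1, 2, 3])   # 3 tracks
reordered_tracks = track_count(example, [2, 1, 0, 3])  # 2 tracks
```

In this toy instance, reordering the gates lowers the maximum column sum from three to two, which is the kind of improvement the search procedures of Section 3 look for.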
3. Memetic algorithms
Since the publication of John Holland's book, Adaptation in Natural and Artificial Systems, the field of Genetic Algorithms (GA) and the broader field of Evolutionary Computation were clearly established as new research areas. However, other pioneering works could also be cited, as they became increasingly conspicuous in many engineering fields and in industrial problems. In the mid-1980s, a new class of knowledge-augmented GAs, also called hybrid GAs, started to appear in the computer science and engineering literature. The main idea supporting these methods is that of making use of other forms of knowledge, i.e. other solution methods already available for the problem at hand. As a consequence, the resulting algorithms had little resemblance to biological evolution analogies. Recognizing important differences and similarities with other population-based approaches, some of them were categorized as memetic algorithms (MAs) in 1989 (Moscato 1989, Moscato and Norman 1992), emanating from the term meme introduced by Dawkins (1976). The field of cultural evolution was suggested as being more relevant, as a working metaphor, to understand the performance and find inspiration sources to improve these new methods. The main features of the implemented MA are described in the following.

3.1. Population structure
It is illustrative to show how some MAs resemble more the cooperative problem-solving techniques that can be found in some organizations. For instance, in our approach, we use a hierarchically structured population based on a complete ternary tree. In contrast with a non-structured population, the complete ternary tree can also be understood as a set of overlapping sub-populations (which we will refer to as clusters). The use of this population structure, together with a recombination scheme that selects parents always belonging to the same cluster, introduces a multiple-population character within each population (see Section 3.5). The choice of the ternary tree structure was based mainly on empirical aspects. The first is motivated by the fact that any hierarchical tree behaves like a set of overlapping clusters, as said before. Therefore, the dynamics are similar to several populations evolving in parallel (each cluster acts as an independent population) and exchanging individuals at a given rate. This exchange of individuals comes as a consequence of the tree-restructuring phase, carried out to maintain a specific hierarchical consistency (see Section 3.6). Now, consider the use of trees with other degrees of complexity. A binary tree-based population, for instance, would be formed by three-individual clusters only, with only two recombination possibilities. This would degrade the multiple-population character of the tree structure. Trees of greater order (quaternary or more) increase the multiple-population character, but initial tests indicated that the performance does not improve at all, and moreover, the number of individuals rapidly jumps to prohibitive levels in terms of computational effort requirements. The best trade-off points to the selected ternary tree structure. In figure 2, we can see that each cluster consists of one single leader and three supporter agents. Any leader agent in an intermediate layer has both leader and supporter roles. The leader agent always contains the best solution (considering the number of tracks) of all agents in the cluster. The number of agents in the population is equal to the number of nodes in the ternary tree, i.e. we need 13 individuals to make a ternary tree with three levels and 40 individuals to have four levels. For this work, we fixed the population
Figure 2. Population structure.
Figure 3. Block Order Crossover (BOX) example.
size to be 13. This value might seem too low at first glance, but after tests with 40 and 121 individuals, we concluded that 13 individuals are sufficient to make the algorithm keep its convergence speed under control. The use of 40 or more individuals does not deteriorate the algorithm's behaviour, but the computational effort increases considerably, as well as the CPU time. We must emphasize that the use of structured populations allows a reduction in population size without any loss of search power. In comparison, non-structured populations would require more than 100 individuals to achieve a similar performance, as reported in other works (Franca et al. 2001, Mendes et al. 2002).

3.2. Representation and crossover
Representation of the problem is quite intuitive. A solution is represented as a chromosome in which the alleles assume different integer values in the [1, n] interval, where n is the number of columns of the associated matrix. These values define the sequence (permutation) of the gates. The crossover tested is a variant of the well-known Order Crossover (OX) called Block Order Crossover (BOX). After choosing two parents, several fragments of the chromosome from one of them are randomly selected and copied into the offspring. In the second phase, the offspring's empty positions are sequentially filled according to the chromosome of the other parent. The BOX resembles the second variant of the OX crossover presented in Syswerda (1991). The procedure tends to perpetuate the relative order of the columns, although some alterations might appear (see figure 3). In figure 3, Parent A contributes two pieces of its chromosome to the offspring. These parts are copied to the same positions they occupy in the parent. The blank spaces are then filled with the information of Parent B, going from left to right. The values in Parent B already present in the offspring are skipped, with only the new ones being copied. The percentage to be contributed from each parent is set at 50%.
This means that the offspring will be created from information inherited in equal proportion from both parents. In each generation, we create 26 new solutions, twice the number of agents present in the population. This crossover rate, higher than that in other EAs, is due to the offspring acceptance policy. The acceptance rule means that several new individuals are discarded, in a high-infant-mortality selection process. Thus, after several tests, with values varying from 0.5 to 2.5, we decided to use 2.0. The insertion of new solutions in the population is discussed later (Section 3.6).
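The two phases of the BOX operator described in Section 3.2 can be sketched as below. This is a hedged reading of the text, not the authors' code: the function name, the `n_blocks` and `share_a` parameters, and the way block positions are drawn are assumptions, and randomly drawn blocks may overlap or merge, so the number of distinct fragments can differ from `n_blocks`.

```python
import random
from typing import List, Optional, Sequence


def block_order_crossover(
    parent_a: Sequence[int],
    parent_b: Sequence[int],
    n_blocks: int = 2,
    share_a: float = 0.5,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Sketch of BOX: copy blocks of parent A in place, fill the rest from B."""
    rng = rng or random.Random()
    n = len(parent_a)
    child: List[Optional[int]] = [None] * n
    quota = round(share_a * n)            # positions inherited from parent A
    block_len = max(1, quota // n_blocks)
    while quota > 0:
        start = rng.randrange(n)
        # Phase 1: copy a block of parent A into the same positions.
        for pos in range(start, min(n, start + block_len)):
            if child[pos] is None and quota > 0:
                child[pos] = parent_a[pos]
                quota -= 1
    taken = {v for v in child if v is not None}
    # Phase 2: fill the gaps left to right with parent B's unused values.
    fill = (v for v in parent_b if v not in taken)
    return [v if v is not None else next(fill) for v in child]


rng = random.Random(1)
a = [1, 2, 3, 4, 5, 6, 7]
b = [3, 7, 1, 6, 2, 5, 4]
child = block_order_crossover(a, b, rng=rng)  # always a permutation of a
```

With `share_a=0.5`, about half of the positions keep parent A's absolute placement, while the remaining genes preserve the relative order they have in parent B.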
3.3. Mutation
In our implementation, a traditional mutation strategy based on the swapping of columns was implemented. Two positions are selected uniformly at random, and their values are swapped. This mutation procedure is applied (on average) to 10% of all new individuals every generation. We have also implemented a heavy mutation procedure that executes the column-swap move 10·g times on each individual, except the best one. This procedure is executed every time the population diversity is considered to be low, i.e. the population has converged to individuals that are too similar (see Section 3.6).
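A minimal sketch of the two mutation procedures of Section 3.3 follows. The function names are hypothetical, and drawing two distinct positions is an assumption (the text only says that two positions are selected uniformly at random).

```python
import random
from typing import List


def swap_mutation(perm: List[int], rng: random.Random) -> None:
    """Swap the values at two distinct, uniformly chosen positions (in place)."""
    i, j = rng.sample(range(len(perm)), 2)
    perm[i], perm[j] = perm[j], perm[i]


def light_mutation(perm: List[int], rng: random.Random, rate: float = 0.10) -> None:
    """Apply one swap to roughly 10% of the new individuals."""
    if rng.random() < rate:
        swap_mutation(perm, rng)


def heavy_mutation(population: List[List[int]], best_index: int,
                   rng: random.Random) -> None:
    """Apply 10 * g swaps to every individual except the best one."""
    for index, perm in enumerate(population):
        if index == best_index:
            continue  # the best individual is preserved untouched
        for _ in range(10 * len(perm)):  # g = number of gates = len(perm)
            swap_mutation(perm, rng)
```

Keeping the best individual out of the heavy mutation is what later allows it to migrate to neighbouring populations with its genetic information intact.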
3.4. Local search
Local search algorithms for combinatorial optimisation problems generally rely on a neighbourhood definition that establishes a relationship between solutions in the configuration space. In this work, two neighbourhood definitions were chosen. The first one was the all-pairs neighbourhood. This consists of swapping pairs of columns from a given solution. A hill-climbing algorithm can be defined by reference to this neighbourhood; i.e. starting with an initial permutation of all columns, every time a proposed swap reduces the number of tracks utilized, it is confirmed, and another cycle of swaps takes place, until no further improvement can be achieved. The second neighbourhood implemented was the insertion neighbourhood. This involves removing a column from one position and inserting it in another position (which could include any point between a pair of gates, or the leftmost or rightmost extremes of the permutation).
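The hill-climbing procedure over the all-pairs neighbourhood can be sketched as below. This is an illustrative sketch only: the function name is hypothetical, the cost function is passed in as a parameter (for this problem it would be the number of tracks), and the insertion neighbourhood is not covered. The toy cost used in the example is a stand-in, not the track count.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple


def hill_climb_swaps(
    order: Sequence[int],
    cost: Callable[[List[int]], int],
) -> Tuple[List[int], int]:
    """Accept every improving swap; stop when a full cycle brings no gain."""
    current = list(order)
    best = cost(current)
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(len(current)), 2):
            current[i], current[j] = current[j], current[i]
            value = cost(current)
            if value < best:
                best = value        # the swap is confirmed
                improved = True
            else:
                # Undo a swap that does not reduce the cost.
                current[i], current[j] = current[j], current[i]
    return current, best


# Toy cost (total displacement from the identity), used only for illustration.
displacement = lambda p: sum(abs(v - i) for i, v in enumerate(p))
result = hill_climb_swaps([2, 0, 1], displacement)  # ([0, 1, 2], 0)
```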
The hill-climbing iterative procedure is the same regardless of the neighbourhood definition. For small instances, it is possible to evaluate every swap or insertion possibility, since the complexity remains low. As the instance size increases, however, it becomes imperative to implement neighbourhood reductions. For this problem, a simple neighbourhood-reduction scheme is to test only the k nearest neighbours of each column for swap and insertion movements. For instance, if k = 20, each column will be tested for swap and insertion with its 10 neighbours to the left and its 10 neighbours to the right. This procedure was implemented and greatly reduced the number of possibilities and the associated computational effort. Nevertheless, since the standard problem instances can have up to 141 gates (matrix columns), the complexity was still too large, and reducing the k value had a dramatic impact on the algorithm's performance. The minimum k values necessary to reach optimal solutions for the larger instances were still too high, making the search process too time-consuming. A complementary procedure was then tailored. In this paper, we introduce a new local search reduction scheme, which discards useless swap and insertion attempts. The reduction is based on the position of the so-called critical columns. Critical columns define the maximum number of tracks required by the solution. Figure 4 shows an example identifying such columns. The critical column is given by gate number 2, requiring six tracks to be implemented. The local search reduction policy works by prohibiting any exchange or insertion that cannot affect the critical column. In the example above, movements involving any pair of columns extracted from the set {1, 8, 7, 4, 9} are not allowed, since they cannot decrease the number of tracks required by column 2. Likewise, movements between columns belonging to the set {3, 6, 5} are also prohibited.
In the example, the number of possibilities of swaps and insertions without reduction is 108. Under the reduction scheme, this number drops to 49. If there is more than one critical column, the prohibited movements are those for which both columns belong to the region before the far-left critical column, or both belong to the region after the far-right one. As the computational complexity of the local search neighbourhoods is O(n²), the reduction becomes particularly appealing in the larger instances. The motivation of such a scheme is that any exchange between columns that does not affect a critical column improves the number of tracks only locally, i.e. between the columns being moved, not globally. The reduction strongly improved the MA performance, making it reach better values with a much lower number of individual evaluations. The use of critical-column information has also allowed an increase in the number of nearest neighbours to be tested, i.e. the k value. This ensured a better general performance, consistently surpassing the previous approach to the problem (see below).

The use of a local search in this problem is critical and had to be thoroughly studied. At first, we tested the application of the local search to all new individuals created. The resulting algorithm's performance did not improve considerably, despite the additional requirement for CPU time. An additional experiment applied a local search only to a small percentage of the new individuals. The results improved a little but were still disappointing, regardless of the percentage utilized (tests ranged between 10 and 90% of new individuals, in steps of 10%). The very best results were obtained when the local search was applied only to the best individual of the population, after its convergence (the definition of population convergence is given in Section 3.6). The use of only one local search at the end of an evolutionary cycle (from the population creation until its convergence) has apparently balanced the exploration/exploitation characters of the algorithm better.
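The two neighbourhood reductions of Section 3.4 (the k-nearest-neighbour window and the critical-column filter) can be combined in a small move generator. The sketch below is one possible reading of the text: the function names are hypothetical, positions are 0-based, and only unordered position pairs are enumerated, so it does not attempt to reproduce the move counts quoted for figure 4.

```python
from typing import Iterator, Sequence, Tuple


def critical_span(column_sums: Sequence[int]) -> Tuple[int, int]:
    """Positions of the leftmost and rightmost critical columns."""
    peak = max(column_sums)
    critical = [pos for pos, total in enumerate(column_sums) if total == peak]
    return critical[0], critical[-1]


def candidate_moves(n: int, left: int, right: int, k: int) -> Iterator[Tuple[int, int]]:
    """Yield position pairs (i, j), i < j, that survive both reductions.

    k-nearest window: j is at most k // 2 positions away from i.
    Critical-column filter: skip pairs lying entirely before the leftmost
    critical column or entirely after the rightmost one.
    """
    half = k // 2
    for i in range(n):
        for j in range(i + 1, min(n, i + half + 1)):
            if j < left or i > right:
                continue  # the move cannot affect any critical column
            yield i, j


# Nine positions with one critical column at position 5
# (five columns to its left and three to its right).
left, right = critical_span([1, 2, 3, 4, 5, 6, 3, 2, 1])
allowed = list(candidate_moves(9, left, right, k=16))
```

In this hypothetical nine-column layout, 13 of the 36 unordered pairs are filtered out, leaving 23 candidate pairs to evaluate.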
3.5. Selection for recombination
The recombination of solutions in the hierarchically structured population can only involve a leader and one of its supporters within the same cluster. The recombination procedure selects any leader uniformly at random, and then it chooses, also uniformly at random, one of the three supporters. As mentioned before, the use of this selection scheme and the population structure introduces a multiple-population character within each population. The dynamics of the population are similar to several populations evolving in parallel and exchanging individuals when the tree restructuring takes place (see Section 3.6).
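The selection rule of Section 3.5 can be sketched on an array-based complete ternary tree. The array layout (node i has children 3i + 1, 3i + 2 and 3i + 3) and all function names are assumptions made for illustration; the paper does not state how the tree is stored.

```python
import random
from typing import List, Tuple


def tree_size(levels: int) -> int:
    """Number of nodes in a complete ternary tree with the given levels."""
    return (3 ** levels - 1) // 2


def children(node: int, size: int = 13) -> List[int]:
    """Supporters of `node` in the assumed array layout."""
    return [c for c in (3 * node + 1, 3 * node + 2, 3 * node + 3) if c < size]


def leaders(size: int = 13) -> List[int]:
    """Nodes that head a cluster, i.e. that have supporters."""
    return [node for node in range(size) if children(node, size)]


def select_parents(rng: random.Random, size: int = 13) -> Tuple[int, int]:
    """Pick a leader uniformly, then one of its supporters uniformly."""
    leader = rng.choice(leaders(size))
    supporter = rng.choice(children(leader, size))
    return leader, supporter
```

With 13 individuals there are four clusters (headed by nodes 0 to 3), and nodes 1 to 3 act both as supporters of the root and as leaders of their own clusters.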
Figure 4. Description of the critical column for a problem with nine gates and eight tracks.
3.6. Offspring insertion into the population
Once the leader and one supporter are selected, the processes of recombination, mutation, and local search take place, and an offspring is generated. The acceptance of the offspring follows two rules:
(1) The offspring is inserted into the population replacing the supporter that took part in the recombination that generated it.
(2) The replacement occurs only if the fitness of the new individual is better than that of the supporter.
This is an extremely elitist and restrictive policy, which generates a very fast loss of diversity. The positive side is that the algorithm becomes more focused early on and evolves faster. In order to deal with the accelerated loss of diversity, a more sensitive population-convergence check had to be employed. Generally, population convergence is evaluated by the degree of similarity of the individuals' chromosomes and/or fitness. In this work, we concluded that if, during the recombination phase, no individual was accepted for insertion, the population had converged, and accordingly, the heavy mutation procedure would be applied. Finally, after the recombination phase, the population is restructured. The hierarchy described in Section 3.1 states that the degree of fitness of the leader of a cluster must be lower than the fitness of the leader of the cluster just above it. Following this rule, the higher clusters will have leaders with a better fitness, and the best solution will be the leader of the root cluster. The adjustment is done by comparing the supporters of each cluster with the leader. If any supporter turns out to be better than its respective leader, they swap places. In our problem, the higher the position that an individual occupies in the tree, the fewer tracks it utilizes.
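The acceptance rule and the tree restructuring of Section 3.6 can be sketched as follows. Several details are assumptions rather than statements from the paper: the array layout of the tree (node i has children 3i + 1 to 3i + 3), the parallel `fitness` list holding track counts (lower is better), the function names, and the choice to repeat the leader/supporter comparison until no swap occurs.

```python
from typing import List


def try_insert(population: List[list], fitness: List[int],
               supporter: int, child: list, child_fitness: int) -> bool:
    """Replace the recombined supporter only if the child uses fewer tracks."""
    if child_fitness < fitness[supporter]:
        population[supporter] = child
        fitness[supporter] = child_fitness
        return True
    return False  # a generation with no acceptance signals convergence


def restructure(population: List[list], fitness: List[int]) -> None:
    """Swap supporters with their leaders until every leader is at least as good."""
    size = len(population)
    changed = True
    while changed:
        changed = False
        for leader in range(size):
            for child in (3 * leader + 1, 3 * leader + 2, 3 * leader + 3):
                if child < size and fitness[child] < fitness[leader]:
                    population[leader], population[child] = (
                        population[child], population[leader])
                    fitness[leader], fitness[child] = (
                        fitness[child], fitness[leader])
                    changed = True
```

After `restructure` returns, the individual with the fewest tracks sits at the root, which matches the hierarchy described in the text.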
Figure 5. Two migration policies in an example with five populations arranged in a ring structure.
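The ring-based migration illustrated in figure 5 and defined in Section 4 can be sketched as below. This is a hedged illustration: the function name, the data layout (one list of individuals and one parallel list of track counts per population), and the choice of distinct random slots for the immigrants are assumptions. In particular, the text does not say whether an immigrant may overwrite the receiving population's own best individual, and this sketch allows it.

```python
import random
from typing import List


def migrate(populations: List[List[list]], fitness: List[List[int]],
            policy: int, rng: random.Random) -> None:
    """Apply 0-, 1- or 2-Migrate to populations arranged in a ring (in place)."""
    n = len(populations)
    if policy == 0 or n < 2:
        return  # 0-Migrate: populations never communicate
    # Snapshot every population's best individual before anything is replaced.
    emigrants = []
    for pop, fit in zip(populations, fitness):
        best = min(range(len(pop)), key=fit.__getitem__)
        emigrants.append((list(pop[best]), fit[best]))
    for dst in range(n):
        sources = [(dst - 1) % n]          # 1-Migrate: one ring neighbour
        if policy == 2:
            sources.append((dst + 1) % n)  # 2-Migrate: both ring neighbours
        # Distinct random slots, so each immigrant replaces a different individual.
        slots = rng.sample(range(len(populations[dst])), len(sources))
        for slot, src in zip(slots, sources):
            individual, value = emigrants[src]
            populations[dst][slot] = list(individual)
            fitness[dst][slot] = value
```

Taking the snapshot first means every population sends the best individual it had before migration started, regardless of the order in which the ring is traversed.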
4. Migration policies
For the study with multiple populations, we had to define how individuals migrate from one population to another. Three major population migration policies deserve mention:
(1) 0-Migrate: No migration is used, and all populations evolve in parallel without any kind of communication or solution exchange.
(2) 1-Migrate: Populations are arranged in a ring structure. Migration occurs in all populations, and the best individual from each migrates to the population right next to it, replacing a randomly chosen individual. Every population receives only one new individual.
(3) 2-Migrate: Populations are also arranged in a ring structure. Migration also occurs in all populations, but the best individual of each migrates to both populations connected to it, replacing randomly chosen individuals. Every population receives two new individuals.
Figure 5 shows the migration policies. The five populations are placed in a ring structure. In 1-Migrate, we have only one individual being received by each population. In 2-Migrate, this number rises to two individuals. The migration phase always occurs after all populations have converged and the heavy mutation procedure has been applied. As the heavy mutation always preserves the best individual, it can migrate to the neighbouring population, carrying its genetic information. The comparison between the policies is, in fact, a comparison between none, weak, and strong migration, given the difference in communication intensity. In 0-Migrate, every population will restart from the heavy mutation with just one high-quality individual (the best one prior to the heavy mutation process). In the 1-Migrate policy, every population will restart with two high-quality individuals (the second individual is received from a neighbouring population). In 2-Migrate, the number of high-quality individuals increases to three.

5. Computational tests
The computational tests were divided over two experiments. The first tested the influence of the number of populations on overall performance. For this evaluation, the number of populations varied from one to five. The second experiment evaluated the influence of migration on the algorithm's performance: for each number of populations, the three migration policies were tested, totalling 15 configurations. The tests were applied to five instances, and for each instance, we tested the whole set of configurations, 10 times for each. The MA was also tested against the GA, i.e. the MA without a local search, to evaluate the influence of the local search on the results found. The stop criterion was a time limit, fixed as follows: 10 s for instance W2, 30 s for V4470 and X0; 90 s for W3;

Table 1. Information on the instances.
Instance name    Number of gates    Number of nets    Best-known solution
W2               33                 48                14
V4470            47                 37                9
X0               48                 40                11
W3               70                 84                18
W4               141                202               27
Figure 6. Data fields for each configuration.
and 40 min for W4. The difference in maximum CPU times is due to the dimension of the instances and takes into account the average time to find high-quality solutions. The reader should note that these instances are taken from real industrial circuits, and each implements a specific logic: for example, in these sets there are logic functions for decoders (v4050, v4090), 4-b comparators (v4470), full-adders (wan), ALUs (W3), ITT1s and ITT2s (W2, W4). We refer the reader to Hu and Chen (1990) for details. Table 1 lists information on the instances examined in this work: one small, three medium-sized, and one large. There are very few large instances in the literature on which to test methods. Linhares et al. (1999) presented the most extensive computational tests, with 25 instances in total. However, most of them were too small and easy to solve either to optimality or to the best-known solution with the algorithm. Considering the instance sizes, only V4470, X0, W2, W3, and W4 had more than 30 gates, and for this reason, we concentrated the study on them. We must emphasize that the best MA results, in terms of the number of tracks required to build the circuits, are the same as those found by the microcanonical optimization approach. The only performance difference was the higher efficiency, measured in terms of the number of evaluations needed to find such values. Next, we show the experimental results. Four numbers describe the results for each configuration (see figure 6). In clockwise order, we have in boldface the best number of tracks found for that instance. Next in the sequence, we display the number of times this solution was found in 10 tries, the worst value found for the configuration, and finally, in the lower-left part of the cell, the average value found for the number of tracks. All tests were carried out on a Pentium 366-MHz Celeron computer, using Sun JDK 2.0 Java language running in a Windows environment.
We estimate that the performance of a Pentium Celeron processor lies somewhere between a Pentium and a Pentium II processor. Instance W2 reached the optimal value in all configurations of migration policy and number of populations in the MA. For the GA, the algorithm found the optimal solution in all configurations, although not in all trials. The best configuration for both MA and GA was the 1-Migrate with four populations because it required the lowest number of individual evaluations. The other instances have shown more noticeable differences in performance, and their results are listed in tables 2–5.

Since the tests take into account two parameters (migration policy and number of populations), let us begin by evaluating the first one. At first, two aspects of randomised search algorithms must be pointed out: exploitation and exploration (see also the debate in Linhares and Yanasse 2002b). Exploitation is the property of the algorithm to thoroughly explore a specific region of the search space, looking for any improvement in the best currently available solution(s). Exploration is the property to explore wide portions of the search space, looking for promising regions, where exploitation procedures should be employed. With no migration, we observed more instability in the answers, expressed by the worst solutions and averages found for instances W3 and W4. In the smaller circuits, this feature was not so clear. By contrast, the 1-Migrate policy apparently balanced exploitation and exploration better, returning good average and worst-solution values. The algorithm was also more stable, with high-quality solutions usually being found. The 2-Migrate policy did not perform so well, with a clear degradation of both the average and the worst solution. This might have been caused by an overly strong exploitation, to the detriment of the exploration. Thus, analyzing only the migration policies, we concluded that migration should be set at medium levels, represented by the 1-Migrate policy.

The second aspect to be analysed is the number of populations. Although it is not clear which configuration was best, the use of only one is surely not the best choice, since several multi-population configurations returned better values.
Based on the results, the conclusion is that when multiple populations are utilized, at least three should be employed. With only two populations, the algorithm does not seem to take advantage of the genetic drift effect. From the optimization point of view, the use of multiple populations allows much larger portions of the search space to be explored,
Table 2. Results for the V4470 instance. Parameters: k = 20; maximum CPU time = 30 s. The best configuration was 1-Migrate with three populations for both MA and GA. Each cell lists best / times found in 10 runs / worst / average number of tracks (the clockwise order of figure 6). The single-population column has one cell per algorithm.
                 Number of populations
                 1               2               3               4               5
V4470-MA         9/2/11/10.1
  0-Migrate                      9/1/11/10.1     9/4/10/9.6      9/1/11/10.0     9/5/10/9.5
  1-Migrate                      9/2/11/9.9      9/6/11/9.5      9/5/11/9.6      9/4/10/9.6
  2-Migrate                      9/5/10/9.5      9/4/11/9.7      9/2/10/9.8      9/5/10/9.5
V4470-GA         10/5/11/10.5
  0-Migrate                      10/6/11/10.4    10/7/11/10.3    10/6/11/10.4    10/6/11/10.4
  1-Migrate                      10/9/11/10.1    9/2/11/10.5     10/9/11/10.1    9/1/11/10.2
  2-Migrate                      9/1/11/10.3     9/1/11/10.3     10/7/11/10.3    10/8/11/10.2
Table 3.
Results for the X0 instance Number of populations
1 X0-MA 0-Migrate 1-Migrate 2-Migrate X0-GA 0-Migrate 1-Migrate 2-Migrate 11 11.6 6 13 11 11.1 11 11.0 11 11.2 11 11.5 11 11.1 11 11.1
2 9 12 10 11 8 12 5 12 9 12 9 12 11 11.2 11 11.0 11 11.1 11 11.4 11 11.2 11 11.1
3 8 12 10 11 9 12 6 12 8 12 9 12 11 11.1 11 11.0 11 11.2 11 11.5 11 11.2 11 11.0
4 9 12 10 11 8 12 5 12 8 12 10 11 11 11.1 11 11.1 11 11.1 11 11.6 11 11.2 11 11.1
5 9 12 9 12 9 12 4 12 8 12 9 12
11 11.8
2 12
Parameters: k 20; maximum CPU time 30 s. The best conguration was 1-Migrate with four populations for the MA and 2-Migrate with four populations for the GA.
but both the MA and the GA require at least three populations to make the genetic drift eect noticeable. The other characteristic of the method was its high eciency. The number of individuals evaluations was very low. Comparing this with the previous work of Linhares et al. (1999), the improvement is striking. The microcanonical optimization presented in that work is based on a fast variant of the well-known simulated annealing approach, which divides the search into two alternating phases: initiation and sampling. These phases have dual objectives: in initiation, the system
strives to rapidly obtain a new locally optimum solution, while in the sampling phase, the system moves out of the local optimum while retaining similar cost values (as controlled by parameters analogous to the temperature in simulated annealing). The algorithm proposed there was able to outperform five previous approaches in all previously tested instances. For details, see Linhares et al. (1999).

Table 4. Results for the W3 instance. Parameters: k = 30; maximum CPU time 90 s. The best configuration was 1-Migrate with five populations for the MA and 2-Migrate with five populations for the GA. Each entry reads best/average/number of runs reaching the best/worst. With a single population no migration takes place, so one entry is given per algorithm.

                      Number of populations
Configuration         1              2              3              4              5
W3-MA 0-Migrate       18/20.0/3/23   18/20.0/3/23   18/19.1/5/20   18/18.4/7/20   18/18.8/5/20
W3-MA 1-Migrate       -              18/20.0/3/22   18/18.9/5/20   18/18.6/6/21   18/18.1/9/19
W3-MA 2-Migrate       -              18/20.1/4/23   18/18.6/7/21   18/18.5/6/20   18/18.9/4/22
W3-GA 0-Migrate       18/21.2/1/23   18/20.7/1/22   18/20.5/1/22   19/20.4/5/22   19/20.4/4/22
W3-GA 1-Migrate       -              19/20.5/3/22   19/20.3/5/23   18/20.5/1/22   19/20.6/3/23
W3-GA 2-Migrate       -              18/20.8/1/22   18/21.2/1/23   18/20.9/1/22   18/20.8/2/22

Table 5. Results for the W4 instance. Parameters: k = 60; maximum CPU time 40 min. The best configuration was 1-Migrate with four populations for both MA and GA. Entries are laid out as in table 4.

                      Number of populations
Configuration         1              2              3              4              5
W4-MA 0-Migrate       29/30.5/4/34   28/30.9/1/36   28/30.6/1/34   28/30.4/2/35   28/30.3/2/34
W4-MA 1-Migrate       -              29/30.9/2/34   28/30.6/2/35   27/29.4/2/34   27/30.2/1/34
W4-MA 2-Migrate       -              28/31.2/1/36   28/30.0/4/34   28/29.7/2/32   28/29.4/3/30
W4-GA 0-Migrate       29/31.5/2/35   29/31.1/4/36   29/31.5/3/34   29/31.5/3/33   29/31.4/4/34
W4-GA 1-Migrate       -              29/31.0/4/34   29/31.3/3/36   29/30.5/5/33   29/31.2/3/35
W4-GA 2-Migrate       -              29/31.4/2/34   29/31.4/2/35   29/31.2/3/34   29/31.3/2/34

Table 6 compares both methods. Both the memetic algorithm and the microcanonical optimization either matched or outperformed the previous heuristics in the literature (simulated annealing, microcanonical annealing, GM-Plan, GM-Learn, and constructive heuristics) in terms of the quality of solutions obtained. Using the memetic algorithm, however, it was possible to obtain those high-quality solutions with a fraction of the effort required by microcanonical optimization. The MA yielded better values than the microcanonical approach. It is worth mentioning that the reduction ranges between 80 and 90%, depending on the instance size.

Table 6. Comparison of the number of individual evaluations between the MA and the microcanonical approach presented in Linhares et al. (1999).

            Memetic algorithm                    Microcanonical approach
Instance    Min        Ave        Max            Min         Ave          Max
W2          3125       3523       5398           12892       19839        26541
V4470       32509      176631     451377         102976      1109036      2714220
X0          18136      43033      117384         52126       95253        187335
W3          79089      203892     495306         1700667     7143872      21289846
W4          3213532    9428591    15643651       24192291    167986282    405324093

There are at least three reasons for these outstanding results. First, the local search embedded in the algorithm does not consider all possible movements, as a neighbourhood reduction enables considerable computational gains (without any major loss of solution quality). Second, the idea of discarding all possible movements that do not affect critical columns (and hence cannot improve solution quality) is also a major step towards greater efficiency. Finally, a local search is carried out only for the best individuals of each population. These factors, taken in combination, help to explain the superior performance of this algorithm.

The direct comparison between the MA and the GA shows the importance of the local search, especially when the instances become larger. Considering W4, for example, the GA was not able to find the value of 28 tracks even once. This means it missed the best value found by the MA (27 tracks) by two tracks. We believe that such failures tend to increase as the instances become larger because of the critical reduction of search power in the absence of a local search operator. Table 7 presents some MA-related statistics to give more information about the speed of the algorithm.

Table 7. Statistics on the MA related to the number of individual evaluations (figures are averages).

Instance    k value    Evaluations per local search    Evaluations per second
W2          10         379                             6906
V4470       20         548                             5702
X0          20         585                             5557
W3          30         749                             3374
W4          60         3242                            662
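The role of critical columns in the local search can be illustrated with a minimal sketch. It assumes the usual formulation of gate matrix layout, in which each net spans from its leftmost to its rightmost connected gate in the chosen gate ordering, and the number of tracks equals the largest number of nets crossing any column. The function names and the toy instance are hypothetical, and the neighbourhood reduction used by the MA is not reproduced here.

```python
from typing import Dict, List, Sequence, Set


def column_loads(order: Sequence[int], nets: Dict[str, Set[int]]) -> List[int]:
    """Number of nets crossing each column position for a given gate order."""
    position = {gate: idx for idx, gate in enumerate(order)}
    loads = [0] * len(order)
    for gates in nets.values():
        cols = [position[g] for g in gates]
        # A net occupies every column between its outermost gates.
        for c in range(min(cols), max(cols) + 1):
            loads[c] += 1
    return loads


def tracks(order: Sequence[int], nets: Dict[str, Set[int]]) -> int:
    """Cost of a layout: the load of its most crowded column."""
    return max(column_loads(order, nets))


def critical_columns(order: Sequence[int], nets: Dict[str, Set[int]]) -> List[int]:
    """Column positions whose load equals the track count."""
    loads = column_loads(order, nets)
    worst = max(loads)
    return [c for c, load in enumerate(loads) if load == worst]


# Toy instance (hypothetical): five gates, four nets.
nets = {"a": {0, 3}, "b": {1, 2}, "c": {2, 4}, "d": {0, 1}}

base = [0, 1, 2, 3, 4]
print(tracks(base, nets), critical_columns(base, nets))  # 3 [1, 2]

# Swapping the last two gates leaves the critical columns as loaded as before,
# so the cost cannot drop; such moves can be skipped without evaluation.
print(tracks([0, 1, 2, 4, 3], nets))  # 3

# Moving gate 3 to the front shortens net "a" and relieves both critical columns.
print(tracks([3, 0, 1, 2, 4], nets))  # 2
```

The point of the sketch is only that the track count is a maximum over columns: as long as the load on every critical column is unchanged, no rearrangement elsewhere can lower it.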
6. Conclusions

This work presented a multiple-population evolutionary approach to solve the Gate Matrix Layout problem. We used a memetic algorithm and a genetic algorithm as search engines. The numerical results, obtained from a standard set of industrial-sized instances, were very encouraging, and the best multi-population configurations require at least three populations evolving in parallel and exchanging individuals at a medium rate. The MA was successful in solving a very complex systems science problem, obtaining solutions with a quality that rivals that of the previous methods, while dramatically decreasing the computational effort involved. The number of evaluations was reduced by at least a factor of eight. The GA, however, was not able to match the MA performance in the larger instances.

Future studies should include the use of parallel techniques to distribute the populations and/or individuals through a computer network. Moreover, comprehensive studies on the local search theme must be carried out, since it is a critical part of the MA, and its influence on the general performance is very significant.

This algorithm is included in a framework for general optimization called NP-Opt (Mendes et al. 2001). Although the MA utilizes special features, like the new local search, it runs in this general optimization environment, which facilitates the programming of optimization methods but reduces code flexibility and generally does not allow data-structure tricks, which are very useful and common when dealing with intensive processing tasks like optimization. We believe that further development and study of such frameworks may ultimately provide an invaluable tool to the field of systems science.
Acknowledgements

This work was supported by the NBI program of the University of Newcastle, the Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP, Brazil) and the PROPESQUISA program of the Getulio Vargas Foundation.
References
CANTÚ-PAZ, E., 1999, Topologies, Migration Rates and Multi-Population Parallel Genetic Algorithms. Technical Report No. 97007, Illinois Genetic Algorithms Laboratory (ILLIGAL), University of Illinois, USA.
CANTÚ-PAZ, E., 2000, Efficient and Accurate Parallel Genetic Algorithms (Dordrecht: Kluwer Academic).
DARWIN, C. R., 1993, The Origin of the Species (Random House).
DAWKINS, R., 1976, The Selfish Gene (Oxford: Oxford University Press).
DOWNEY, R. G., and FELLOWS, M. R., 1995, Fixed parameter tractability and completeness I: Basic results. SIAM Journal on Computing, 24, 873–921.
FELLOWS, M. R., and LANGSTON, M. A., 1987, Non-constructive advances in polynomial-time complexity. Information Processing Letters, 26, 157–162.
FOO, S. K., SARATCHANDRAN, P., and SUNDARARAJAN, N., 1999, An evolutionary algorithm for parallel mapping of backpropagation learning on heterogeneous processors. International Journal of Systems Science, 30, 309–321.
FRANÇA, P. M., MENDES, A., and MOSCATO, P., 2001, A memetic algorithm for the total tardiness single machine scheduling problem. European Journal of Operational Research, 132, 224–242.
HASHIMOTO, A., and STEVENS, J., 1971, Wire routing by optimising channel assignment within large apertures. Proceedings of the 8th Design Automation Conference, pp. 155–169.
HU, Y. H., and CHEN, S. J., 1990, GM_Plan: A gate matrix layout algorithm based on artificial intelligence planning techniques. IEEE Transactions on Computer-Aided Design, 9, 836–845.
LENGAUER, T., 1990, Combinatorial Algorithms for Integrated Circuit Layout (New York: John Wiley).
LINHARES, A., 1999, Synthesizing a predatory search strategy for VLSI layouts. IEEE Transactions on Evolutionary Computation, 3, 147–152.
LINHARES, A., and YANASSE, H. H., 2002a, Connections between cutting-pattern sequencing, VLSI design, and flexible machines. Computers & Operations Research, 29, 1759–1772.
LINHARES, A., and YANASSE, H. H., 2002b, Local search intensity versus local search diversity: a false trade-off? Manuscript submitted for publication.
LINHARES, A., YANASSE, H., and TORREÃO, J., 1999, Linear Gate Assignment: a fast statistical mechanics approach. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 18, 1750–1758.
LUK, B. L., GALT, S., and CHEN, S., 2001, Using genetic algorithms to establish efficient walking gaits for an eight-legged robot. International Journal of Systems Science, 32, 703–713.
MENDES, A. S., FRANÇA, P. M., and MOSCATO, P., 2001, NP-Opt: an optimisation framework for NP problems. Proceedings of POM2001, International Conference of the Production and Operations Management Society, pp. 82–89.
MENDES, A. S., MÜLLER, F. M., FRANÇA, P. M., and MOSCATO, P., 2002, Comparing meta-heuristic approaches for parallel machine scheduling problems. Production Planning & Control, 13, 143–154.
MÖHRING, R. H., 1990, Graph problems related to gate matrix layout and PLA folding. Computing, 7, 17–51.
MOSCATO, P., 1989, On evolution, search, optimisation, genetic algorithms and martial arts: towards memetic algorithms. Caltech Concurrent Computation Program, C3P Report 826.
MOSCATO, P., and NORMAN, M. G., 1992, A memetic approach for the travelling salesman problem. Implementation of a computational ecology for combinatorial optimisation on message-passing systems. In M. Valero, E. Oñate, M. Jané, J. L. Larriba, and B. Suárez (eds.) Parallel Computing and Transputer Applications (Amsterdam: IOS Press), pp. 187–194.
NAKATANI, K., FUJII, T., KIKUNO, T., and YOSHIDA, N., 1986, A heuristic algorithm for gate matrix layout. Proceedings of International Conference of Computer-Aided Design, pp. 324–327.
SYSWERDA, G., 1991, Schedule optimization using genetic algorithms. In Handbook of Genetic Algorithms (New York: Van Nostrand Reinhold), pp. 332–349.
VIENNET, R., FONTEIX, C., and MARC, I., 1996, Multicriteria optimisation using a genetic algorithm for determining a Pareto set. International Journal of Systems Science, 27, 255–260.
WEINER, J., 1995, The Beak of the Finch (New York: Vintage Books).
WING, O., HUANG, S., and WANG, R., 1985, Gate matrix layout. IEEE Transactions on Computer-Aided Design, 4, 220–231.
XIONG, N., 2001, Evolutionary learning of rule premises for fuzzy modelling. International Journal of Systems Science, 32, 1109–1118.
YANASSE, H. H., 1997, On a pattern-sequencing problem to minimize the number of open stacks. European Journal of Operational Research, 100, 454–463.
