
Evolution of Artificial Agents in a Realistic Virtual Environment

Jonathan R. Hicks
Department of Computer Science
Middle Tennessee State University
Murfreesboro, TN 37132
jh2f@mtsu.edu

Joseph A. Driscoll
Department of Computer Science
Middle Tennessee State University
Murfreesboro, TN 37132
driscoll@mtsu.edu
ABSTRACT

Modern video game software implements very immersive, simulated worlds. These worlds benefit from highly sophisticated graphics and physics simulations provided by the software. The experiments presented here show that these complex simulated environments can provide an attractive research platform for evolutionary algorithms, specifically for the task of artificial agent evolution.

Categories and Subject Descriptors

I.2 [Artificial Intelligence]: Distributed Artificial Intelligence – Multiagent Systems.

General Terms

Design, Experimentation

Keywords

Evolutionary algorithms, virtual environments, artificial agents

1. INTRODUCTION

Video game software is highly sophisticated, owing to the complex graphics, physics, and other techniques employed [5]. Many of these games immerse players in very realistic environments, often using complex models of physics to accurately depict the various aspects of gameplay. In many of these games, human-controlled players interact with each other as well as with artificially-intelligent players called bots. These games therefore represent highly realistic, multi-agent environments. With the ability to customize the bots, such game environments are ideal for experimentation in many types of artificial intelligence (AI) research [5], [6]. The use of an existing, realistic simulation environment such as a video game allows researchers to focus on designing experiments rather than designing simulators.

The goal of this work was to show that video games can serve as powerful environments for experimentation with a particular branch of AI: evolutionary algorithms (EAs). EAs are methods inspired by biological evolution [1] that obtain solutions to many types of problems [7]. Such algorithms obtain solutions by searching through the space of all possible solutions. The search is not exhaustive, but rather guided by promising candidate solutions. Section 2, below, provides an overview of EAs.

The work presented here focuses on the AI-controlled bots that serve as additional players in a video game environment. Such bots are capable of complex behaviors, and these behaviors are coordinated through a set of configuration values called traits. To create a new and more interesting bot, a human must change these values manually and then test the altered bot. This tedious process is then repeated many times until a bot is found that exhibits the desired behaviors. An evolutionary algorithm is well suited to this sort of process [3], [7], and is the approach taken in this paper.

The question is not simply whether or not an EA can be implemented in a game environment. Rather, this work goes further to ask whether the unique characteristics of an EA will still be present in such an environment. In other words, does the EA still behave like an EA in such an environment? Such fidelity is required if the EA/game environment is to be used for real research. Through three sets of experiments it is shown that our system is able to exhibit characteristics of evolutionary algorithms, which demonstrates that the system used, and games in general, can provide powerful environments for EA research.

1.1 Paper Overview

The next section provides an overview of evolutionary algorithms. Section 3 discusses the experimental methodology employed, and Section 4 discusses the results that were obtained. Finally, the paper ends with conclusions drawn, as well as thoughts on future work.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ACM Southeast Conference '05, March 18-20, 2005, Kennesaw, GA, USA. Copyright 2005 ACM 1-59593-059-0/05/0003 $5.00.

2. EVOLUTIONARY ALGORITHMS

Evolutionary algorithms (EAs) are methods inspired by theories of biological evolution that are able to find solutions to a wide variety of problems. Example applications include oil pipeline layout [2], robot control [8], and computer program induction [4]. These algorithms obtain results by intelligently searching through the space of possible solutions. The search is neither exhaustive nor random; it is guided by promising candidate solutions in a manner that mimics biological evolution.


To use an EA, one must first create a data structure to represent the form of possible solutions. This data structure is called a genome and is the general structure shared by all candidate solutions in a particular EA application. The genome contains variables (genes) rather than specific values. A particular solution has specific values (alleles) in these variables. As an example, assume that an EA is being used to find a set of three numbers that optimize a function. The genome for this task can simply be a list of three variables, (x, y, z), where each variable can contain a number. A candidate solution to this problem has particular values for the three numbers, for example (0.1, 0.4, 0.2). Some solutions will be better than others, or more fit, in terms of the goal (i.e., to optimize the function). In this example, the EA searches through a three-dimensional space to find a good solution. The search through this space is not random. When good solutions are found, new solutions are generated by modifying those good solutions.

In an evolutionary algorithm many candidate solutions, called individuals, are considered simultaneously. The complete set of individuals under consideration at a given time is called a population. In the example above, an individual is a specific list of numbers. The population is the collection of all the lists of numbers currently under consideration.

An EA begins with a random initial population. This population contains individuals that were generated by selecting random allele values for the genes. Each individual in the population is evaluated by a performance metric called a fitness function. This function reflects how good, or fit, each individual is as a solution to the task at hand. A selection process uses the fitness values to decide which of the individuals in the population will be selected to become parents.
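The genome, allele, and population ideas above can be sketched in a few lines of Python. This is a hypothetical illustration, not code from the paper; the quadratic fitness function is an arbitrary stand-in with a known peak.

```python
import random

GENOME_LENGTH = 3  # three genes, (x, y, z), as in the example above

def random_individual():
    """A candidate solution: one specific allele value for each gene."""
    return [random.uniform(0.0, 1.0) for _ in range(GENOME_LENGTH)]

def fitness(individual):
    """Stand-in fitness function; a real application supplies its own.
    Higher is better, with the peak at (0.1, 0.4, 0.2)."""
    x, y, z = individual
    return -((x - 0.1) ** 2 + (y - 0.4) ** 2 + (z - 0.2) ** 2)

# A random initial population of 20 individuals.
population = [random_individual() for _ in range(20)]
best = max(population, key=fitness)
```

Here `max` with a fitness key simply identifies the fittest member of the random initial population; the selection and breeding machinery described next builds on this.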
Many selection methods exist, but most are similar in that individuals with higher fitness are chosen as parents more often than less fit individuals. One popular selection method (and the one used in the current experiments) is tournament selection. In tournament selection, parents are chosen by repeatedly sampling the population using tournaments. A number of individuals are chosen at random; the particular number chosen is called the size of the tournament. The best individual (in terms of fitness) in the chosen group is said to have won the tournament, and is selected as a parent. These tournaments continue as long as more parents are needed. By altering the size of the tournament, one can adjust how difficult it is to be selected, which is called the selection pressure.

The selected parent individuals are allowed to breed in order to produce new individuals, called offspring. This is done through the use of genetic operators such as crossover and mutation. The main idea behind crossover is that two parents may have good building blocks (sets of alleles) that contribute to their fitness in a good way [2], [7]. Imagine the parental allele sets as two lists of values. Crossover selects a point, called a crossover point, in the lists (the same location for both lists). Offspring are generated by copying these parent lists. When the copying process reaches a crossover point, copying switches to the other parent. For example, prior to reaching a crossover point, offspring 1 is identical to parent 1, and the same for offspring 2 and parent 2. After the crossover point, allele values from parent 2 are copied to offspring 1. Similarly, alleles from parent 1 go to offspring 2. By moving entire subgroups of allele values from the parents to the offspring, crossover may combine good building blocks into one or both offspring, in such a way as to make the offspring even more fit than the parents. Of course, often the offspring are less fit, since the choice of crossover point is often random. But if a better offspring is produced even occasionally, then its better fitness should allow it to be chosen as a parent and therefore propagate its favorable allele set to future generations.

Reproduction and mutation each require only a single parent. Reproduction simply creates an identical copy of a parent to generate an offspring. Mutation takes a parent and randomly changes one or more allele values to new, random values. Mutation is often considered necessary since it can introduce a new allele value for a gene into the population [7]. Crossover and reproduction cannot do this, as they simply copy from parents. Mutation is the source of new alleles, and so is an important component of the exploration of the search space.

Each time an offspring is to be created, an operator is chosen at random to generate it, according to a set of probabilistic rates. These rates define the probability of each operator being used to generate the offspring. Sometimes multiple operators are used in sequence; for example, two offspring can be produced by crossover, and then these offspring can be subjected to mutation. The newly created offspring make up a new population. This new population replaces the old, and this completes one generation of the evolutionary algorithm. The process repeats until some user-specified termination criterion is met, such as the creation of an individual with some desired level of fitness. The diagram in Figure 1 summarizes this process.
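Tournament selection and the genetic operators described above can be sketched as follows. This is a hypothetical Python illustration; the parameter names and the default mutation rate are our own choices, not values from the paper.

```python
import random

def tournament_select(population, fitness, size=3):
    """Choose `size` individuals at random; the fittest of the group wins."""
    contestants = random.sample(population, size)
    return max(contestants, key=fitness)

def crossover(parent1, parent2):
    """One-point crossover: copy each parent, switching to the other
    parent at a randomly chosen crossover point."""
    point = random.randrange(1, len(parent1))
    offspring1 = parent1[:point] + parent2[point:]
    offspring2 = parent2[:point] + parent1[point:]
    return offspring1, offspring2

def mutate(parent, rate=0.1):
    """Replace each allele with a new random value with probability `rate`."""
    return [random.random() if random.random() < rate else allele
            for allele in parent]
```

Note that increasing `size` raises the selection pressure: the larger the tournament, the harder it is for a mediocre individual to win.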

Figure 1. General evolutionary algorithm. [Flowchart: generate initial random population → evaluate fitness of individuals → select individuals to serve as parents → breed parents to produce offspring → if the termination criterion is met, end evolution; otherwise the offspring become the next generation's population and the loop repeats from the evaluation step.]

A central component of EAs is that offspring resemble parents. If parents are highly fit, then offspring are often highly fit also. Sometimes the offspring are less fit, but other times they are even more fit than the parents. Individuals with higher fitness values are more likely to be selected as parents than those with low fitness values, and so highly fit offspring are likely to become parents themselves. In this way the fit individuals guide evolution toward increasingly good solutions.
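The loop in Figure 1 can be condensed into a short sketch. This is hypothetical Python with arbitrary parameter choices; for brevity, breeding here uses tournament selection plus mutation only.

```python
import random

def evolve(fitness, pop_size=20, genome_len=3, generations=50,
           tournament_size=3, mutation_rate=0.1):
    """Minimal generational EA following the flow of Figure 1."""
    # Generate initial random population.
    population = [[random.random() for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):  # termination: fixed generation count
        offspring = []
        while len(offspring) < pop_size:
            # Evaluate fitness and select a parent via tournament.
            parent = max(random.sample(population, tournament_size),
                         key=fitness)
            # Breed: copy the parent, mutating some alleles.
            child = [random.random() if random.random() < mutation_rate
                     else allele for allele in parent]
            offspring.append(child)
        population = offspring  # offspring become the next generation
    return max(population, key=fitness)
```

With a simple fitness function such as `lambda ind: -sum((g - 0.5) ** 2 for g in ind)`, the returned individual drifts toward (0.5, 0.5, 0.5) over the generations.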


3. EXPERIMENTAL METHODS
This section presents the experimental methodology used for this work. The simulation environment is described first, followed by an overview of the particular evolutionary algorithm used. Finally, the three sets of experiments are discussed. Experimental results are presented and discussed in the next section.

3.1 The Simulation Environment

Quake III [9] is a realistic multiagent gaming environment that is also ideal for artificial agent research, and so was the chosen environment for these experiments. This game consists of combat between players, resulting in a survival-of-the-fittest style of gameplay. The combat is sophisticated, using a variety of weapons and fighting styles. When one agent defeats another, the winning agent's score increases by a point. Typically, games are played for a set amount of time, and at the end of the game players are ranked according to their scores.

In addition to human-controlled agents, artificially-intelligent agents called bots can also be part of the game. When the game is played by a single human player, all opponents are bots, and so it is important that the bots show a variety of interesting behaviors. In Quake III, bot behavior is controlled by a set of values called traits. Each trait value ranges from 0 to 1. The values include personality traits such as aggressiveness, as well as weapon preferences. The experiments described here are based on setting fifteen of these trait values. The other trait values did not significantly impact bot performance, and instead influenced such things as how often a bot speaks.

3.2 The Evolutionary Algorithm

These experiments used a standard genetic algorithm with tournament selection, as described in Section 2. Parameters such as operator rates were set to commonly used values (80% for crossover, 10% for reproduction, 10% for mutation), as found in the EA literature [2], [4], [7]. While the uniqueness of the game/EA environment may mean that different parameters could have been better, such exploration was beyond the scope of these experiments, and may be the basis for future work. In any case, the settings used here were sufficient to demonstrate the efficacy of the environment for EA research, which was our goal.

Each individual in the EA is a one-dimensional array of numbers, and represents a particular setting of bot trait values. There were 30 total traits for the first two experiments and 35 traits for the last experiment. The five extra traits were inserted into the middle of the chromosome, starting at position 15. The population size was set to 70, as this was the maximum number of different bots that the game would accept.

To evaluate the fitness of individuals, the trait sets were used to create a corresponding set of bots for the game. Next, using tournament selection, random subsets of these bots were allowed to compete in a game for 5 minutes. The bots' scores at the end of the game were used as fitness values; the top two scoring bots were selected as parents. Since 70 parents were needed to generate the 70 offspring (for the next generation of the EA), 35 such games had to be performed each generation.

3.3 Types of Experiments

Three sets of experiments were performed, each set more sophisticated than the last. In total, the experiments were designed to detect EA-specific behavior from the system. The use of multiple sets of experiments allowed comparisons to be made as complexity was added, and therefore allowed better analysis of the impact of the added sophistication. The types of added complexity are described below. Results from these experiments are presented and discussed in the next section.

3.3.1 First Set of Experiments: Establish a Baseline of Performance

To evaluate the EA/game system, a benchmark problem was needed. The approach in the first set of experiments was to simply evolve trait values that maximize bot game scores. High values for traits tend to produce good bot performance, and so without some balancing constraint, the EA would likely just place maximum values (i.e., 1) in all the traits. In the first set of experiments, the bots were allowed to evolve trait values without the complexities of the later experiments, which are described below. This established the types of results produced without such additions, and so provided a basis for comparison with the later experiments. Though simple, these first experiments were also able to verify that the system can, at the least, perform simple optimization, which any EA should be able to do [2]. Subsequent experiments were designed to determine if more advanced EA behaviors were possible.
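The operator rates and the game-based parent selection described in Section 3.2 might be wired together as follows. This is a hypothetical sketch; the names are our own, and the real system scores bots in 5-minute Quake III matches rather than in code like this.

```python
import random

OPERATOR_RATES = [("crossover", 0.80), ("reproduction", 0.10),
                  ("mutation", 0.10)]

def choose_operator():
    """Pick a genetic operator according to the 80/10/10 rates."""
    r = random.random()
    cumulative = 0.0
    for name, rate in OPERATOR_RATES:
        cumulative += rate
        if r < cumulative:
            return name
    return OPERATOR_RATES[-1][0]

def games_per_generation(pop_size=70, parents_per_game=2):
    """Each game's top two scorers become parents, so a population
    of 70 needs 35 games to supply 70 parents."""
    return pop_size // parents_per_game
```

Sampling `choose_operator` many times yields roughly 80% crossover, 10% reproduction, and 10% mutation, matching the rates quoted from the EA literature.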

3.3.2 Second Set of Experiments: The Knapsack Constraint

Simple maximizing of traits, as with the first set of experiments, is not very interesting in isolation, as other, non-EA methods could achieve the same result. However, the first set of experiments does serve as an important basis of comparison with the later experiments. For the second set of experiments, a more challenging problem was needed: producing good bots that were subject to a constraint. Bots with all traits set to high values are highly skilled, but are often too difficult, and thus not enjoyable, for human players. In manually designing bots that humans find entertaining, it is observed that often some traits will be high while others are kept low, as if there is only a fixed amount of trait points to allocate. This restriction of the total trait values provides the needed constraint for our benchmark problem: the evolution of good bots subject to the trait-limit constraint. The standard knapsack problem [2], described next, provides a natural way to incorporate such a constraint.

The knapsack is a standard problem used to evaluate the performance of many types of optimization techniques, including EAs [2]. This problem involves a container (the knapsack) and a set of items with varying values and sizes. The knapsack has a fixed size, and so cannot hold all of the items at the same time. The goal is to find the subset of items that has the highest collective value while still fitting into the knapsack.

The structure of the knapsack problem provides an effective way to implement the constrained bot optimization problem, which is the basis of the second set of experiments. In the version of the knapsack problem used here, each bot has a fixed number of points (10) to distribute to the various traits. These experiments go beyond the first set, in that they show that the system is capable of performing constrained optimization, which is another important characteristic of EAs [2], [7].

3.3.3 Third Set of Experiments: Useless Genes

The knapsack constraint allowed the evolution of interesting bots subject to a constraint, but further evidence was desired that the process exhibited evolutionary dynamics, rather than those of a more general constrained optimization process. To further examine this issue, the final set of experiments was designed to show further behavior specific to EAs. In this set, the EA individuals were modified to include a number of extra, or useless, genes in their genome. These gene values were associated with how often a bot would taunt an opponent; since this feature was turned off in the server, the values had no effect on gameplay. The genes were useless in the sense that any trait points allocated to them were wasted. One way to show that the system exhibits EA dynamics is if, during evolution, the bots adapted to the useless genes by reducing how many points are allocated to them. Such adaptation is an example of EA behavior.

The useless genes also allow the observation of another EA behavior: the EA can use the useless genes as introns. Introns are the biological analogs of the useless genes, and are regions that do not appear to correspond to anything biologically useful [4], [7]. However, such introns may serve an important role in protecting against destructive crossover. Imagine that some trait settings are only good when in the same individual. In destructive crossover, these traits are separated, thus leading to a drop in the offspring fitness (as compared to parental fitness). In contrast, constructive crossover occurs if the offspring fitness exceeds that of the parents. If there is no difference between parental and offspring fitness, the crossover is called neutral. Introns (and the experimental useless genes) can reduce the frequency of destructive crossover by serving as additional targets for the randomly-selected crossover point. In the EA, if the frequency of destructive crossover decreases after the introduction of useless genes, then this shows that the system is exhibiting this important class of evolutionary behavior.

4. RESULTS

In this section the results of the three experimental sets are discussed. For statistical significance [10], each of the three experiment types was performed 15 times, for a total of 45 experiments. Other details of statistical methods are given in the following discussion.

4.1 Baseline

The 15 experiments in this group each had a best individual, which had an associated fitness score. Averaging this best score across the 15 trials produced an average best score for these experiments. In addition, the generation number at which the best individual was produced was recorded for each trial. These values were averaged, producing an average number of generations before the best individual was produced.

The average best score was 70.27, and the average number of generations to reach the best individual was 43.07. The utility of these results is in their use as a basis of comparison with the other types of experiments. In addition, they confirm that the EA can at least optimize traits in a simple manner. The remaining experiments provide more insight on the EA's abilities.

4.2 Knapsack

The same two averages computed for the first set were also computed for this set: the average best individual score, and the average number of generations to reach a best individual. For this set, these numbers were 73.77 and 81.67, respectively. The average best scores for the first two sets are similar. However, the second set of experiments took more generations to produce such bots than the first set. This difference is significant (t-test at the 95% confidence level) and shows that the second set took longer to reach a similar level of performance.

The second set of experiments used the knapsack constraint (but did not use the useless traits), and so allowed a more sophisticated form of evolution. Skilled bots were evolved, but evolution took longer than with the first experiments (which did not suffer from any constraints). This shows that the constraint made the problem more difficult, but the EA was still able to solve it. This success demonstrates the EA's ability to perform constrained optimization, which is further evidence of EA-like behavior. More evidence is provided by the final set of experiments, discussed next.

4.3 Useless Genes

As described earlier, the experiments with useless genes were designed to detect two types of EA behavior. First, would the EA adapt to not wasting points in the useless genes? Second, would the presence of the useless genes reduce the frequency of destructive crossover?

4.3.1 Wasting of Points

The experimental results included information on how many points were allocated to useless genes. By examining how these values changed over the course of evolution, it could be seen whether the EA adapts and so reduces the allocation of points to the useless genes.

For each of the 15 trials with useless genes, the following procedure was performed. The number of trait points in useless genes was recorded for each generation. These values were averaged in two ways: over the first 10 generations and over the last 10 generations. This produced an early average and a late average, respectively, for each of the 15 trials. The early average reflects the number of wasted points (i.e., in useless genes) early in evolution, while the late average shows the number for the end of evolution. Next, the 15 early averages were averaged, to produce an overall early average for all 15 runs. The same was done for the late averages. Now there were two average values to compare: the overall early average, which was 2.64 points, and the overall late average, which was 2.40 points. This difference suggests that over the course of evolution, the number of points in useless genes decreased, showing that the EA adapted to avoid wasting points in the useless genes.
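The early/late averaging procedure described in Section 4.3.1 amounts to the following sketch. This is hypothetical code, under the assumption that each trial's wasted-point counts are available as one list per generation.

```python
def early_late_averages(trials, window=10):
    """Given, for each trial, the wasted-point count per generation,
    return the overall early and late averages across all trials."""
    early = [sum(t[:window]) / window for t in trials]   # first 10 generations
    late = [sum(t[-window:]) / window for t in trials]   # last 10 generations
    overall_early = sum(early) / len(early)
    overall_late = sum(late) / len(late)
    return overall_early, overall_late
```

A downward drift from the early to the late average, as with the reported 2.64 → 2.40 drop, indicates adaptation away from the useless genes.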


4.3.2 Introns

To determine if the EA adapted to use useless genes to protect against destructive crossover (i.e., using the useless genes as introns), the destructive crossover rates were examined for the second and third experimental sets. These sets differed only in that the second set did not use useless genes, while the third set did. Both of these sets used the knapsack constraint. For a given experimental set, the destructive crossover rates for each of the 15 experimental trials were averaged. This produced two averages: the average destructive crossover rate for the experiments without useless genes, and the average for the experiments with useless genes. See Table 1 for a summary of the results. With the useless genes, destructive crossover went down in frequency, as compared to the experiments without useless genes. In addition, the useless genes allowed the non-destructive forms of crossover (constructive and neutral) to increase in frequency.

Table 1. Effect of Introns on Crossover.

               Set 2     Set 3
Constructive   12.1%     12.2%
Neutral        0.796%    0.899%
Destructive    87.1%     86.9%

5. CONCLUSIONS

This work has shown that a video game environment can be successfully used as a research platform. These immersive, three-dimensional worlds are an ideal environment for advanced software AI agents. The superior graphics and physics simulations let researchers focus on the underlying research goals and not as much on how to create an environment for their research.

The first set of experiments simply provided a basis for comparison, which was useful in later experiments. The second set of experiments demonstrated the EA's ability to perform constrained optimization in the game environment. The final experimental set shows that the EA was able to use the useless genes in two important ways. First, evolutionary adaptation reduced the number of points wasted in those traits. Second, the useless genes themselves had an effect similar to that of introns, protecting against destructive crossover. Both features are indicators of evolutionary behavior, and so support the overall goal of this work: to show that the EA, as implemented in the video game, can exhibit complex evolutionary behaviors needed for EA research.

6. REFERENCES

[1] Darwin, C. R. (1859). On the Origin of Species. John Murray, London.
[2] Goldberg, D. E. and Kuo, C. H. (1987). Genetic algorithms in pipeline optimization. Journal of Computing in Civil Engineering, 1(2), 128-141.
[3] Holland, J. H. (1975). Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, MI.
[4] Koza, J. R. (1992). Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA.
[5] Laird, J. E. (2002). Game engines in scientific research: Research in human-level AI using computer games. Communications of the ACM, 45(1).
[6] Magerko, B., Laird, J. E., Assanie, M., Kerfoot, A., and Stokes, D. (2004). AI characters and directors for interactive computer games. Proceedings of the 2004 Innovative Applications of Artificial Intelligence Conference, San Jose, CA. AAAI Press.
[7] Mitchell, M. (1996). An Introduction to Genetic Algorithms. MIT Press, Cambridge, MA.
[8] Nolfi, S. and Floreano, D. (2000). Evolutionary Robotics: The Biology, Intelligence, and Technology of Self-Organizing Machines. MIT Press/Bradford Books, Cambridge, MA.
[9] Id Software. Quake III. Company web site: http://www.idsoftware.com/
[10] Walpole, R. E. and Myers, R. H. (1993). Probability and Statistics for Engineers and Scientists, Fifth Edition. Macmillan, New York.

