Next: Further work Up: Implementation in C Previous: Genetic programming made

Results

  
Figure 3: Effects of changing the population size on overall performance.

The problem we chose to attack is described in Koza, Chapter 10, ``Symbolic Regression --- Error-Driven Evolution.'' We chose to reproduce the results of the symbolic regression example that involves the creation of arbitrary constants, which is essentially curve fitting. This problem demonstrates not only how the genetic programming paradigm works but, specifically, how constants can be created by ``survival of the fittest.'' We used the same quadratic curve that Koza used as the target, over the same interval.

Curve fitting is easy to visualize, and progress can be checked easily by printing out the trees while the run is underway. To facilitate the generation of appropriate coefficients, we use the randomized terminal constants suggested by Koza. The genetic process does indeed combine these constants in the proper manner.
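As a minimal sketch of how a randomized terminal constant might be drawn when a tree is created, the helper below returns a uniformly distributed value in a closed range. The function name `random_constant` and the use of the standard `rand()` generator are assumptions made for illustration, not details taken from our implementation, and the range is left as a parameter because the text does not state it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: draw one random constant uniformly from [lo, hi].
 * The value would be fixed into a terminal node when the tree is built;
 * evolution then combines such constants through the function set. */
double random_constant(double lo, double hi)
{
    assert(lo <= hi);
    /* rand() returns an int in [0, RAND_MAX]; scale it to [0, 1]. */
    double unit = (double)rand() / (double)RAND_MAX;
    return lo + (hi - lo) * unit;
}
```

A caller would typically seed the generator once with `srand` before building the initial population, so that separate runs start from different constants.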

We needed to observe progress over the generations so that we could monitor the overall improvement in the population. Koza provides various sets of possible statistics to use, but these are in general more cryptic than necessary. We decided that the average performance of the population as a whole was a good measure. Each program was evaluated at twenty points over a closed interval, and we counted a ``hit'' for each point at which the program estimated the curve to within an error of 1%. The maximum possible average number of hits is therefore twenty.
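The hit measure can be sketched as follows, reading a hit as one sample point matched to within the tolerance, which is the reading under which the maximum average is twenty. This is an illustration under stated assumptions rather than our actual code: the names `count_hits` and `average_hits` are hypothetical, the points are taken to be evenly spaced, the 1% is treated as a relative error, and the example quadratic is a stand-in and not the curve from Koza.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Any one-variable function: an evolved program or the target curve. */
typedef double (*curve_fn)(double x);

/* Count the sample points at which `candidate` lies within relative
 * tolerance `tol` of `target`. The n points are evenly spaced over the
 * closed interval [lo, hi] (an assumption of this sketch). */
int count_hits(curve_fn candidate, curve_fn target,
               double lo, double hi, int n, double tol)
{
    assert(n >= 2);
    int hits = 0;
    for (int i = 0; i < n; i++) {
        double x = lo + (hi - lo) * i / (n - 1);
        double want = target(x);
        double got = candidate(x);
        /* Relative error test; where the target is exactly zero,
         * only an exact match is accepted. */
        if (fabs(got - want) <= tol * fabs(want))
            hits++;
    }
    return hits;
}

/* Average of the per-program hit counts over the whole population. */
double average_hits(const int *hits, size_t population_size)
{
    assert(population_size > 0);
    long total = 0;
    for (size_t i = 0; i < population_size; i++)
        total += hits[i];
    return (double)total / (double)population_size;
}

/* Stand-in curves for illustration only. */
double example_target(double x) { return 2.0 * x * x + 3.0 * x; }
double example_near(double x)   { return 1.005 * example_target(x); }
double example_far(double x)    { return 2.0 * x * x; }
```

With twenty points and a tolerance of 0.01, a call such as `count_hits(example_near, example_target, -1.0, 1.0, 20, 0.01)` scores every point as a hit, because `example_near` is off by only 0.5% everywhere, while `example_far` scores none.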

The two main operators were implemented according to Koza, and we tested two pertinent parameters: the proportion of reproduction and crossover, and the size of the population. Figures 3 and 4 clearly depict the effects of changing these two parameters.
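To make the first parameter concrete, the sketch below splits one generation's offspring between the two operators for a given crossover fraction. The struct and function names are hypothetical, and the rounding to an even crossover count rests on the assumption that each crossover of two parents yields two offspring.

```c
#include <assert.h>

/* Hypothetical plan for filling the next generation. */
typedef struct {
    int by_crossover;     /* offspring created by crossover */
    int by_reproduction;  /* individuals copied unchanged   */
} generation_plan;

/* Split `population_size` slots according to `crossover_fraction`
 * (0.0 means reproduction only, 1.0 means crossover wherever possible). */
generation_plan plan_generation(int population_size, double crossover_fraction)
{
    assert(population_size >= 0);
    assert(crossover_fraction >= 0.0 && crossover_fraction <= 1.0);

    generation_plan plan;
    /* Round to the nearest whole number of individuals. */
    plan.by_crossover = (int)(population_size * crossover_fraction + 0.5);
    /* Assumed: crossover produces offspring in pairs, so keep the count even. */
    plan.by_crossover -= plan.by_crossover % 2;
    /* Reproduction fills whatever slots remain. */
    plan.by_reproduction = population_size - plan.by_crossover;
    return plan;
}
```

For a population of 100 and a crossover fraction of 0.9, this gives 90 offspring by crossover and 10 by reproduction; raising the reproduction share shrinks the first number and grows the second.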

  
Figure 4: Effects of changing the proportions of crossover/reproduction on overall performance.

In Figure 3, we see that increasing the population size, and with it the diversity of the population, does indeed improve performance over the generations. For example, for a population size of 100, the average hits measure has already leveled off and is meandering randomly. This behavior is expected because, as was previously mentioned, the population should never converge to a uniform population. Note that these results are averaged over 50 runs of the simulation. As the population size increases to 1000, the curve of average hits is still rising and has not yet reached its maximum.
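The averaging over runs can be sketched as below: for each generation, the average hits values recorded by the individual runs are summed and divided by the number of runs. The function name and the row-major storage layout are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of the recorded measure at generation `gen` across all runs.
 * `results` is assumed to be row-major, one row per run:
 * results[run * generations + gen]. */
double mean_at_generation(const double *results, size_t runs,
                          size_t generations, size_t gen)
{
    assert(runs > 0);
    assert(gen < generations);
    double sum = 0.0;
    for (size_t run = 0; run < runs; run++)
        sum += results[run * generations + gen];
    return sum / (double)runs;
}
```

Calling this once per generation with `runs` set to 50 would produce one point per generation of an averaged curve like those plotted in the figures.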

Figure 4 also displays expected behavior. When we decrease the genetic diversity (by increasing the percentage of reproduction relative to the percentage of crossover), performance suffers. This behavior is most probably the result of one or two suboptimal programs dominating the gene pool, which prevents the population from achieving higher average results.

We can see the process of adaptation in action in Figure 5. The best program from each of generations 1, 3, 5, and 7 is plotted, along with the actual function the population is trying to fit. It is plainly visible that the curves are approaching the actual curve.

  
Figure 5: Best of generations 1, 3, 5, 7 and the actual function.


