Next: About this document Up: CBR Assisted Explanation of Previous: Hypothesis Generation

Conclusions

We have tested the model on a series of problems including the DeJong functions, a pair of home-brewed deceptive problems, and the circuit design problem. Results show that our CBR tools are useful for understanding GA solutions in many different kinds of spaces as exemplified by our choice of test problems.

We have also tested a preliminary version of the hypothesis generator. Hypotheses generated by CBR tools can provide principled direction to otherwise less-productive GA search as well as help in the avoidance of deception. Although it is possible to damage the search through hypothesis injection, this risk can be reduced by meta-level application of CBR techniques. In the future, the hypothesis generator should track the progress of genetically engineered individuals and update its knowledge base. Even without meta-level knowledge, cautious injection of hypotheses proves profitable in some cases.

Combining GAs with CBR allows us to successfully attack the four problems listed in §1.1. Our system provides a tool with which GA researchers can explain discovered solutions while simultaneously discovering important aspects of the search space. The CBR module is able to make explicit the building blocks used by a GA. Analysis of the building blocks and the path taken by a GA through a search space provides a powerful explanatory mechanism. Results of the explanation can be used to guide post-processing in order to improve results in case of premature convergence or no convergence. The system also addresses some aspects of GA deception. The hypothesis module provides a mechanism by which deception can be avoided. If a space is found to be too deceptive for GA search, other search methods can be suggested based on the analysis of the space provided by the CBR module.

As well as providing a system for GA explanation and search space analysis, our system provides empirical evidence that the building block hypothesis is correct. As mentioned above, the hierarchical clustering of a set of GA individuals (distributed over several generations) according to fitness and genotype leads to nested schemas. Analysis shows that the subtrees with the most fit schemas receive the most reproductive energy while the least fit schemas are culled from the population. Although we knew that clustering on fitness and genotype should yield nested schemas (as a result of the BBH), we nonetheless tested several other clustering possibilities, including more generational information and parental information. Clustering on fitness and genotype was by far the most successful technique. The nested schemas that result from such a clustering provide a coherent index for the case-base.

The general ability to gain insight into a problem and the path a GA took while searching for its solution may widen the applicability of genetic algorithms to problems which were previously inappropriate by virtue of having relatively expensive evaluation functions. Because principled constraints can be applied to the search periodically, fewer fitness evaluations need be computed. Following confirmation that the case-base accurately models the space through testing, we can avoid expensive fitness function evaluation in favor of case-based estimation.

Acknowledgments

Our thanks to David Leake, Terry Jones, Jim Marshall and Melanie Mitchell who commented on early drafts of this paper. Also thanks to Gregory Rawlins whose GA seminar provided the impetus for this work.

References

Next: About this document Up: CBR Assisted Explanation of Previous: Hypothesis Generation

Sushil J. Louis
Wed Jun 25 14:10:37 PDT 1997