Preliminary work on hypothesis generation focuses on the use of the heuristics sketched in the Methodology section (§5). The convex hull of each subtree selected by the method outlined there contains a very large number of possible individuals that have never been evaluated. The hypothesis module selects specific individuals from within the hull of the most promising subtree and instantiates them for participation and evaluation within the GA population. Only a small number of generated individuals are injected in any given generation.
Since individuals that fit within the reduced hull of the elected schema (discussed in §5) may be considered more prototypical of the subtree than those that do not, the elected schema provides a better template for the generator than the regular schema does. As explained below, hypotheses are in fact drawn from this reduced hull.
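As an illustration (not code from our implementation), a schema over bit strings can be written in the usual GA notation as a string over {0, 1, *}; an individual lies in a schema's hull when it matches every fixed position, and the reduced hull is simply the hull of the elected schema, which fixes more positions than the regular schema.
\begin{verbatim}
def fits(individual: str, schema: str) -> bool:
    """True when the individual matches every fixed (non-'*') position."""
    return all(s == '*' or s == b for s, b in zip(schema, individual))

# The elected schema fixes a superset of the regular schema's positions,
# so its (reduced) hull is contained in the regular schema's hull.
regular = "1**0****"
elected = "1*10**0*"
member  = "10100100"
assert fits(member, elected) and fits(member, regular)
\end{verbatim}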
Our motivations for including a hypothesis generator are several. If hypothesis-produced individuals prove better on average than GA-produced individuals, the GA will be accelerated. Such individuals also provide evidence that the solutions being produced are reasonably well understood. Further support is furnished by the identification of building blocks.
Finally, consider cases where a small but strong cluster is discovered but quickly swamped out by outside evolutionary pressures. Injection of some new members of the ``lost'' subspace into the population would seem wise, as the undersampled space may deserve further investigation.
To assess the ability of the hypothesis generation module to enhance the system, we compared the overall performance of the GA without hypothesis injection against that of a GA with hypothesis injection, both averaged over ten runs. The baseline case was a GA running with a generation gap, where only a fraction of the population is replaced each generation by new individuals. In a GA with a generation gap, individuals may survive in the population for many generations, possibly being used for crossover more than once. Note that cases are constructed only for new individuals.
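As a minimal sketch of the generation-gap scheme, assuming truncation survival and a breed() callback standing in for the usual selection, crossover, and mutation pipeline (both are illustrative assumptions, not our implementation):
\begin{verbatim}
def generation_gap_step(population, fitnesses, gap, breed):
    """Replace only a fraction `gap` of the population with new
    individuals; the rest survive and may be crossed over again."""
    n_new = int(gap * len(population))
    ranked = sorted(range(len(population)),
                    key=lambda i: fitnesses[i], reverse=True)
    survivors = [population[i] for i in ranked[:len(population) - n_new]]
    return survivors + [breed() for _ in range(n_new)]
\end{verbatim}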
Since the case-adder module is not yet implemented, it was not possible to maintain the case-base incrementally, so injections in the experiments were perforce punctuated. In the first experiment, 1200 individuals were evaluated in total, with case-bases created in the same manner as before but for three disjoint sets of 400 individuals each. The first case-base was created as soon as the first 400 initial cases had accumulated. Instead of allowing crossover and mutation to provide the partial population replacement in the immediately following generation, however, the hypothesis generator was employed. Following the application of our set of heuristics, which identify the most promising subspace, individuals from within the reduced hull (of the elected schema) were generated at random and injected into the population. This procedure was repeated for the second set of 400 initial cases when they became available, and a final case-base for user interaction was created at the end of the run.
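The generation step for this experiment can be sketched as follows, reusing the bit-string schema representation above (the schema and the injection count are illustrative):
\begin{verbatim}
import random

def instantiate(elected_schema: str, rng: random.Random) -> str:
    """Experiment 1: copy the fixed alleles of the elected schema and
    set each free ('*') allele at random, giving a member of the
    reduced hull."""
    return ''.join(rng.choice('01') if s == '*' else s
                   for s in elected_schema)

rng = random.Random(0)
injected = [instantiate("1*10**0*", rng) for _ in range(10)]
\end{verbatim}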
The second experiment had the same structure as the first, but for hypothesis generation, copies of previously evaluated individuals that fit within the regular schema of the subspace were selected at random and drawn into the reduced hull by flipping all rogue alleles. These massaged individuals retain the free alleles (those not fixed in the elected schema) of their originals and so exhibit less variation than those produced for the first experiment, where the free alleles were set at random.
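Under the same representation, the rogue-allele flipping of the second experiment reduces to overwriting each fixed position with the elected schema's value (a sketch, not our implementation):
\begin{verbatim}
def draw_into_hull(individual: str, elected_schema: str) -> str:
    """Experiment 2: flip every rogue allele (one disagreeing with a
    fixed position of the elected schema) while keeping the
    original's free alleles intact."""
    return ''.join(b if s == '*' else s
                   for s, b in zip(elected_schema, individual))
\end{verbatim}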
In each of the solution spaces searched by the GA, hypothesis injection on average provided detectable and reproducible improvements. Even though injection was performed only twice per experiment, Hamming distance to the optimal solution was decreased by more than ten percent of the problem size over the baseline case. In several instances, the solution was discovered within one generation of the injection. We say a GA has converged when the average Hamming distance between individuals in the population stabilizes. In both experiments the GA converged more quickly and to better solutions when hypothesis injection was used.
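The convergence measure can be computed as below; the pairwise averaging is one straightforward reading of the criterion.
\begin{verbatim}
from itertools import combinations

def hamming(a: str, b: str) -> int:
    return sum(x != y for x, y in zip(a, b))

def mean_pairwise_hamming(population: list) -> float:
    """The GA is taken to have converged when this average
    stabilizes from one generation to the next."""
    pairs = list(combinations(population, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)
\end{verbatim}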
The speed improvement was most pronounced in the second experiment, though the injected individuals there also exhibited less variance, and the final solution was less markedly improved.
Overall, the best solutions after 1200 fitness evaluations were products of the first experiment. There, injected individuals sampled promising subspaces with sufficient breadth to eventually locate superior solutions. Note that the abstract case defining the chosen subspace may include data from previous generations. As the injected individuals were prototypical of that case (from which the population may have since wandered), they may markedly increase the population's diversity. This diversity forestalls convergence only temporarily: following exploration, the superior individuals proliferate.
Clearly, hypothesis generation can be implemented more satisfactorily once the case-adder module is in place. Much cleaner experimentation could be run given a continuously maintained case-base, and hypotheses could then be injected at smoothly varying rates better tailored to the state of the search.
The hypothesis module should track the performance of its injected individuals and adjust its hypotheses accordingly. The successes and failures of genetically engineered individuals can be used to make decisions regarding the validity of recent hypotheses and the rate at which they are to be injected.
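One plausible adjustment rule, with illustrative bounds and step sizes (none of these constants come from our experiments):
\begin{verbatim}
def adjust_rate(rate, injected_fitness, population_fitness):
    """Raise the injection rate when injected individuals beat the
    population's mean fitness; back off otherwise."""
    mean = lambda xs: sum(xs) / len(xs)
    if mean(injected_fitness) > mean(population_fitness):
        return min(0.2, rate * 1.5)   # keep injection a small fraction
    return max(0.01, rate * 0.5)
\end{verbatim}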
If a selected promising subspace were in fact small, the generator could relax the constraint that injection candidates be drawn into the reduced hull. In such cases, the generator could choose to inject some individuals from a small distance outside the hull.
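Such near-hull sampling might look as follows: instantiate a reduced-hull member and then flip a few of its fixed positions, so the hypothesis lies a small Hamming distance outside the hull (a sketch under the bit-string assumptions above).
\begin{verbatim}
import random

def sample_near_hull(elected_schema: str, n_flips: int,
                     rng: random.Random) -> str:
    """Generate a hull member, then flip a few fixed positions so the
    result sits just outside the reduced hull."""
    bits = [rng.choice('01') if s == '*' else s for s in elected_schema]
    fixed = [i for i, s in enumerate(elected_schema) if s != '*']
    for i in rng.sample(fixed, min(n_flips, len(fixed))):
        bits[i] = '1' if bits[i] == '0' else '0'
    return ''.join(bits)
\end{verbatim}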
The hypothesis generator should also take into account other promising areas of the search space. By discovering the important building blocks in a space (perhaps in different subtrees) and combining them, the hypothesis module could speed the search considerably. Though higher-level abstract case schemata are unlikely to have fixed bits, our elected schemata may have quite a few. By adjusting the thresholds used to compute the elected schema, we may be better able to identify building blocks. After producing a hypothesis as in the initial experiments, we could then alter it to include the hypothesized building block of some other strong subtree, which may have been explored at a different time.
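Combining building blocks from two subtrees could amount to overlaying the fixed bits of their elected schemata; how to resolve conflicting fixed bits is an open choice, and the fallback to '*' below is only one option.
\begin{verbatim}
def combine_blocks(schema_a: str, schema_b: str) -> str:
    """Overlay the fixed bits of two elected schemata (hypothesized
    building blocks from different subtrees)."""
    out = []
    for a, b in zip(schema_a, schema_b):
        if a == b or b == '*':
            out.append(a)
        elif a == '*':
            out.append(b)
        else:
            out.append('*')  # fixed bits disagree: leave position free
    return ''.join(out)
\end{verbatim}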
As very strong but suboptimal building blocks may have swamped the population quickly, any final solution containing building blocks which were evident particularly early in the evolution could be genetically engineered to check for parasitic alleles.
Concerning the case-adder itself, we propose placing new cases in the case-base as special children of the closest subtree identified by our methodology. The initial classification task could therefore be performed by any of a variety of existing algorithms. When the weight of a subtree became excessive, we would invoke the clustering algorithm only on the leaves of that subtree. This limits the cost of reorganization and allows continuous use of the first-level classifier. When a new case was assigned to the reclustered tree, an additional level of classification would be performed. To save space, we could discard some of the lower-level parts of a subtree, since the abstractions being used have already been computed.
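A sketch of the proposed insertion policy, where closest() stands in for the first-level classifier and recluster() for the clustering algorithm (both hypothetical names, as the module is not yet implemented):
\begin{verbatim}
from dataclasses import dataclass, field

@dataclass
class Subtree:
    schema: str                            # abstract case at this node
    cases: list = field(default_factory=list)
    children: list = field(default_factory=list)

def add_case(root, case, max_weight, closest, recluster):
    """Attach a new case as a special child of the closest subtree;
    once a subtree's weight grows excessive, recluster only its own
    leaves so the first-level classifier keeps running undisturbed."""
    node = closest(root, case)
    node.cases.append(case)
    if len(node.cases) > max_weight:
        node.children = recluster(node.cases)
        node.cases = []
\end{verbatim}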