In early experiments, we were not very successful in applying case-based reasoning principles. First, when the environment was changed by deleting a whole job, the results were not good enough to show the advantages of using CIGARs. We found it hard for the GA with CBR to tackle problems where the environment changes significantly. CBR rests on the idea that stored cases resemble the new situation; since a job contains a series of different tasks, deleting a whole job means that, under our encoding, good schedules for the current problem may be very different from those for the previous one.
We also failed to get good results when we tried to save all, half, or a quarter of the best individuals from the previous run and insert them into the initial population for the new problem. Saving and re-using too many good individuals makes the GA converge quickly to local optima. In other words, too many injected individuals tipped the balance toward exploitation of the search space with not enough exploration. We obtain enough exploration when we seed only a small percentage of the population, that is, when we use only the two best strings and mutate the other six good intermediate strings from the previous run.
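A minimal sketch of this seeding policy follows; the function and field names, the binary encoding, and the mutation rate are illustrative assumptions, not details from our system:

```python
import random

def seed_population(pop_size, chrom_len, case_base, mutation_rate=0.1):
    """Build an initial population that injects a small number of saved
    cases and fills the rest with random individuals (binary encoding
    assumed for illustration)."""
    # Sort saved cases by fitness, best first.
    cases = sorted(case_base, key=lambda c: c["fitness"], reverse=True)

    population = []
    # Inject the two best strings unchanged.
    for c in cases[:2]:
        population.append(list(c["genes"]))
    # Mutate the next six good intermediate strings before injecting.
    for c in cases[2:8]:
        genes = [(1 - g) if random.random() < mutation_rate else g
                 for g in c["genes"]]
        population.append(genes)
    # Fill the remainder with random individuals to preserve diversity.
    while len(population) < pop_size:
        population.append([random.randint(0, 1) for _ in range(chrom_len)])
    return population
```

Only eight of the population slots come from the case base; the random remainder supplies the exploration discussed above.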
We can explain these results if we think of solutions as forming the
leaves of a tree. Adapting one solution into another now means backing
up to a common parent and then moving down the second solution's
branch. Injecting cases closer to a common parent on the tree shortens
the path to a similar problem's solution since less backtracking is
involved. The individuals generated by a genetic algorithm form such a
tree with increasing schema order toward the leaves [Louis et al., 1993].
This first simple model thus indicates that increasing problem
dissimilarity implies injecting cases of lower fitness, corresponding to
stages when the GA has fixed fewer positions in its population's
genotypes (lower-order schemas). Lower fitness correlates with time
(generation number) and with a position closer to the root of our solution
tree. Thus, if we don't have a good measure of problem similarity, we
can inject a set of cases with different similarities (saved at
different times during the search attempt on the previous problem) and let
selection cull those cases that are not useful. This also explains why
our method of choosing individuals to be injected works well. Since we
expect fitness to increase with time, choosing individuals from
intermediate generations allows us to inject individuals with varying
fitnesses into the initial population. If the new problem is very similar
to the previous one, injected high-fitness individuals (high fitness with
respect to the previous problem) corresponding to those saved during later
generations will probably augment genetic search. Other injected
individuals will quickly die off. On the other hand, if the new problem is
not very similar to the previous one, individuals saved from earlier
generations will proliferate and help the search while others don't
contribute as much.
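One way to realize this idea of cases with varying similarity is to snapshot the best individual at several evenly spaced points during the previous run and store every snapshot in the case base. The sketch below assumes a caller-supplied `evolve_step` function; all names are illustrative:

```python
def run_ga_and_save_cases(evolve_step, population, generations, snapshots=4):
    """Run a GA while periodically saving the current best individual.
    `evolve_step` is assumed to advance the population one generation
    and return (new_population, best_individual, best_fitness)."""
    case_base = []
    # Save at evenly spaced generations: earlier saves hold lower-fitness
    # individuals (lower-order schemas, nearer the root of the solution
    # tree), later saves hold higher-fitness ones.
    save_at = {generations * (i + 1) // snapshots for i in range(snapshots)}
    for gen in range(1, generations + 1):
        population, best, fitness = evolve_step(population)
        if gen in save_at:
            case_base.append({"genes": best, "fitness": fitness,
                              "generation": gen})
    return population, case_base
```

Selection then culls whichever saved cases turn out not to match the new problem.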
The genetic algorithm's robustness lets us use a much coarser
specification of problem distance. We only need to know that the new
problem is within a certain threshold distance of the previous one; we do
not need a more precise estimate. This threshold is problem dependent, but an
approximate distance measure is usually easier to obtain than an exact
measure. In any case, if none of the injected individuals turn out to be
useful, we would only have lost some time in evaluating these
individuals before they die out, and would not expect to be worse off
than starting with a completely random population.
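This coarse similarity test reduces to a thresholded distance check before seeding. The distance value and threshold below are placeholders for whatever approximate, problem-dependent measure is available:

```python
def should_inject(problem_distance, threshold):
    """Coarse similarity test: inject saved cases only when the new
    problem is within a threshold distance of the previous one."""
    return problem_distance <= threshold

def initial_population(random_pop, case_base, problem_distance, threshold):
    """Seed with saved cases when the problems look similar enough;
    otherwise fall back to a purely random initial population."""
    if should_inject(problem_distance, threshold):
        injected = [c["genes"] for c in case_base]
        # Replace only the head of the random population with injected
        # cases, keeping most random individuals for diversity.
        return injected + random_pop[len(injected):]
    return random_pop
```

Even when the check wrongly admits unhelpful cases, the worst outcome is the wasted evaluations described above.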
These results broadly show a tradeoff between performance, both in terms of speed and quality of solutions, and problem similarity. When we are more interested in quality, we should inject a small number of cases of varying quality into the initial population, leading to a better balance between exploration and exploitation of the search space by the genetic algorithm. The CIGAR system always starts off better, but the difference in performance usually decreases over time. The combined system should also be able to handle the injection of inappropriate cases; selection will quickly cull these from the population. We believe that injecting a large number of cases of varying similarity may prove counterproductive. The randomly initialized component of the population provides a much-needed diversity which, if diluted, will probably lead to quick convergence to local optima. More work needs to be done to resolve this issue.