Sushil J. Louis
Genetic Algorithm Systems Laboratory
Department of Computer Science
Introduction
Genetic algorithms (GAs) are randomized parallel search algorithms that search from an initially randomly generated population of points [13,11]. Current genetic algorithm based machine learning systems use rules to store past experience to improve their performance over time [13,11] and [4,5,17]. However, many application areas, especially in the design domain, are more suited to a case-based storage of past experience [24,14] and [30,10,2]. 1 We propose and describe a system that uses a case-base (or an associative memory) as a long term knowledge store in a new genetic algorithm based design system that learns from experience. Results from the configuration design of parity checkers and adders applicable to problems from engineering design and architecture, show 1) that our system, with experience, takes less time to evolve solutions to new design problems produces better quality designs, 2) the evolved designs are more similar to each other, and 3) a simple syntactic similarity metric can result in this performance improvement thus avoiding the problem of defining a domain dependent indexing scheme.
Typically, a genetic algorithm randomly initializes its starting population so that the GA can proceed from an unbiased sample of the search space. Instead, periodically injecting a genetic algorithm's population with relevant partial solutions to similar previously solved problems can provide information (a search bias) that reduces the time taken to find a quality solution. Our approach borrows ideas from case-based reasoning (CBR) in which old problem and solution information, stored as cases in a case-base, helps solve a new problem [27]. In our system, the data-base, or case-base, of problems and their solutions supplies the genetic problem solver with a long term memory. The system does not require a case-base to start with and can bootstrap itself by learning new cases from the genetic algorithm's attempts at solving a problem.
The case-base does what it is best at - memory organization; the genetic algorithm handles what it is best at - adaptation. The resulting combination takes advantage of both paradigms; the genetic algorithm component delivers robustness and adaptive learning while the case-based component speeds up the system and provides a means for biasing the structure of generated solutions.
These performance gains imply that fewer evaluations are needed to reach a specified design solution quality and that the organization deploying this system builds a reusable knowledge base of designs. Not only do these stored designs represent captured intellectual capital and design knowledge, they can be used to help perform sensitivity analysis, and improve the performance of a genetic algorithm based design tool [22]. The tendency to evolve more similar designs has implications for the reuse of existing components and the quick design of new hardware functionality from the reuse of existing components. For example, if we lose some functionality (component failure) or require new functionality from existing hardware, we can
The Case Injected Genetic AlgoRithm (CIGAR) system presented in this paper can be applied in a number of domains from computational science and engineering to operations research. We concentrate on configuration design problems, specifically the evolutionary design of combinational logic parity and adder circuits.
The next section describes CIGAR, we then discuss related work and point out the difference between problem and solution similarity. Section 5 applies the system to the parity and adder problems highlighting issues addressed by this work. The last section presents conclusions and directions for further research.
The system
Similarity between problems and their solutions can be defined over the space of problems or the space of solutions. Classic CBR systems operate by defining similarity over the space of problems and hence are problem or domain dependent. Coming up with a (problem) similarity metric is therefore an important issue in such systems. Genetic algorithms are usually applied to ``poorly-understood'' problems where domain knowledge is scarce and we would therefore find it even more difficult to find a good problem similarity metric. Other work describes systems that combine genetic algorithm's and case-based memory for domains where a similarity metric exists [19,18].
This paper concentrates on the more realistic case where domain knowledge is lacking and describes a system that uses similarity metrics over the solution space. For the problems in this paper with genetic algorithms solutions encoded as bit strings, we show that simple hamming distance suffices as our solution similarity metric for case-injected genetic algorithms.
When operating on the basis of solution similarity, CIGAR makes the assumption that similar solutions must have come from similar problems. 2 In this scenario, shown in Figure 1, we periodically inject a small number of solutions similar to the current best member (candidate solution) of the GA population into the current population, replacing the worst members. We call this the ``closest to the best'' strategy, one of several possible injection strategies. The GA continues searching with this combined population.
In our system, every individual generated during genetic search can be a case. Even a population size ofAs stated earlier, for the (traditional) binary genetic algorithm encoding of possible circuits, we use hamming distance as our similarity metric. What happens if our similarity measure is noisy and/or leads to unsuitable retrieved solutions? By definition, unsuitable solutions will have low fitness and will quickly be eliminated from the GA's population. CIGAR may suffer from a slight performance hit but will not break or fail - the genetic search component will continue making progress toward a solution.
One early attempt at reuse with search can be found in Ackley's work with SIGH [1]. Ackley periodically restarts a search in an attempt to avoid local optima and increase the quality of solutions. Eshelman's CHC algorithm, a genetic algorithm with elitist selection and cataclysmic mutation, also restarts search when the population diversity drops below a threshold [9]. Other related work includes Koza's automatically defined functions [16] and Schoenauer's constraint satisfaction method [28]. More recently, Ramsey and Grefenstette come closest to our approach and use previously stored solutions to initialize a genetic algorithm's initial population and thus increase a genetic algorithm's performance in an anytime learning environment that changes with time [26,12]. Automatic injection of the best solutions to previously encountered problems biases the search toward relevant areas of the search space and results in the reported consistent performance improvements. Sheppard and Salzburg combined memory-based learning with genetic algorithms in finding better plans for the pursuer-evader game within a reinforcement learning framework [29]. Their experiments indicate that the combined approach performed better than either approach alone. To our knowledge, the earliest work in combining genetic algorithms and case-based reasoning was done by Louis, McGraw, and Wyckoff who used case-based reasoning principles to explain solutions found by genetic algorithm search [22]. Preliminary work in this area solved pairs of similar problems with a clear performance advantage for the combined system [19]. Work that uses case-based reasoning in extracting design patterns, short term memory in genetic programming, Robert Reynold's cultural algorithms, and circuit design that may be relevant are [25,15] and [3,31].
These approaches only attack a single problem not a related class of problems. Moreover, these approaches do not relate problem/solution similarity to the quality of injected solutions and performance - a fundamental result of this paper.
Work on multitask learning suggests, as we do, that it is easier to learn many tasks at once, rather than to learn these same tasks separately [7,8] and [6]. Multitask learning can be applied to clusters of related tasks in parallel or in sequence and provides an inductive bias that often leads to better generalization performance on the tasks. While MTL addresses generalization, we lean toward improving search performance - in design domains, a mapping from function to structure - for sequential (design) tasks. Parallel genetic algorithms using the island model, provide one possible avenue toward information exchange on related tasks in parallel for genetic algorithm based machine learning systems. This is an area to be further explored.
Lifelong learning seeks to address the problem of knowledge transfer between related tasks in the context of learning control algorithms for robotics [32,33]. The authors believe that knowledge transfer is essential for scaling robot learning algorithms to more realistic and complex domains. Lifelong learning differs from our approach in that they are interested in an incremental learning situation seeking to build behavioral complexity with experience. Work by Louis provides strong support for the CIGAR approach to control problems in robotics thus indirectly and independently providing more evidence for the utility of lifelong learning [21].
What do we mean by problem similarity? What is the difference between problem similarity and solution similarity? The next section provides concrete answers to these questions in the context of combinational logic design.
Indexing, or how we define similarity, is a basic issue in all the systems described above. Previous work has dealt with the simpler case where a similarity metric exists in the problem space [20,26]. Simply put, when we know that two problems are similar, the system can use information from attempting one problem to improve performance on the other. However, a problem similarity metric is not easy to come by and remains a major issue for case-based reasoning systems. Genetic algorithms are supposed to be suitable for ``poorly-understood'' problem, this implies that defining a problem similarity metric over a poorly-understood problem domain will be difficult or impossible. In this paper, we study the more realistic case where we do not need a similarity measure on the problem space. Since genetic algorithms solutions are encoded as binary or real strings, purely syntactic similarity measures lead to surprisingly good performance - a needed property for applications to poorly-understood systems.
Combinational circuit design is an example of a configuration design problem. The general problem can be stated as: Given a function and a target technology to work within, design an artifact that performs the function subject to constraints. For adders, the function is addition and the target technology is combinational logic gates such as boolean AND and OR gates. For parity checkers the function is parity checking and it uses the same set of logic gates. A genetic algorithm can be used to design combinational circuits as described in [23].
A five-bit parity checker is one of
different five-input one-output combinational circuits (boolean
functions). If we are trying to solve this class of problems, one way
of indexing (defining similarity of problems) can be as follows:
Concatenate the output bit for all possible 5-bit input combinations
counting from
through
in binary. This results in a binary
string of output bits,
, of length
. Strings that are one
bit different from
define a set of boolean functions, as do
strings that are two bits different and so on. This way of naming
boolean functions provides a distance metric and indexing mechanism
for the combinational circuit design problems that we consider in this
paper. That is, we have a metric for measuring problem
similarity.
Problems are constructed from the parity problem by randomly changing
bits in the output bit string. The fitness of a candidate circuit is
the number of correct output bits. Thus
-bit candidate parity
checkers would have a maximum fitness of
.
As a simple example, consider the 3-bit parity checker shown in
Table 1. The first column in Table 1
shows all possible inputs for a 3-bit parity checker, the second column
shows the correct output for each input. We construct a 3-input,
1-output problem that is similar to the 3-bit parity checker by
randomly choosing and flipping one of the output bits of the 3-bit
parity checker as shown by the boxed
in the table. The correct
output bits for the new problem (1 bit different from the 3-bit parity
checker) are shown in the third column. This new problem is distance one (1)
away from the parity problem in terms of our problem similarity
metric.
|
For the
-bit adders considered in this paper, we can define an
equivalent similarity metric. Concatenate the output bits for all
possible pairs of 2-bit input combinations. This is equivalent to the
possible combinations of
bits. Since each of these input
combinations has a 3-bit output, we get a total of
length output string,
. Once we define the correct
-bit
output string for a 2-bit adder, we can construct similar problems by
randomly changing bits in the output bit string. A 2-bit adder is
therefore one of
different 4-bit input, 3-bit output
combinational logic circuits. The fitness of a candidate circuit is
also the number of correct output bits - if a circuit gets the right
output for all possible inputs it is a correct circuit. Thus
-bit
input,
-bit output problems (including the
-bit adder) would
have a maximum fitness of
.
Since a solution similarity metric depends on how solutions/circuits are encoded, we describe our representation - a traditional genetic algorithm binary string. An individual in the GA's population encoded as a bit string maps to a two-dimensional circuit as shown in figures 2 and 3. We use four (
The solution similarity metric needs to measure the distance between encoded circuits (solutions). We are assuming that similar circuits have similar function. Since we use a binary string to encode combinational circuits, the hamming distance between these bit strings suffices as our similarity metric on this problem. We show that this simple similarity metric works well for these combinational logic design problems. Real number strings could use Euclidean distance.
For both problems we developed problem generators to produce
problems that were similar - the degree of similarity was
configurable. We configured these generators to produce a sequence of
similar problems - this number was at the limit of our
computational capacity in obtaining reasonable turnaround times.
Using the parity problem as a basis, we generate
problems that
are similar in terms of our problem similarity metric by randomly
choosing and flipping
(uniform distribution) of the output
bits of the 6-bit parity checker's
-bit output bit string.
CIGAR uses a population of size
run for a maximum of
generations to solve these
-bit combinational circuit design
problems similar to parity checkers. This results in a
length
chromosome (6 rows, 5 columns, 5 bits per gate) with a corresponding
search space of
. Note that the search space remains the same
for all
problems. The probability of crossover is
and the
probability of mutation is
.
In all cases, we replace the worst individuals in the population with
the individuals chosen by the injection strategy. This paper use the
probabilistic closest to the best injection strategy where the
probability of being chosen for injection is directly proportional to
similarity (inversely proportional to hamming distance). We use the
roulette wheel method described in Goldberg to implement this
probabilistic strategy [11]. We chose the injection
interval, the number of generations between injections, to be
where
is the population size. This formula
reflects the takeover time when using CHC selection [9].
We inject three cases into the population (
of the population
size) every
generations. Previous work
explains these parameter values [19]. All plots are averages
over ten (10) runs.
Figure 4 compares performance in terms of time taken to find a solution for CIGAR and a randomly initialized genetic algorithm that uses the same GA parameters. The figure plots time taken to find the best solution on the vertical axis and the number of problems solved on the horizontal axis.
The Randomly Initialized Genetic Algorithm (RIGA) uses the same set of parameters. Figure 5 plots the fitness of the best solution found on the vertical axis and number of problems solved on the horizontal axis. These figures show that CIGAR takes less time to produce better quality solutions as it gains experience from attempting more problems - CIGAR learns from experience.Figure 6 shows average fitness across all problems as a function of the number of generations. CIGAR, as expected, does much better than the randomly initialized genetic algorithm.
We look at the similarity among solutions produced by CIGAR by showing the frequency distribution of hamming distances among solutions. Figure 7 shows that CIGAR has significantly more solutions that are closer in hamming distance (more similar) than the randomly initialized genetic algorithm. The inset highlights the difference.
Another way of looking at the hamming similarity among solutions is to calculate the average hamming distance position by position along the chromosome. Figure 8 plots the average hamming distance at each position along chromosome for CIGAR and RIGA. The figure shows that the hamming distances at many positions along the chromosome are much lower for CIGAR. CIGAR solutions tend to be more similar.
Using the adder problem as a basis, we randomly generate
problems
that are similar in terms of our problem similarity metric by randomly
flipping
(16 - 33 percent) of the output bits of the 2-bit
adder's
-bit output string. CIGAR uses a population of size
run for a maximum of
generations to solve
-bit input,
-bit
output combinational circuit design problems that are similar to
adders. With our representation, the adder problem tends to be more
difficult for the genetic algorithm because of the carry bit. We use a
length chromosome (4 rows, 5 columns, 5 bits per gate) with a
corresponding search space of
. We inject
cases (
of population size) into the population every eight generations.
Figure 9 compares performance in terms of time taken to find a solution for CIGAR and a randomly initialized genetic algorithm. The figure plots time taken to find the best solution on the vertical axis and the number of problems solved on the horizontal axis.
|
Figure 11 shows average fitness across all adder-like problems as a function of the number of generations. CIGAR, as expected, does much better than the randomly initialized genetic algorithm. Note that one of effects of periodic injection is to temporarily reduce the fitness immediately after the injected generation.
Figures 12 and 13 plot the distribution of similarities among solutions and along chromosome positions for the chromosomes produced by CIGAR and the randomly initialized GA.
Once again, the figures show the same qualitative results. Figure 13 shows that the hamming distances at many positions along the chromosome are much lower for CIGAR. CIGAR solutions tend to be more similar.Conclusions and Future Work
We described a system that combines a genetic algorithm with case-based reasoning and showed that the case-injected genetic algorithm (CIGAR) learns from experience to improve performance. We used the combinational logic design of circuits as a scalable, well-understood, test problem.
CIGAR makes the assumption that similar problems have similar solutions. According to the schema theorem, genetic algorithms process syntactic string similarities and we have shown that storing and injecting syntactically similar solutions leads to increased performance with experience. We also showed that the simple syntactic similarity measure of hamming distance works well on combinational circuit design problems. It is important to note that we need to save (and re-use) to the case-base, not just the best or final solution to the problem at hand, but also intermediate solutions that the GA generates while working on a problem. We also find that the more similar the problems the larger the difference in performance between CIGAR and RIGA.
When comparing similarities among solutions produced by CIGAR and a randomly initialized GA, we find that CIGAR solutions tend to be more similar and share alleles at many locii on the chromosome. This provides evidence for the hypothesis that a case-injected genetic algorithm is a useful tool for evolutionary hardware. When there are multiple designs that realize the same functionality, we can use CIGAR to bias genetic search to quickly evolve (hardware) design that are similar to other designs supported by underlying components. This has implications for fault tolerance, quick reconfiguration, and component reuse.
We are currently investigating the effect of different injection strategies, analyzing the similarities and differences in evolved circuits, and in parallelizing the code.
This document was generated using the LaTeX2HTML translator Version 2002-2-1 (1.71)
Copyright © 1993, 1994, 1995, 1996,
Nikos Drakos,
Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999,
Ross Moore,
Mathematics Department, Macquarie University, Sydney.
The command line arguments were:
latex2html -split 1 ehc
The translation was initiated by Sushil Louis on 2004-11-20