Given a 100-bit combination lock, finding the correct combination by brute force over every possible combination would take a very long time. However, given a "black box" that tells us how close a guess is to the correct combination, we can search for ways to get close to, or to find, the correct combination of bits.
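To put a rough number on "a very long time", the short sketch below counts the combinations of a 100-bit lock and estimates the worst-case search time. The rate of one billion guesses per second is purely an assumption chosen for illustration, and the function name is hypothetical.

```python
# Assumed (hypothetical) test rate: one billion combinations per second.
GUESSES_PER_SECOND = 1_000_000_000
SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def brute_force_years(num_bits, guesses_per_second=GUESSES_PER_SECOND):
    """Worst-case years needed to try every combination of num_bits bits."""
    combinations = 2 ** num_bits
    return combinations / guesses_per_second / SECONDS_PER_YEAR


print(f"{2 ** 100} possible combinations")
print(f"about {brute_force_years(100):.2e} years at the assumed rate")
```

Even at that optimistic rate, exhaustive search is far out of reach, which is why the feedback from the black box matters.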
To attack this problem, I implemented a genetic algorithm (GA) similar to the one in Chapter 1 of Genetic Algorithms in Search, Optimization & Machine Learning by David E. Goldberg. In this approach, I first generated a pool of candidate solutions at random.
From there, I ran the provided "black box" function, eval(...), on each of the 100-bit strings to find its fitness value. The fitness values were totaled, and each parent gene's share of the total fitness was computed as a percentage. Based on that share, a gene was mated with other high-fitness genes to create a new pool of crossover genes. Each new gene contained part of each parent: a random crossover point between 1 and length - 1 was selected, and the parents were merged at that point.
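The selection and crossover steps described above might look roughly like the following. This is a minimal sketch, not the original implementation: it assumes genes are stored as strings of '0' and '1', and the function names select_parent and crossover are hypothetical.

```python
import random


def select_parent(population, fitnesses, rng=random):
    """Pick one gene with probability proportional to its share of total fitness."""
    total = sum(fitnesses)
    if total == 0:
        # No gene has any fitness yet, so fall back to a uniform choice.
        return rng.choice(population)
    spin = rng.random() * total  # a point in [0, total)
    running = 0.0
    for gene, fit in zip(population, fitnesses):
        running += fit
        if spin < running:
            return gene
    return population[-1]  # guard against floating-point round-off


def crossover(parent_a, parent_b, rng=random):
    """Single-point crossover at a random point between 1 and length - 1."""
    point = rng.randint(1, len(parent_a) - 1)  # randint is inclusive at both ends
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b
```

Restricting the crossover point to 1 through length - 1 guarantees that each child takes at least one bit from each parent.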
Once the genes were mated and a new gene pool was created, every 100-bit string in the mating pool was run through a mutation function, which decided at random, with a fixed probability, whether a random bit in the gene should be flipped. This probability should be low, since mutation serves only to guard against the loss of genetic material during mating and crossover.
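A mutation step of this kind could be sketched as follows. It assumes the probability applies once per gene and that at most one bit is flipped, which is one reading of the description above; the function name mutate and the string representation of genes are assumptions.

```python
import random


def mutate(gene, rate=0.01, rng=random):
    """With probability `rate`, flip one randomly chosen bit of `gene`."""
    if rng.random() >= rate:
        return gene  # most genes pass through unchanged
    i = rng.randrange(len(gene))
    flipped = "1" if gene[i] == "0" else "0"
    return gene[:i] + flipped + gene[i + 1:]
```

For example, mutate("0" * 100, rate=1.0) always returns a string with exactly one '1', while rate=0.0 never changes the gene.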
This process repeats for a specific number of generations, after which the best string found is printed out.
For each data set, I ran 1000 generations with a gene pool of 1000 genes. The mutation rate was set to a very low probability of 0.01. With higher mutation rates, the algorithm took much longer to reach a good solution.
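Putting the steps together, the whole loop could be sketched as below. This is an illustration under stated assumptions rather than the code that produced the results: the provided eval(...) lives in an object file, so a hypothetical stand-in fitness function (the number of bits matching a hidden random target) is used instead, and all names here are invented for the example. The defaults mirror the settings above (1000 genes, 1000 generations, mutation rate 0.01).

```python
import random

BITS = 100


def make_stand_in_eval(rng):
    """Hypothetical stand-in for the provided eval(...): counts the bits
    that match a hidden random target. The real black box may differ."""
    target = [rng.randint(0, 1) for _ in range(BITS)]

    def stand_in_eval(gene):
        return sum(1 for g, t in zip(gene, target) if g == t)

    return stand_in_eval


def run_ga(fitness, pool_size=1000, generations=1000, mutation_rate=0.01, seed=None):
    """Run the GA and return (best gene found, its fitness)."""
    rng = random.Random(seed)
    pool = [[rng.randint(0, 1) for _ in range(BITS)] for _ in range(pool_size)]
    best, best_fit = None, -1
    for _ in range(generations):
        scores = [fitness(gene) for gene in pool]
        for gene, score in zip(pool, scores):
            if score > best_fit:
                best, best_fit = gene[:], score
        # Fitness-proportionate selection; uniform if every score is zero.
        weights = scores if sum(scores) > 0 else None
        next_pool = []
        while len(next_pool) < pool_size:
            a, b = rng.choices(pool, weights=weights, k=2)
            point = rng.randint(1, BITS - 1)  # crossover point in 1..length-1
            for child in (a[:point] + b[point:], b[:point] + a[point:]):
                if rng.random() < mutation_rate:
                    child[rng.randrange(BITS)] ^= 1  # flip one random bit
                next_pool.append(child)
        pool = next_pool[:pool_size]
    return best, best_fit


if __name__ == "__main__":
    # Smaller settings than in the text, so the demonstration finishes quickly.
    stand_in = make_stand_in_eval(random.Random(1))
    best, fit = run_ga(stand_in, pool_size=200, generations=100, seed=1)
    print("".join(map(str, best)), fit)
```

Because the stand-in fitness differs from the provided functions, the fitness values and convergence speed of this sketch should not be expected to match the results reported here.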
Using the eval function in the "eval1linux.o" object gave a best fitness of 101, from the 100-bit string below:
1010101010111111111111111111111111111111111111111100000000000000000000000000000000000000000000000000
The eval function in the "evallinux.o" object gave a best fitness of 100, from the 100-bit string below:
1111111111111111111111111111111111111111111111111100000000000000000000000000000000000000000000000000
Generation-by-generation data is available at the following links: [evallinux raw data] [eval1linux raw data]
This algorithm ran very quickly: 1000 generations took only 20 seconds on a dual Pentium III 733 MHz machine. Though 1000 generations were run, the correct solution was found after about 250 generations with both the evallinux and eval1linux functions. The implementation was simple and straightforward to create. Since the fittest strings were used to create each new generation, the correct solution rapidly became apparent.
The downside of using a GA is that it may find a good solution without ever finding the best one. In the case of my algorithm, the best solution was found. For a more complicated problem with fewer clock cycles available, a good solution would still be found, but it may not be the best.