Research
The paper in progress
Simulator Code - Installation
GA Code
Effects Of Case Injection Towards Human Learning

The ability of the GA to maintain injected knowledge over time.
As the number of evaluations increases, both of the GAs without fitness biasing eventually lose the injected information, settling on the optimum.
However, with fitness biasing we can effectively maintain injected knowledge in all but 2 runs.

Different RCs produced by the GA playing the game without case injection, with case injection, and with case injection plus fitness biasing

Same graph, displaying only the RCs produced without injection

Same graph, displaying only the RCs produced with injection
Same graph, displaying only the RCs produced with injection and fitness biasing
Screenshots
A pair of platforms returning from a mission
A shot of the refinery polluting our precious air
A nice overhead shot of a case injected mission
Movies - Gen 4
Movie showing the situation
Movie showing the attack situation
Movies - Gen 3
Movie 1 - The initial scenario
Movie 2 - The GA's best allocation against the scenario
Movie 2 - A popup occurs, but no replanning happens
Movie 2 - Same popup, but the GA replans around it
A movie of me playing with a parameter to the router, namely the coefficient by which a threat's radius is multiplied
The research revolves around techniques used by AI systems in a broad category
of games. It has to do with the broad spectrum of knowledge about a game, including
learning from and teaching human opponents, coevolving against itself to form new
knowledge, and being used in decision assistance systems.
Take any game that can be resolved into the following set of components:
- A discrete set of actors (some entities that act or are acted on).
- Sets of actions for each actor to perform (walk, talk to, throw rocks at,
shoot self).
- Non-infinite time interval (any game that has an eventual outcome).
- A comparable outcome (win/loss).
- Should be a repeated game to minimize stochastic effects.
- Game outcome should be mostly dependent on an initial plan (planning routing
for a RVS game instead of a Quake melee).
This is translatable into a chromosome string that can be encoded into a GA.
Think of it as a mapping between actions and actors. If you assign a group
of actions to each actor, you can then give each action a set of bits determining
an enumeration of other actors. In essence you presuppose Bob is going to talk
to someone; the GA has a fixed group of bits for that action which determines who
he talks to. You also include a null mapping (don't talk to anyone) and
a way of determining order (easy to do blindly, though there are lots of easy optimizations).
This is then parsable into an actual plan of actions for each actor, which
can then be run through the game. The game produces an outcome which is converted
into a fitness value for that plan, and the usual GA operators can continue.
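The encoding described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual GA code: the names gene_width and decode are hypothetical, code 0 is assumed to be the null mapping, out-of-range codes are assumed to wrap around, and ordering is done "blindly" by following chromosome order.

```python
def gene_width(n_actors):
    """Bits needed to enumerate every actor plus one null code (code 0)."""
    return max(1, n_actors.bit_length())


def decode(chromosome, actors, actions):
    """Turn a list of 0/1 bits into a plan of (actor, action, target) steps.

    actions maps each actor name to the list of actions it may perform.
    Each (actor, action) pair owns a fixed group of bits that selects a target.
    """
    width = gene_width(len(actors))
    n_genes = sum(len(actions[a]) for a in actors)
    if len(chromosome) != n_genes * width:
        raise ValueError("chromosome length does not match the encoding")

    plan = []
    pos = 0
    for actor in actors:
        for action in actions[actor]:
            bits = chromosome[pos:pos + width]
            pos += width
            code = int("".join(str(b) for b in bits), 2)
            # Assumption: codes beyond the last actor wrap around.
            code %= len(actors) + 1
            if code == 0:
                continue  # null mapping: the actor skips this action
            plan.append((actor, action, actors[code - 1]))
    return plan  # blind ordering: steps follow chromosome order


actors = ["bob", "alice", "carol"]
actions = {"bob": ["talk_to"], "alice": ["talk_to", "throw_rock_at"], "carol": []}
print(decode([1, 0, 0, 0, 1, 1], actors, actions))
```

In this toy example Bob talks to Alice, Alice's talk_to gene decodes to the null mapping, and Alice throws a rock at Carol. The resulting plan is what would be handed to the game to produce a fitness value.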
If you can split the game into sides, each side can run a GA to produce a highly
effective plan versus any given enemy plan. Both sides can continue to update
their plans against their enemies, and hopefully continue to produce more and
more effective plans. This is a coevolutionary arms race. This leads to several
problems, though:
- It is usually possible for your opponent to change plans in a game if things
are not going well.
- It is possible for both sides to specialize excessively against each other's
plans, without learning any knowledge applicable to general game playing.
1) is counterable with in-game reallocation. When the opponent changes plans
during a game, spin off a set of subproblems onto other machines. These run
similarly to the above, only they play just the new situation, and they
return a good course of action for you to follow. Since the original plans that
leave open the most room for improvisation will do better here, the original
plan will still evolve successfully, only it will evolve plans that are both
effective on their own and capable of changing to meet surprises.
2) is a traditional arms race problem. Tournament-based selection works well
here (make each candidate play against a selection of past enemy plans instead of
just the current best).
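As a rough illustration of the idea in 2), a candidate can be scored against a random selection of archived enemy plans rather than only the current best one. This is a minimal sketch under assumptions: archive_fitness, the archive list, and the play callback (which stands in for running the game and returning a numeric outcome) are hypothetical names, and averaging the outcomes is just one possible way to combine them.

```python
import random


def archive_fitness(candidate, archive, play, sample_size, rng=random):
    """Average outcome of a candidate against a sample of past enemy plans.

    archive: list of enemy plans kept from earlier generations.
    play: function (candidate, enemy_plan) -> numeric outcome for the candidate.
    """
    if not archive:
        raise ValueError("archive must contain at least one enemy plan")
    # Sample without replacement; use the whole archive if it is small.
    opponents = rng.sample(archive, min(sample_size, len(archive)))
    return sum(play(candidate, enemy) for enemy in opponents) / len(opponents)


# Toy usage: plans are plain numbers and the outcome is their difference.
past_enemy_plans = [1, 2, 3]
print(archive_fitness(10, past_enemy_plans, lambda c, e: c - e, sample_size=3))
```

Scoring against several past plans makes it harder for a candidate to do well by specializing against a single opponent.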
The game I am playing is one I wrote; it's a strike force simulation. There
are two sides in direct opposition to each other, but with very different responsibilities.
- Red - Red's actors are a set of platforms (aircraft) and a set of assets (weapons)
loaded onto the platforms. It determines where each aircraft goes, and which
weapons it fires at which targets. Each weapon has a different effectiveness
against each kind of target. Its general idea is to maximize the damage done
to the most valuable targets while minimizing the risk to its platforms.
- Blue - Blue's actors are mainly stationary ground assets. It has a set of
targets, all of which have some value, that Blue is supposed to defend. It
also has a set of threats (AA installations, radar, etc.) that it places in
order to best defend its targets. Targets are generally stationary, as are
the larger threats (large radar installations). It is at a disadvantage in
terms of information: Red generally knows the location of the vast majority
of Blue's forces. Its main idea is to allocate its movable assets in order to
present the best defense versus Red's attackers. It also has the capability
of in-game changes: it can place defenses that will only go active in the
middle of a scenario, e.g. a mobile installation that doesn't activate until
attackers have already moved in above or behind it. This presents a huge set
of strategic options for Blue, including ambushes and traps. Its goal is to
defend its targets, both by minimizing damage to them and by destroying attackers.
GA Analysis
Encoding

Fitness Function

Fitness for the individuals is currently relative to the red team.
Fitness = damage done - damage received.
Damage done = sum of targets' values * their probability of being destroyed.
Damage received = sum of platforms' values * their probability of being destroyed.
Probability of being destroyed = multiple of the effectivenesses of the weapons fired at it * the chance of their firer surviving to that point in time.
Chance of surviving at t is 1 - the chance of being destroyed by time t.
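The fitness definition above might be sketched like this. The function names are hypothetical and the numbers are made up. One point is an assumption rather than something stated above: kill_probability combines the individual shots as independent events, which is only one possible reading of how the weapon effectivenesses are multiplied together.

```python
def survival_chance(p_destroyed_by_t):
    """Chance of surviving at t is 1 - the chance of being destroyed by t."""
    return 1.0 - p_destroyed_by_t


def kill_probability(shots):
    """Probability that an entity is destroyed by the shots fired at it.

    shots: iterable of (effectiveness, firer_p_destroyed_by_then) pairs.
    Assumption: shots are independent, so the entity survives only if
    every shot fails.
    """
    p_survive = 1.0
    for effectiveness, firer_p_destroyed in shots:
        # A shot only counts if its firer is still alive to take it.
        p_hit = effectiveness * survival_chance(firer_p_destroyed)
        p_survive *= 1.0 - p_hit
    return 1.0 - p_survive


def expected_damage(entities):
    """Sum of value * probability of being destroyed over (value, p) pairs."""
    return sum(value * p_destroyed for value, p_destroyed in entities)


def fitness(targets, platforms):
    """Red-relative fitness: damage done minus damage received."""
    return expected_damage(targets) - expected_damage(platforms)


# Toy usage with made-up values.
targets = [(100, kill_probability([(0.5, 0.0)])), (50, 1.0)]
platforms = [(40, 0.25)]
print(fitness(targets, platforms))
```

Here the first target is hit by one weapon of effectiveness 0.5 whose firer is certain to be alive, so the damage done is 100 * 0.5 + 50 * 1.0, and the damage received is 40 * 0.25.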


To Do
- want results that show the ability of case injection to bias the search towards a more general answer.
- need a scenario with two possibilities.
- CASE 1 - platform goes between two threats in an obvious trap
- CASE 2 - platform goes around them in a safer way.
- graph the regular GA convergence to each plan.
- determine the average fitness of running those plans with popups
- graph the injected GA's convergence to each plan
- determine the average fitness of running those plans with popups
- hopefully 5 > 3 by a statistically significant measure.
- Need to do these actual things
- fix case injection to Sushil's method versus the hack. - all done
- add gen modified data to genome
- change extraction to take each update
- port to cluster - cluster is down as of 12-30.
- run on base problem 50 times - cluster down so running on min
- dump best mission each gen for 1.1.2
- skip, they are all equivalent, just take the "median" one that goes between the threats.
- extract individuals to build up a database
- it's extracting as is, should be interesting to see what it comes up with (since it throws away duplicates).
- run those individuals through again, this time with the popup to show decrease in fitness.
- just run the "median" plan through to find new fitness
- graph overall fitness versus previous
- without the set of them there is no point in this, just show the loss in fitness + the change in the outcome (damage to platform + much longer route).
- run ga again, this time with case injection
- we have cases, just need to run it again without the popup while injecting.
- need to find reasonable injection parameters.
- run a median through the popup, to show the loss of general fitness.
- compare percentage of runs that go inside / around for the two cases.
- run on different injection parameters -> show how they affect the percentage of case 2's
- this should produce a 3d graph
- x -> injection period
- y -> injection fraction
- z-> percentage of case 2's, this shows how the injection biases the resul