Research

The paper in progress

Simulator Code - Installation

GA Code

Effects Of Case Injection On Human Learning

The ability of the GA to maintain injected knowledge over time.

As the number of evaluations increases, both GAs without fitness biasing eventually lose the injected information, settling on the optimum.

However, with fitness biasing we can effectively maintain injected knowledge in all but 2 runs.

Different RCs produced by the GA playing the game without case injection, with case injection, and with case injection plus fitness biasing

The same graph, displaying only the RCs produced without injection

The same graph, displaying only the RCs produced with injection

The same graph, displaying only the RCs produced with injection and fitness biasing

Screenshots

A pair of platforms returning from a mission

A shot of the refinery polluting our precious air

A nice overhead shot of a case injected mission

Movies - Gen 4

Movie showing the situation

Movie showing the attack situation

Movies - Gen 3

Movie 1 - The initial scenario

Movie 2 - The GA's best allocation against the scenario

Movie 2 - A popup occurs, but no replanning happens

Movie 2 - Same popup, but the GA replans around it

A movie of me playing with a parameter to the router, namely the coefficient that a threat's radius is multiplied by

The research revolves around techniques used by AI systems in a broad category of games. It covers the broad spectrum of knowledge about a game, including learning from and teaching human opponents, coevolving against itself to form new knowledge, and being used in decision assistance systems.

Take any game that can be resolved into the following set of components.

This is translatable into a chromosome string that can be encoded into a GA.

Think of it as a mapping between actions and actors. If you assign a group of actions to each actor, you can then give each action a set of bits determining an enumeration of other actors. In essence, you presuppose Bob is going to talk to someone; the GA has a fixed group of bits for that action which determines whom he talks to. You can also include a null mapping (don't talk to anyone) and a way of determining order (easy to do blindly, with lots of easy optimizations available).
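The action-to-actor encoding can be sketched roughly as follows. The actor names, bit widths, and null-mapping convention here are illustrative assumptions, not the simulator's actual encoding:

```python
# Sketch of the action/actor bit encoding described above. Every actor owns
# BITS_PER_TARGET bits selecting whom it talks to; out-of-range patterns
# (and self-references) are treated as the null mapping.
ACTORS = ["bob", "alice", "carol"]  # hypothetical actors
BITS_PER_TARGET = 2                 # enough bits to enumerate 3 actors + null

def decode(chromosome):
    """Decode a flat bit list into {actor: target-or-None} assignments."""
    plan = {}
    for i, actor in enumerate(ACTORS):
        bits = chromosome[i * BITS_PER_TARGET:(i + 1) * BITS_PER_TARGET]
        index = int("".join(map(str, bits)), 2)
        if index < len(ACTORS) and ACTORS[index] != actor:
            plan[actor] = ACTORS[index]
        else:
            plan[actor] = None  # null mapping: talk to no one
    return plan

# 6-bit chromosome: bob -> actor 1 (alice), alice -> null, carol -> actor 0 (bob)
print(decode([0, 1, 1, 1, 0, 0]))
```

The same decode step is what turns a chromosome into the per-actor plan that gets run through the game.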

This is then parsable into an actual plan of actions for each actor, which can then be run through the game. The game produces an outcome which is converted into a fitness value for that plan, and the usual GA operators can continue.

If you can split the game into sides, each side can run a GA to produce a highly effective plan versus any given enemy plan. Both sides can continue to update their plans against their enemies' and hopefully keep producing more and more effective plans. This is a coevolutionary arms race. It leads to several problems, though:

  1. It is usually possible for your opponent to change plans in a game if things are not going well.
  2. It is possible for both sides to specialize excessively against each other's plans, without learning any knowledge applicable to general game playing.

1) is counterable with in-game reallocation. When the opponent changes plans mid-game, spin off a set of subproblems onto other machines, which run similarly to the above except that they play only the new situation; they return a good course of action, which you then follow. Since the original plans that leave the most room for improvisation do better here, the original plan will still evolve successfully; it just evolves plans that are both effective on their own and capable of changing to meet surprises.

2) is the traditional arms-race problem. Tournament-based selection works well here (make each candidate play against a selection of past enemy plans instead of just the current best).
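A minimal sketch of that idea, evaluating a candidate against a random sample of past enemy plans rather than only the current best. `play_game` is a stand-in for the real simulator, and the numeric "plans" are placeholders:

```python
import random

def play_game(candidate, enemy):
    # Placeholder outcome: the real code would run the strike simulation
    # and score the resulting engagement.
    return candidate - enemy

def fitness_vs_history(candidate, enemy_history, sample_size=3, rng=random):
    """Average outcome against a random sample of past enemy plans.

    Sampling from the history (instead of only the latest enemy plan)
    discourages over-specializing against a single opponent strategy.
    """
    sample = rng.sample(enemy_history, min(sample_size, len(enemy_history)))
    return sum(play_game(candidate, e) for e in sample) / len(sample)
```

The history would grow as each side archives its best plan per generation; sampling keeps evaluation cost bounded as the archive grows.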

The game I am playing is one I wrote; it's a strike force simulation. There are two sides in direct opposition to each other, but with very different responsibilities.

  1. Red - Red's actors are a set of platforms (aircraft) and a set of assets (weapons) loaded onto the platforms. It determines where each aircraft goes and which weapons it fires at which targets. Each weapon has a different effectiveness against a variety of targets. Red's general goal is to maximize the damage done to the most valuable targets while minimizing the risk to its platforms.
  2. Blue - Blue's actors are mainly stationary ground assets. It has a set of targets, all of which have some value, that Blue is supposed to defend. It also has a set of threats (AA installations, radar, etc.) that it places in order to best defend its targets. Targets are generally stationary, as are the larger threats (large radar installations). Blue is at a disadvantage in terms of information: Red generally knows the location of the vast majority of its forces. Blue's main idea is to allocate its movable assets to present the best defense against Red's attackers. It also has the capability of in-game changes: it can place defenses that only go active in the middle of a scenario, e.g. a mobile installation that doesn't activate until attackers have already moved in above or behind it. This presents a huge set of strategic options for Blue, including ambushes and traps. Its goal is to defend its targets, both by minimizing damage to them and by destroying attackers.

GA Analysis

Encoding

Fitness Function

Fitness for the individuals is currently computed relative to the red team.

Fitness = damage done - damage received.

Damage done = sum over targets of (target value * probability of being destroyed).

Damage received = sum over platforms of (platform value * probability of being destroyed).

Probability of being destroyed = a combination of the effectivenesses of the weapons fired at it, each weighted by the chance of its firer surviving to that point in time.

The chance of surviving at time t is 1 - the chance of being destroyed by time t.
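The fitness calculation might be sketched as below. The values and effectivenesses are made up, and the way multiple shots combine (as the complement of the joint survival probability) is an assumption about how the per-shot effectivenesses aggregate:

```python
def p_destroyed(shots):
    """Probability of being destroyed given a list of shots.

    shots: list of (weapon_effectiveness, firer_survival_at_fire_time).
    Each shot's kill probability is its effectiveness discounted by the
    chance its firer survived long enough to fire; shots combine as the
    complement of joint survival (an assumption about the aggregation).
    """
    p_survive = 1.0
    for effectiveness, firer_alive in shots:
        p_survive *= 1.0 - effectiveness * firer_alive
    return 1.0 - p_survive

def fitness(targets, platforms):
    """Fitness = damage done - damage received.

    targets/platforms: lists of (value, shots fired at that entity).
    """
    damage_done = sum(v * p_destroyed(s) for v, s in targets)
    damage_received = sum(v * p_destroyed(s) for v, s in platforms)
    return damage_done - damage_received

# One target worth 100 shot at once with effectiveness 0.5 (firer certain to
# survive); one platform worth 10 shot at with effectiveness 0.2.
print(fitness([(100, [(0.5, 1.0)])], [(10, [(0.2, 1.0)])]))
```

Note the survival-at-time-t chain: the firer's survival factor in each shot is itself 1 minus its probability of being destroyed by that time, which is what couples the two sides' allocations.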

To Do

  1. want results that show the ability of case injection to bias the search towards a more general answer.
    1. need a scenario with two possibilities.
      1. CASE 1 - platform goes between two threats in an obvious trap
      2. CASE 2 - platform goes around them in a safer way.
    2. graph the regular GA convergence to each plan.
    3. determine the average fitness of running those plans with popups
    4. graph the injected GA's convergence to each plan
    5. determine the average fitness of running those plans with popups
    6. hopefully 5 > 3 by a statistically significant margin.
  2. Need to do these actual things
    1. fix case injection to use Sushil's method instead of the hack. - all done
      1. add gen modified data to genome
      2. change extraction to take each update
    2. port to cluster - cluster is down as of 12-30.
    3. run on base problem 50 times - cluster down so running on min
    4. dump best mission each gen for 1.1.2
      1. skip, they are all equivalent; just take the "median" one that goes between the threats.
    5. extract individuals to build up a database
      1. it's extracting as is; should be interesting to see what it comes up with (since it throws away duplicates).
    6. run those individuals through again, this time with the popup to show decrease in fitness.
      1. just run the "median" plan through to find new fitness
    7. graph overall fitness versus previous
      1. without the set of them there is no point in this; just show the loss in fitness + the change in the outcome (damage to the platform + a much longer route).
    8. run ga again, this time with case injection
      1. we have cases, just need to run it again without the popup while injecting.
      2. need to find reasonable injection parameters.
    9. run a median through the popup, to show the loss of general fitness.
      1. compare percentage of runs that go inside / around for the two cases.
    10. run on different injection parameters -> show how they affect the percentage of case 2's
      1. this should produce a 3d graph
        1. x -> injection period
        2. y -> injection fraction
        3. z -> percentage of case 2's; this shows how the injection biases the result
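That 3-D sweep could be plotted roughly as below, assuming matplotlib is available. The axis ranges are guesses and the z values are random placeholders standing in for the measured percentage of CASE 2 plans:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical sweep grid: injection period (x) vs injection fraction (y)
# against percentage of CASE 2 plans (z). Real z values would come from
# re-running the GA at each (period, fraction) setting.
periods = np.arange(1, 11)               # inject every N generations
fractions = np.linspace(0.05, 0.5, 10)   # fraction of population injected
X, Y = np.meshgrid(periods, fractions)
Z = np.random.default_rng(0).uniform(0, 100, X.shape)  # placeholder data

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(X, Y, Z)
ax.set_xlabel("injection period")
ax.set_ylabel("injection fraction")
ax.set_zlabel("percentage of CASE 2 plans")
fig.savefig("injection_sweep.png")
```

Swapping the placeholder `Z` for the measured percentages gives the surface showing how the two injection parameters bias the search toward the safer CASE 2 route.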