Technical Objective:
To compare the performance of the rainbow text classifier with that of a neural network and the C4.5 decision tree software, and to determine whether, using only features of the text as input, they can match or exceed rainbow's performance.

Plan of Action:
04/14/03 Familiarize ourselves with the bow library.
04/16/03 Research the different classification schemes that it uses.
04/18/03 Extract document features from rainbow
04/21/03 Modify neural network to use features and output classifications
04/21/03 Begin presentation design
04/23/03 Interpret results - if classification is bad, try other feature methods
04/25/03 Finish presentation

Work Completed:

1.      Created a model of the 20 newsgroups data set that comes with the rainbow software

2.      Wrote scripts to generate input data sets for the neural network and the C4.5 software

a.       Translates output of --print-matrix into usable form

3.      Ran C4.5 software on input data set to create a classification decision tree

4.      Modified neural network code to accept 1000 words with 20 output nodes.

5.      Used document data from scripts as input to neural network

6.      Analyzed results and compared
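
The conversion described in step 2 can be sketched roughly as follows. This is not the project's actual script. It is a minimal Python illustration that assumes each line of the matrix output has the form "document class word count word count ...", which may not match the exact format rainbow prints for a given --print-matrix setting. The function names and the tiny three-word vocabulary are hypothetical; the project used 1000 words and 20 classes. The sketch turns one sparse line into a dense count vector, a comma-separated row with the class label last (the layout C4.5 data files use), and a one-hot target vector for the neural network's output nodes.

```python
from typing import Dict, List, Tuple


def parse_sparse_line(line: str) -> Tuple[str, str, Dict[str, int]]:
    """Parse one line of the ASSUMED layout: doc label word count word count ..."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("line needs at least a document name and a label")
    doc, label, rest = fields[0], fields[1], fields[2:]
    if len(rest) % 2 != 0:
        raise ValueError("word/count fields must come in pairs")
    # Pair each word with the count that follows it.
    counts = {rest[i]: int(rest[i + 1]) for i in range(0, len(rest), 2)}
    return doc, label, counts


def dense_vector(counts: Dict[str, int], vocab: List[str]) -> List[int]:
    """Expand sparse counts into a fixed-order vector; absent words become 0."""
    return [counts.get(word, 0) for word in vocab]


def one_hot(label: str, classes: List[str]) -> List[int]:
    """Target vector for the network: 1 at the label's position, 0 elsewhere."""
    if label not in classes:
        raise ValueError(f"unknown class: {label}")
    return [1 if c == label else 0 for c in classes]


def c45_row(vector: List[int], label: str) -> str:
    """Comma-separated attribute values followed by the class label."""
    return ",".join(str(v) for v in vector) + "," + label


if __name__ == "__main__":
    # Hypothetical vocabulary and classes, far smaller than the real setup.
    vocab = ["space", "hockey", "windows"]
    classes = ["sci.space", "rec.sport.hockey"]
    doc, label, counts = parse_sparse_line("doc1 sci.space space 3 windows 1")
    vec = dense_vector(counts, vocab)
    print(c45_row(vec, label))
    print(vec, one_hot(label, classes))
```

With a 1000-word vocabulary and the 20 newsgroup names as classes, the same functions would produce input vectors of length 1000 and target vectors of length 20, matching the network dimensions listed above.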

Work that needs to be done:

1.      Write final document

Links:

Script used to generate test.dat and train.dat (needs next script)

Script used to generate the NN data (needs rainbow and a model)

20 newsgroup data

Presentation