Assessed Programming Project

You should submit a listing of your program along with a short report (maximum 2000 words).

Your program should implement a 3-layer Artificial Neural Net (ANN), with 3 Input nodes, 3 Hidden nodes and 2 Output nodes. (Try to write your code so that the size of the network can be changed easily.) Each Hidden node receives weighted inputs from every node in the Input layer, plus a bias; each Output node likewise receives weighted inputs from every node in the Hidden layer, plus a bias. Sigmoid transfer functions [ 1/(1+e^(-x)) ] should be used at nodes where appropriate. Using this ANN code, you should then implement two separate methods for training the weights and biases on any set of Input/Output training examples:
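As an illustration of the architecture described above, the following is a minimal sketch of a forward pass, not a required design. The names `init_network` and `forward`, the nested-list representation, and the initial weight range of -1 to 1 are all assumptions made for this example. The layer sizes are passed in as a list so that the network size can be changed easily.

```python
import math
import random

def sigmoid(x):
    """Sigmoid transfer function 1/(1+e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def init_network(sizes, rng, scale=1.0):
    """Build one layer per pair of adjacent sizes, e.g. [3, 3, 2].

    Each layer is a list of nodes; each node is (weights, bias).
    The uniform range [-scale, scale] is an assumption for this sketch.
    """
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        layer = [([rng.uniform(-scale, scale) for _ in range(n_in)],
                  rng.uniform(-scale, scale))
                 for _ in range(n_out)]
        layers.append(layer)
    return layers

def forward(layers, inputs):
    """Return the activations of every layer, input layer first."""
    activations = [list(inputs)]
    for layer in layers:
        prev = activations[-1]
        activations.append(
            [sigmoid(sum(w * a for w, a in zip(weights, prev)) + bias)
             for weights, bias in layer])
    return activations

net = init_network([3, 3, 2], random.Random(0))
outputs = forward(net, [1, 0, 1])[-1]
print(outputs)  # two values, each strictly between 0 and 1
```

With sizes [3, 3, 2] this network has (3+1)*3 + (3+1)*2 = 20 trainable parameters. Here the input nodes simply pass their values through, and the sigmoid is applied only at Hidden and Output nodes.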

Firstly, you should write a separate training part of the program so that the weights and biases of this ANN can be trained by back-propagation (or a variant).
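One possible shape for the back-propagation part is sketched below. It assumes a squared-error measure, plain batch gradient descent, and a learning rate of 0.5; none of these choices is prescribed by the brief, and the function names (`gradients`, `train`, `mean_error`) are hypothetical. The sketch is self-contained, so it repeats a small forward pass and builds the parity examples inline.

```python
import itertools
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(weights, biases, x):
    """Return the activations of every layer, input layer included."""
    acts = [list(x)]
    for W, b in zip(weights, biases):
        prev = acts[-1]
        acts.append([sigmoid(sum(w * a for w, a in zip(row, prev)) + bj)
                     for row, bj in zip(W, b)])
    return acts

def mean_error(weights, biases, data):
    """Mean over patterns of 0.5 * sum((output - target)^2)."""
    total = 0.0
    for x, t in data:
        out = forward(weights, biases, x)[-1]
        total += 0.5 * sum((o - ti) ** 2 for o, ti in zip(out, t))
    return total / len(data)

def gradients(weights, biases, x, t):
    """Back-propagate one pattern; returns dE/dW and dE/db per layer."""
    acts = forward(weights, biases, x)
    # Output-layer delta: (o - t) times the sigmoid derivative o * (1 - o).
    delta = [(o - ti) * o * (1.0 - o) for o, ti in zip(acts[-1], t)]
    grad_w, grad_b = [None] * len(weights), [None] * len(weights)
    for l in range(len(weights) - 1, -1, -1):
        prev = acts[l]
        grad_w[l] = [[d * a for a in prev] for d in delta]
        grad_b[l] = list(delta)
        if l > 0:  # push the error back through the weights of layer l
            delta = [sum(weights[l][j][i] * delta[j] for j in range(len(delta)))
                     * prev[i] * (1.0 - prev[i]) for i in range(len(prev))]
    return grad_w, grad_b

def train(weights, biases, data, lr=0.5, epochs=200):
    """Batch gradient descent on the mean error (lr and epochs are assumptions)."""
    n = len(data)
    for _ in range(epochs):
        # Compute every gradient before changing any weight (true batch update).
        all_grads = [gradients(weights, biases, x, t) for x, t in data]
        for grad_w, grad_b in all_grads:
            for l in range(len(weights)):
                for j in range(len(biases[l])):
                    biases[l][j] -= lr * grad_b[l][j] / n
                    for i in range(len(weights[l][j])):
                        weights[l][j][i] -= lr * grad_w[l][j][i] / n

sizes = [3, 3, 2]
rng = random.Random(1)
weights = [[[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [[rng.uniform(-1, 1) for _ in range(n_out)] for n_out in sizes[1:]]
# 3-bit parity: first target is 1.0 for an odd number of 1s, second is the opposite.
data = [(list(bits), [float(sum(bits) % 2), float(1 - sum(bits) % 2)])
        for bits in itertools.product([0, 1], repeat=3)]

error_before = mean_error(weights, biases, data)
train(weights, biases, data)
error_after = mean_error(weights, biases, data)
print(error_before, error_after)
```

A useful sanity check on any backprop implementation is to compare one computed gradient against a finite-difference estimate of the same derivative. The short run above is only meant to show the error falling; solving parity typically needs many more epochs and some tuning.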

Secondly, you should write an alternative Genetic Algorithm training method for finding suitable values for all the weights and biases of the same network. Appropriate methods for encoding all the weights and biases on the genotype should be used, and a suitable fitness function designed.
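One possible encoding and fitness function is sketched below, purely as an illustration. It assumes a real-valued genotype holding every weight and bias in one flat list, fitness defined as the negative summed squared error, binary tournament selection, uniform crossover, Gaussian mutation, and a single elite copied unchanged each generation. All of these choices, the parameter values, and the function names are assumptions for this example; other encodings (for instance bit strings) are equally valid.

```python
import itertools
import math
import random

def sigmoid(x):
    x = max(-60.0, min(60.0, x))  # clamp so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(-x))

def genome_length(sizes):
    """One gene per weight plus one per bias."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))

def run_network(genome, sizes, x):
    """Decode the flat genome node by node while running a forward pass."""
    acts, pos = list(x), 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        nxt = []
        for _ in range(n_out):
            w, bias = genome[pos:pos + n_in], genome[pos + n_in]
            pos += n_in + 1
            nxt.append(sigmoid(sum(wi * a for wi, a in zip(w, acts)) + bias))
        acts = nxt
    return acts

def fitness(genome, sizes, data):
    """Negative summed squared error: 0.0 is a perfect score."""
    return -sum((o - t) ** 2
                for x, target in data
                for o, t in zip(run_network(genome, sizes, x), target))

def evolve(data, sizes, rng, pop_size=40, generations=60,
           mut_rate=0.2, sigma=0.4):
    n = genome_length(sizes)
    pop = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(pop_size)]
    history = []  # best fitness seen at each generation
    for _ in range(generations):
        fits = [fitness(g, sizes, data) for g in pop]
        best = max(range(pop_size), key=fits.__getitem__)
        history.append(fits[best])

        def pick():  # binary tournament selection
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return pop[a] if fits[a] >= fits[b] else pop[b]

        new_pop = [pop[best][:]]  # elitism: keep the best unchanged
        while len(new_pop) < pop_size:
            p1, p2 = pick(), pick()
            child = [g1 if rng.random() < 0.5 else g2 for g1, g2 in zip(p1, p2)]
            child = [g + rng.gauss(0.0, sigma) if rng.random() < mut_rate else g
                     for g in child]
            new_pop.append(child)
        pop = new_pop
    fits = [fitness(g, sizes, data) for g in pop]
    best = max(range(pop_size), key=fits.__getitem__)
    history.append(fits[best])
    return pop[best], history

sizes = [3, 3, 2]
data = [(list(bits), [float(sum(bits) % 2), float(1 - sum(bits) % 2)])
        for bits in itertools.product([0, 1], repeat=3)]
best_genome, history = evolve(data, sizes, random.Random(2))
print(history[0], history[-1])
```

Because the elite is copied unchanged, the best fitness in `history` can never get worse from one generation to the next. The population size, mutation rate and mutation width are the obvious parameters to tune when optimising the method.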

You should then apply each training method, backprop and GA, independently to a version of the 3-bit parity problem. Here the 3 inputs can be any combination of 0s and 1s, giving 8 possible input patterns. The desired target output of the first Output node is (as close as possible to) 0.0 when there is an even number of input 1s (i.e. none or any two of the three inputs are 1) and 1.0 otherwise (i.e. 1 or 3 of the three inputs are 1); the desired target for the second Output node is the opposite (1.0 for even, 0.0 for odd).
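The training examples for this problem can be generated rather than typed in by hand. The following sketch shows one way to do so; the function name `parity_dataset` and the (inputs, targets) pair layout are assumptions for this example. The number of inputs is a parameter, so the same code also covers a 4-input version.

```python
from itertools import product

def parity_dataset(n_inputs=3):
    """Return (inputs, targets) pairs for every combination of 0s and 1s.

    targets[0] is 0.0 for an even number of 1s and 1.0 for an odd number;
    targets[1] is the opposite.
    """
    data = []
    for bits in product([0, 1], repeat=n_inputs):
        odd = sum(bits) % 2
        data.append((list(bits), [float(odd), float(1 - odd)]))
    return data

for inputs, targets in parity_dataset(3):
    print(inputs, targets)
```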

Each training method, backprop and GA, should be optimised as far as possible, and then a comparison drawn between performance with the two methods. Is this problem more appropriate for one method than the other? If you wish to experiment further (optional) try a 4-input version of the problem, using three or four hidden nodes, and see how this affects the difficulty.

NB: this link to a note on Generalisation is relevant.

If you were to adapt your code to a problem that required generalisation, what changes/extensions would you make to it? Include a short comment on this in your report.
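By way of background only: in the parity problem every possible input pattern is a training example, so generalisation to unseen inputs is never tested. A problem that requires generalisation usually keeps some examples out of training. The sketch below shows one common way to hold out a validation set. The function name `split_examples` and the 25% validation fraction are assumptions for this example, and it is not intended as a model answer to the question above.

```python
import random

def split_examples(examples, validation_fraction=0.25, seed=0):
    """Shuffle a copy of the examples and hold out a fraction for validation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(round(len(shuffled) * validation_fraction)))
    return shuffled[n_val:], shuffled[:n_val]

examples = list(range(20))  # stand-in for real (inputs, targets) pairs
train_set, validation_set = split_examples(examples)
print(len(train_set), len(validation_set))
```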