Although progress of a species through a fitness landscape is not discussed in the standard GA literature, in theoretical biology there is relevant work in the related field of molecular quasi-species [7, 6]. In particular, analysis of the `error catastrophe' shows that, subject to certain conditions, there is a maximum rate of mutation which allows a quasi-species of molecules to stay localised around its current optimum. Selection and mutation are opposing forces, the former tending to increase numbers of the fittest members of the population, while the latter tends to drag offspring down in fitness away from any local optimum. A zero mutation rate allows for no further local search beyond the current species, and, other things being equal, increased mutation rates will increase the rate of evolution. Hence, if mutation rates can be adjusted, it would be a good idea to use a rate close to but less than any critical rate which causes the species to fall apart. A further possibility, in the spirit of simulated annealing, is to temporarily allow the rate to go slightly above the critical rate -- to allow exploration -- and then cut it back again to consolidate any gains thus made.
For an infinite asexual population, in the particular context of molecular evolution, Eigen and Schuster show [7] that these forces just balance for a mutation rate
$m = \ln(\sigma)/l$, where $l$ is the
genotype length and $\sigma$
is the superiority parameter of the
master sequence (the fittest member of the population) -- the factor by
which selection of this sequence exceeds the average selection of the rest of
the local fitness landscape, and hence the rest of the population.
The diagrams they show for the very sharp cutoff at the
critical rate refer to a fitness landscape with a single `needle' peak for the
master sequence, taking all the rest of the population to be equally (un-)fit;
where the hill slopes more gently from the master sequence, the cutoff is less
abrupt. For typical values of the superiority parameter $\sigma$
between 2 and 20, the upper limit of
mutation before a quasi-species `loses its grip' on the current hill would be
between $0.7/l$ and $3/l$. For finite population size, there is some reduction
in this critical mutation rate (the `error
threshold') [8], but for genotypes of length order 100,
and populations of size order 100, the error threshold will be extremely close
to that for an infinite population.
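
The figures quoted above can be checked numerically. The following is a minimal sketch, assuming the infinite-population error threshold takes the form ln(sigma)/l implied by the 0.7/l and 3/l limits; the function name `error_threshold_per_locus` and the example length of 100 are illustrative choices, not taken from the source.

```python
import math


def error_threshold_per_locus(sigma: float, length: int) -> float:
    """Critical per-locus mutation rate ln(sigma) / length.

    Assumes the infinite-population, single-peak setting described in the
    text; sigma is the superiority parameter of the master sequence.
    """
    if sigma <= 1.0:
        raise ValueError("sigma must exceed 1 for a master sequence to persist")
    if length <= 0:
        raise ValueError("genotype length must be positive")
    return math.log(sigma) / length


if __name__ == "__main__":
    length = 100  # illustrative genotype length
    for sigma in (2, 20):
        per_locus = error_threshold_per_locus(sigma, length)
        # Multiplying by the length gives the expected mutations per genotype.
        print(f"sigma={sigma}: {per_locus:.4f} per locus, "
              f"{per_locus * length:.2f} per genotype")
```

Expressed per genotype, the threshold depends only on ln(sigma), which is why a tenfold change in sigma moves it from roughly 0.7 to roughly 3 mutations.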
Since it is the natural logarithm of the superiority parameter $\sigma$
which enters into the equation
for $m$, variations in $\sigma$
of an order of magnitude do not affect $m$
(and hence the error threshold) as significantly as variations in
genotype length. In conventional GAs, the mutation rate chosen tends to be a
low figure, typically 0.01 or 0.001 per bit as a background operator, decided
upon without regard to the genotype length. The SAGA framework means that
mutation rates of the order of 1 per genotype are required when using linear
rank selection or tournament selection as discussed below, subject to some
qualifications concerning recombination and `junk DNA'. These qualifications
tend to increase the recommended rate to somewhere in the range 1 to 5
mutations per genotype, the idiosyncratic nature of fitness landscapes for
different problems making it difficult to be more specific.
When applying such mutation rates in a GA, it is essential that the probability of mutation is applied independently at each locus on the genotype. This gives a binomial distribution (approximating a Poisson distribution for long genotypes) for the number of mutations per string, so that genotypes with an expected m mutations have this as the average value with a wide variance (including the possibility of zero mutations).
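
Independent per-locus mutation can be sketched as follows. This is an illustrative example rather than the author's implementation: it assumes a binary genotype stored as a list of 0/1 integers, and the names `mutate`, `prob_zero_mutations` and `expected_mutations` (the expected number of mutations per genotype) are hypothetical.

```python
import random


def mutate(genotype, expected_mutations, rng=random):
    """Flip each bit independently with probability expected_mutations / length.

    The number of flips is then binomially distributed with mean
    expected_mutations, rather than being fixed at exactly that value.
    """
    length = len(genotype)
    if length == 0:
        return []
    p = expected_mutations / length
    if not 0.0 <= p <= 1.0:
        raise ValueError("per-locus probability must lie in [0, 1]")
    # One independent draw per locus; bit ^ 1 flips a 0/1 value.
    return [bit ^ 1 if rng.random() < p else bit for bit in genotype]


def prob_zero_mutations(expected_mutations, length):
    """Probability that a genotype passes through mutation unchanged."""
    return (1.0 - expected_mutations / length) ** length


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demonstration repeatable
    parent = [0] * 100
    flips = [sum(mutate(parent, 1.0, rng)) for _ in range(10000)]
    print("mean flips per genotype:", sum(flips) / len(flips))
    print("fraction unmutated:", flips.count(0) / len(flips))
    print("predicted fraction unmutated:", prob_zero_mutations(1.0, 100))
```

With one expected mutation on a genotype of length 100, a little over a third of offspring are predicted to be unmutated copies of the parent, which illustrates the wide variance noted above.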