Genetic Algorithms and Artificial Neural Networks
A talk presented to the Fort Worth, Texas chapter of the Association for Computing Machinery, Spring 1988
Copyright 1988, 1992 by Wesley R. Elsberry
What is a genetic algorithm?
In searching a large state-space, multi-modal state-space, or n-dimensional surface, a genetic algorithm may offer significant benefits over more typical search or optimization techniques (linear programming, heuristic, depth-first, breadth- first, praxis, DFP [De Jong, 80]). Of course, 'boy-with-a- hammer' syndrome should be avoided.
Genetic algorithm components
Why does this work?
State goal condition as finding binary string of length 5 with (4) 1's
Randomly generate L5 strings (length five), population size of 5
00010 (eval: 1)
10001 (eval: 2)
10000 (eval: 1)
01011 (eval: 3)
10010 (eval: 2)
Population evaluation average: 1.8
To find next generation, reproduce from this candidate population with modification. Modification method is defined as 'crossover', as in sexual reproduction.
Modification methods thus far proposed for GA's:
Selectionist distribution:
1 00010 (eval: 1)
2 10001 (eval: 2)
3 10001 (repeat since fitness is higher)
4 10000 (eval: 1)
5 01011 (eval: 3)
6 01011 (repeat)
7 01011 (repeat)
8 10010 (eval: 2)
9 10010 (repeat)
Select pairs (indices from selectionist distribution):
1 & 4 Then crossover point: 1
4 & 5 4
9 & 7 3
8 & 6 1
7 & 5 1
Result from :
1+4:1 = 00000 (eval: 0)
4+5:4 = 10001 (eval: 2)
9+7:3 = 10011 (eval: 3)
8+6:1 = 11011 (eval: 4)
7+5:1 = 01011 (eval: 3)
New population evaluation average: 2.4
The goal condition has now been satisfied, and the procedure ends.
How well do GA's work?
Given candidates which are binary strings:
Population size N
Number of generations G
We sample NG points out of 2^l where l is binary string length
So what?
This can be done with random search methods
But GA's develop a pool of genes
Search is for 'schemas' which are 'blocks' or 'alleles' (portions of
the binary string which tend to be reproduced as a unit)
2^l schemas per individual
N*2^l schemas per population
[Smith, 88]
What applications can this be put to?
Miscellaneous:
The foremost researcher using GA's is really interested in 'classifier' systems, which incorporate GA's in formulating new classifiers. [Holland, 86]
Most classifier work seems to be done with binary string representation. Since this is what digital computers run off of, it has been noted that a classifier system is theoretically capable of forming any program which can be specified.
Another point of view is offered by Dress, who applies 'Darwinian' techniques to learning in artificial neural network models. Strictly speaking, his method is not Darwinian, but the issues he raises concerning 'programmer omniscience' are valid and need to be addressed [Dress, 87].
Random musings:
The area of GA research seems to have quickly lost track of the biological origin of the concepts used. For example, this quote from Holland:
'Rather surprisingly, and contrary to common wisdom in genetics, it can be shown that the crossover operator (the major source of recombination) is the primary contributor of improved rules.'
I can assure you that 'common wisdom' in genetics does not disagree with Holland's statement. The principle of 'hybrid vigor' has been noted and understood for some time.
Genetic algorithms provide a search method which offers the promise of fast convergence on a solution state. Usually, this is accomplished in a semi-random manner, so that intermediate states may not be observed. Thus, GA's are not applicable for replacing min-max game trees or other forms of search in which the change from previous state to current state must be incremental.
Background:
Genetic algorithms are premised on the principles of natural selection in biology. In searching for a solution state, a 'population' of candidate states are generated, evaluated, selected, and reproduced with modification to produce the next candidate population.
(BA == Biological analogue)
Structures and modules needed:
The organism has a set of structures which help determine its internal organization and capabilities. These are the genes, which in combination with environmental factors during development specify the formation of structures and connections. At maturity, the organism will have expressed a suite of characteristics which enable it to survive in its environment and propagate itself. The mode of reproduction which produces the most variance in the resulting offspring is sexual reproduction, which mixes genes from separate individuals to form the resulting offspring. Asexual reproduction is subject to variation only through external disturbances, such as radiation induced mutation or viral transcription, which produce variations that are small or relatively infrequent.
Various expressed characteristics of an organism may make it less successful than other organisms of the same species, where success is defined as 'differential reproductive success', in other words, the organism leaves more related offspring than those which do not demonstrate the same degree of adaptation. The next generation of organisms then proceed through the same processes.
The expressed characteristics of the organism are not necessarily the same characteristics as those coded for in the genes of the organism. The actual set of gene information is referred to as the genotype, while the expressed characteristics of the organism is referred to as the phenotype. The difference between code and expression can be a result of both the environmental factors and the interaction between genes on a pair of chromosomes.
Why genetic algorithms? Why, for that matter, neural networks? Both of these active research fields are premised on aspects of biological phenomena. Consider the complex problems which living systems must contend with in order to survive. The highly diverse fauna currently living gives some idea of the range of possible solutions to these problems. However, that is not the entire picture. Consider the currently extant number of species to be a small subset of instances in a far larger solution space.
The existing species can be thought of as representing 'splotches' on the surface of an expanding hypersphere. By no means do they represent the entire range of possible solutions. These species are postulated to have arisen by the process of natural selection.
Natural selection provides a mechanism for the identification and propagation of appropriate adaptation to specific and complex problems. Basically, natural selection can be described cursorily in the following manner. Several conditions are postulated.
The variation which is present in the population is generally assumed to arise at random. The selection process, however, is not random, but is premised on relative functionality with reference to the absolute constraints of the environment.
References:
Brady, RM. 1985. Optimization strategies gleaned from biological evolution. Nature 317(6040):804-806.
Dejong, K. 1980. Adaptive system design - a genetic approach. IEEE Syst. M. 10(9):566-574.
Dress, WB. 1987. Darwinian optimization of synthetic neural systems. IEEE Proc. ICNN Vol. 4:769-776.
Holland, JH. 1986. A mathematical framework for studying learning in classifier systems. Physica 22D(1-3):307-317.
Smith, D. 1988. Genetic algorithms, a speech presented to the Metroplex Institute for Neural Dynamics.