Intelligent systems in accounting, finance and managementassessing predictive performance of ann-based classifiers, страница 9

The first chromosome of the population is the set of weights obtained when determining the ANN architecture. The other 19 chromosomes are generated by training the ANN with the previously obtained architecture. Afterwards, the first generation of the algorithm may begin. The number of generations is related to the empirical formula suggested by Ankenbrandt (1991). The number of generations for a non-binary GA without mutation is given by the formula Ngen = ln[(n − 1)2]/ln(r), where n is the population size and r is the average fitness of candidates with a particular gene value over the average fitness of all other candidates.

Each chromosome is evaluated using the accuracy rate for the training set ACRTR.

Selection

First, the elitism technique is applied, in the sense that the best Nelite chromosomes in terms of ACRTR are inserted into the new population. The rest of the chromosomes (20 − Nelite) are selected based

3 600 = 10 (number of runs for each GA experiment) × 4 (number of crossover operators) × 3 (number of preprocessing methods) × 5 (number of different distributional data sets).

on the probability of selection (roulette wheel procedure) for each chromosome: Pi = ACRTR i/

 ACRTR i.

The higher the probability Pi for a chromosome is, the higher its chance of being drawn into the new population. This procedure tries to simulate the process of natural selection or survival of the fittest. We decided to employ elitist selection in our algorithms as a consequence of what is reported in the literature. For example, Rudolph (1994) proved by means of homogeneous finite Markov chains that GAs converge probabilistically to the global optimum only when elitist selection is used (the best individual survives with probability one). Miller and Thomson (1998) used GAs to evolve digital circuits with a new chromosome representation and found that ‘without elitism the GA struggled to find any fully correct solutions for what is essentially a very simple circuit, but with elitism the results were markedly improved’. Shimodaira (1996) developed a GA with large mutation rates (controlled by a decreasing function of generation) and elitist selection, i.e. GALME, and found that the performance of GALME is remarkably superior to that of a traditional GA. Fogel et al. (2004) applied EAs for similar RNA structure discovery and focused on the optimization of population and selection parameters. The study compared elitist selection with three different tournament selections (tournament size 5, 10 and 20) and found that the increased tournament size increases the variance in the mean convergence and that tournament size 5 achieved slightly better mean variance than elitist selection. However, the number of clients (workstations) that arrived at correct solutions was roughly similar when elitist and tournament size 5 selection were employed.

Next, 80% (probability of crossover: Pc = 0.8) of the chromosomes obtained previously are randomly selected for mating. The choice of crossover probability, as well as of the other GA parameters (mutation probability, population size), is more art than science. Tuson and Ross (1998) compared the performance of non-adaptive GAs (GAs with fixed crossover and mutation probabilities) with the performance of adaptive GAs (GAs that use operators’ adaptation) and found that ‘. . . at least for the problems used here, adaptation by the co-evolution of operator settings is of limited practical use, except in situations where the tuning process is sufficiently time limited’. They suggested that the proper choice of the crossover in the case of non-adaptive GAs depends upon the population model, the problem to be solved, its representation and the performance criterion being used. For example, for a ‘deceptive’ problem ‘a low crossover probability gives high quality results, whereas a high crossover probability exchanges solution quality for a higher speed of search’ (Tuson and Ross, 1998). Rogero (2002) found that an increase on the crossover probability above 0.3 does not improve the convergence speed of the GA. However, the author mentions that this value is problem dependent. The problem with choosing high crossover probabilities is that potentially very performant parents would be removed from the population. This is not the case for our GA implementation, since after reproduction we increase the population to include both the parents and their offspring. In other words, the probability of crossover is not essential for the performance of our algorithm as long as it has a high value.