Applied Stochastic Models in Business and Industry. Stock timing using genetic algorithms, page 9

4.4. Are experts similar for some groups of stocks?

It is of interest to know whether there are stocks that are characterized by similar experts, that is to say, by a similar subset of trading rules. In order to answer this question, we clustered the binary vectors of experts into 2, 3 and 4 classes using two unsupervised classification methods: Kohonen self-adapting networks [19] and K-means [20]. Clustering into 2 and 3 classes was not very useful; we obtained one big cluster plus one or two clusters with a few stocks. Of greater interest was the classification into four classes, although the algorithms did not generate very stable clusters. Of the 24 stocks, about 40% remained together in the same cluster, e.g. Accor, Alcatel and Sanofi, or Cap Gemini, Renault and Valeo. The other stocks were on the borders of the clusters. If one considers that the vectors of experts form the vertices of a huge hypercube, then it is not surprising that the position of the classification hyper-surface has an important impact on the clusters' content.
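The stability check described above can be illustrated with a minimal sketch. It is not the procedure used in the study: the expert vectors below are synthetic random bits, the vector length `N_RULES` is a hypothetical value, and the pairwise Jaccard score is only one possible way to quantify how many stocks "remain together" between two K-means runs with different initializations.

```python
import random
from itertools import combinations

N_STOCKS = 24   # number of stocks, as in the text
N_RULES = 40    # hypothetical length of a binary expert vector


def kmeans(vectors, k, seed, iters=50):
    """Plain Lloyd's K-means; returns one cluster label per vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(v, centroids[c])))
            for v in vectors
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def co_clustered_pairs(labels):
    """Set of index pairs (i, j) that share a cluster."""
    return {(i, j) for i, j in combinations(range(len(labels)), 2)
            if labels[i] == labels[j]}


def pair_stability(labels_a, labels_b):
    """Jaccard overlap of same-cluster pairs (1.0 = identical partitions)."""
    a, b = co_clustered_pairs(labels_a), co_clustered_pairs(labels_b)
    return len(a & b) / len(a | b) if a | b else 1.0


# Synthetic stand-in for the experts' binary rule vectors.
data_rng = random.Random(0)
experts = [[data_rng.randint(0, 1) for _ in range(N_RULES)]
           for _ in range(N_STOCKS)]

run_a = kmeans(experts, k=4, seed=1)
run_b = kmeans(experts, k=4, seed=2)
stability = pair_stability(run_a, run_b)
print(f"pairwise stability between two runs: {stability:.2f}")
```

Because the score compares pairs of stocks rather than cluster labels, it is unaffected by the arbitrary numbering of clusters in each run.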

This experiment confirmed that the probability of finding only one set of superior rules (which can be used to trade a group of stocks efficiently) is very low. However, there are some similarities between some trading experts for a few stocks. This problem needs more extensive statistical analysis.

4.5. Do genetically created experts become ‘aged’?

The problem of aging knowledge rules is important in machine learning. In our study, it determines whether the genetic process has to be relaunched daily and, if necessary, whether the sets of rules have to be frequently updated. To answer this question we tested eleven experts during the course of 2 weeks commencing from 10 January 2000. The performances of three experts were then compared. The first report was on the performance of the best expert, generated on 10 January and used without modification throughout the period. The second report dealt with the performance of the expert generated on 10 January, renewed on 12 January and then used until the end of the period. The third reported the performance of the expert renewed again on 14 January. The performances of the newly generated experts were better in 5–7 cases out of 11, and never worse than the previous ones. About 50% of regenerated experts were the same as in the previous set. The difference in the return on investment between the experts generated on 10 January and those of 14 January was a little higher than that between the experts generated on 10 and 12 January.

To sum up, in terms of return, it is preferable to regenerate the experts. However, the 'aging' of experts was not dramatic: all the experts showed good and stable profit performances throughout the out-of-sample test. Moreover, expert generation is not a very time-consuming process. The time required to discover the best expert for a given stock on a PC with an AMD K7 processor was approximately 7 min. During this time, the genetic algorithm evaluated 400 000 experts over a period of about 3 years of daily quotes beginning on 2 January 1997.
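As a rough check on the cost of regeneration, the figures quoted above can be turned into a throughput estimate. The sketch below assumes, beyond what the text states, that experts for all 24 stocks of Section 4.4 would be regenerated one after another on a single machine of the same speed.

```python
EXPERTS_EVALUATED = 400_000  # from the text
RUNTIME_MINUTES = 7          # from the text (approximate)
N_STOCKS = 24                # stocks considered in Section 4.4

# Average evaluation rate of the genetic algorithm.
experts_per_second = EXPERTS_EVALUATED / (RUNTIME_MINUTES * 60)

# Assumption: stocks are processed sequentially on one machine.
regeneration_minutes = N_STOCKS * RUNTIME_MINUTES
regeneration_hours = regeneration_minutes / 60

print(f"{experts_per_second:.0f} experts evaluated per second")
print(f"{regeneration_hours:.1f} h to regenerate all {N_STOCKS} experts")
```

Under these assumptions a full regeneration takes under three hours, so relaunching the genetic process daily would be feasible.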

5. CONCLUSION

In this paper the genetic approach to discovering an efficient trading expert has been presented. We have described the goals and constraints of the stock trading strategy as well as the structure of the evolutionary process to discover a best set of technical rules. The approach has been evaluated and validated on real data extracted from the Paris Stock Exchange.

The genetic algorithm is capable of efficiently pruning the very large search space, i.e. 2^200 solutions for a 200-bit chromosome representation. The genetic process is not only able to derive efficient financial predictors, but is also able to extract the relevant indicators and function parameters. In contrast to other methods, the genetic models are robust and are not trapped in local minima. Furthermore, in comparison to many other forecasting methods, the genetic algorithm is also able to determine the size of a sliding window, that is, how far back in time the input sequence is correlated with the next prediction.
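To give a sense of scale, a 200-bit chromosome admits 2^200 distinct values, of which a run of 400 000 evaluations visits only a vanishing fraction. The sketch below computes that fraction and then runs a generic genetic loop on a 200-bit string. It is an illustration only: the fitness function (`toy_fitness`, a count of set bits), the tournament selection, one-point crossover, elitism and all parameter values are assumptions chosen for brevity, not the encoding, operators or trading-return fitness used in the paper.

```python
import random

CHROMOSOME_BITS = 200
search_space = 2 ** CHROMOSOME_BITS        # exact integer, about 1.6e60
evaluated = 400_000                        # evaluations per run, from the text
fraction_explored = evaluated / search_space


def toy_fitness(bits):
    """Placeholder objective (number of 1-bits); NOT a trading return."""
    return sum(bits)


def evolve(pop_size=50, generations=40, mutation_rate=0.01, seed=0):
    """Generic GA: tournament selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(CHROMOSOME_BITS)]
           for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        pop.sort(key=toy_fitness, reverse=True)
        history.append(toy_fitness(pop[0]))
        next_pop = pop[:2]  # elitism: the two best survive unchanged
        while len(next_pop) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            p1 = max(rng.sample(pop, 3), key=toy_fitness)
            p2 = max(rng.sample(pop, 3), key=toy_fitness)
            cut = rng.randrange(1, CHROMOSOME_BITS)   # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation, applied independently to each bit.
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=toy_fitness)
    history.append(toy_fitness(best))
    return best, history


best, history = evolve()
print(f"fraction of search space per run: {fraction_explored:.2e}")
print(f"toy fitness: {history[0]} at start, {history[-1]} at end")
```

With elitism the best fitness never decreases from one generation to the next, which makes the progress of the search easy to inspect in `history`.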