Genetic programs, data-snooping, and technical analysis, page 6

Mean                0.2741    0.3031    0.2869    0.2822
SD                  0.1846    0.1533    0.1406    0.1340
Median              0.2763    0.2930    0.2901    0.2888

Proportion of coincident daily position identification
(“True” position / Position of estimated rule)
Long                0.9356    0.9500    0.9505    0.9420
Neutral             1.1537    0.1003    0.0857    0.1059
Short               0.9360    0.9646    0.9682    0.9722

Correlation between average daily returns of “true” and estimated rules
Pearson             0.7981    0.9523    0.9475    0.9620

Test of equality of means of “true” and estimated rules
Test statistic      1.5522    0.7189    0.9747    0.8976
p-Value             0.1206    0.4722    0.3297    0.3694

the simulated rule. The fourth set of results reports the correlation between the returns of the true and estimated rules, and the fifth set reports tests of the equality of the mean returns of the true and estimated rules.
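The position-coincidence and correlation measures described above can be illustrated with a minimal Python sketch. It assumes that each coincidence proportion is the share of days, among those on which the "true" rule holds a given position, on which the estimated rule holds the same position; this excerpt does not spell out the exact definition, so that reading is an assumption. The function names and the toy position series are hypothetical and are not data from the study.

```python
from collections import Counter
from math import sqrt


def coincidence_by_true_position(true_pos, est_pos):
    """Share of days, per "true" position, on which the estimated rule agrees.

    Assumed definition: proportions are conditional on the true position.
    """
    if len(true_pos) != len(est_pos):
        raise ValueError("position series must have equal length")
    days = Counter(true_pos)  # number of days spent in each true position
    hits = Counter(t for t, e in zip(true_pos, est_pos) if t == e)
    return {pos: hits[pos] / n for pos, n in days.items()}


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length with n >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Toy daily positions (illustrative only).
true_pos = ["long", "long", "neutral", "short", "short", "long"]
est_pos = ["long", "short", "long", "short", "short", "long"]
shares = coincidence_by_true_position(true_pos, est_pos)
print(shares)
```

In a simulation study of this kind, the same two functions would be applied to each simulated series and the results averaged by data length.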

The statistical tests indicate that, although the null hypothesis of equality cannot be rejected at any of the simulation lengths, a data length of 250 produces a p-value of 0.1206, indicating that a longer data series is likely warranted. Although the p-value is highest for the 500-observation series, this seems to be an artifact of the numerical implementation of the data-generating process, and should disappear if more simulations are used. This appears to be borne out by inspection of Figure 3, which plots the difference between the estimated and simulated profits for various data lengths. It does not appear that lengths of 750 or 1,000 observations significantly improve estimation results, and therefore, in the remainder of this article, 2 years of price data are used in both the training and selection of trading rules.
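As a check on how the reported test statistics relate to their p-values, the following Python sketch computes two-sided p-values under a standard normal reference distribution. The choice of a two-sided normal test is an assumption, since this excerpt does not state the reference distribution; the function name is hypothetical. The statistic and p-value pairs are those reported in the table above.

```python
from math import erfc, sqrt


def two_sided_normal_p(z):
    """Two-sided p-value for a statistic z under a standard normal null.

    Uses the identity 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    Assumption: the test is two-sided with a normal reference distribution.
    """
    return erfc(abs(z) / sqrt(2.0))


# Reported (test statistic, p-value) pairs from the table.
reported = [(1.5522, 0.1206), (0.7189, 0.4722), (0.9747, 0.3297), (0.8976, 0.3694)]

for z, p in reported:
    print(f"z = {z:.4f}  computed p = {two_sided_normal_p(z):.4f}  reported p = {p:.4f}")
```

Under this assumption the computed values agree closely with the reported ones, and none falls below conventional significance levels.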

These results provide guidance on the length of price data needed to recover an underlying technical pattern in prices. As with other simulation studies, these results apply only to the stochastic process used for the simulations, and care must be taken in extrapolating beyond the model used here. However, they do demonstrate that GP is capable of identifying technical patterns in price data.

FIGURE 3

Comparison of trading rule performance by length of simulated data.

RESULTS

Optimal trading rules are estimated for the 24 commodities listed in Table IV. Front-month futures prices are used to create a rolling price series. Observations that occur during the delivery month are excluded. The futures prices used are drawn from the CRB futures price database. Prices from January 2, 1980 through December 29, 2000 were used. Subtracting the 4 years allocated to training and selection data, 204 months of data

TABLE IV

In-Sample Returns to GP Technical Trading Rules