| Stage II training | NN2 (GRP1) | NN3 (GRP2) | NN4 (GRP3) |
|---|---|---|---|
| Number of data sets | 150 | 150 | 150 |
| Training tolerance | 0.5 | 0.5 | 0.5 |
| Number of good classifications achieved | 110 | 109 | 111 |
| Percentage of good classifications achieved | 74% | 73% | 74% |
| Learning rate used | 0.9, 0.016 | 0.9, 0.1 | 0.9, 0.1 |
| Number of hidden neurons | 11 | 11 | 11 |
Table VI. Testing results for stage II networks (NN2, NN3, and NN4)
| Stage II testing | NN2 | NN3 | NN4 |
|---|---|---|---|
| Size of data set | 20, 20, 20 | 20, 20, 20 | 20, 20, 20 |
| Number of good classifications achieved | 9, 11, 9 | 7, 12, 9 | 8, 10, 12 |
| Percentage of good classifications achieved | 45, 55, 45% | 35, 60, 45% | 40, 50, 60% |
| Testing tolerance | 0.5 | 0.5 | 0.5 |
representing the correct selection group gets a value of at least 0.5, while at the same time the other two output neurons (representing the remaining two groups) get values less than 0.5. Violation of either or both of these conditions results in a bad classification. We believe this criterion provides a reasonable level of accuracy in distinguishing a good classification from a bad one.
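The criterion above can be sketched as a small check. This is an illustration only: the function name and the list layout of the output neurons are our own, not from the paper.

```python
def is_good_classification(outputs, correct_index, tolerance=0.5):
    """Apply the good-classification criterion described above: the output
    neuron for the correct selection group must reach at least `tolerance`,
    while every other output neuron must stay below it."""
    correct_fires = outputs[correct_index] >= tolerance
    others_quiet = all(v < tolerance
                       for i, v in enumerate(outputs) if i != correct_index)
    return correct_fires and others_quiet

# Three output neurons; the correct selection group is index 0.
print(is_good_classification([0.7, 0.2, 0.4], 0))  # good classification
print(is_good_classification([0.7, 0.6, 0.1], 0))  # bad: a wrong neuron also exceeds 0.5
print(is_good_classification([0.4, 0.2, 0.1], 0))  # bad: correct neuron below 0.5
```

Note that both conditions must hold jointly, so an output pattern where no neuron reaches 0.5, or where two neurons do, counts as a bad classification.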
Table VI presents the testing results for the stage II networks. The average testing accuracies (at a 0.5 testing tolerance) are 48% for NN2, 47% for NN3, and 50% for NN4, giving a grand average test accuracy of 48%.
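The quoted averages follow directly from the per-group percentages in Table VI, as this short check shows:

```python
# Per-group testing accuracies (%) for each stage II network, from Table VI.
accuracies = {
    "NN2": [45, 55, 45],
    "NN3": [35, 60, 45],
    "NN4": [40, 50, 60],
}

# Average accuracy per network, then the grand average across networks.
per_network = {name: sum(v) / len(v) for name, v in accuracies.items()}
grand = sum(per_network.values()) / len(per_network)

for name, avg in per_network.items():
    print(f"{name}: {avg:.1f}%")
print(f"grand average: {grand:.1f}%")
```

Rounding 48.3%, 46.7%, and 50.0% to whole percentages reproduces the 48%, 47%, and 50% figures in the text, and the grand average of 48.3% rounds to the reported 48%.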
While the results shown in Table VI reflect the accuracy of the stage II networks, additional statistical tests are necessary to evaluate the performance of the neural network approach in the model selection process. To measure the accuracy of the neural networks statistically, this research uses a method similar to the one adopted by Hill, O'Connor and Remus (1996). Using MAPE as the measure of forecast accuracy, four randomly selected groups of data sets, each of size 20, are used to evaluate the performance of the stage I and stage II networks. For each data set in each group, the best forecasting method (out of the nine methods) is identified on the basis of MAPE. The results of the stage I and stage II networks are obtained for each data set. Paired t-tests are then conducted on the MAPE values for the best methods and the network-selected methods for the four groups (Iman and Conover, 1983).
To verify whether the paired t-test is appropriate for comparing the MAPE values, a test for normality is conducted on the differences in the MAPE values for each time series in the four data sets (Shapiro, 1990). The Shapiro-Wilk W tests suggest that three of the four data sets are normal (test statistics: 0.895, 0.937, and 0.930), confirming that paired t-tests are indeed appropriate for testing the equality of the MAPE values for the best methods and the network-selected methods for three of the four data sets.
Table VII gives the results of the paired t-tests conducted to test the equality of MAPE for the best forecasting methods and the neural network selected methods for the three groups of data. The results indicate that for two of the three test data sets there is no significant difference (at the 0.001 level) between the mean MAPE values for the best methods and the network-selected methods.
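The paired-comparison procedure above can be sketched in a few lines. The MAPE values below are made up for illustration and are not the paper's data; in practice `scipy.stats.ttest_rel` and `scipy.stats.shapiro` perform the paired t-test and the Shapiro-Wilk normality check directly, but the paired t statistic is simple enough to compute by hand:

```python
import math
from statistics import mean, stdev

def paired_t_statistic(x, y):
    """Paired t statistic on the per-series differences x_i - y_i
    (here: MAPE of the best method minus MAPE of the NN-selected method).
    t = mean(d) / (stdev(d) / sqrt(n)), with n - 1 degrees of freedom."""
    d = [a - b for a, b in zip(x, y)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))

# Hypothetical per-series MAPE values for one group (illustration only).
best_mape = [12.1, 9.8, 15.3, 11.0, 14.2, 10.5, 13.7, 9.9]
nn_mape   = [13.0, 10.1, 15.9, 11.4, 15.0, 10.9, 14.1, 10.3]

t = paired_t_statistic(best_mape, nn_mape)
print(f"paired t statistic on n={len(best_mape)} series: {t:.3f}")
```

The test is paired because each difference comes from the same time series forecast twice, once by the best method and once by the network-selected method; comparing the resulting t statistic against the critical value at the chosen level (0.001 in the paper) decides significance.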
Table VII. Means (and standard deviations) of MAPE for the best methods and the neural network selected methods

| Data set | Size (n) | Mean of MAPEs for best methods | Mean of MAPEs for NN-selected methods | Result of paired t-test |
|---|---|---|---|---|
| 1 | 20 | 13.69 (15.36) | 14.67 (15.80) | Significant at 0.05 |
| 2 | 20 | 15.48 (18.18) | 16.91 (18.59) | Not significant at 0.001 |
| 3 | 20 | 10.75 (11.96) | 11.69 (13.37) | Not significant at 0.001 |