| Stage II training | NN2 (GRP1) | NN3 (GRP2) | NN4 (GRP3) |
|---|---|---|---|
| Number of data sets | 150 | 150 | 150 |
| Training tolerance | 0.5 | 0.5 | 0.5 |
| Number of good classifications achieved | 110 | 109 | 111 |
| Percentage of good classifications achieved | 74% | 73% | 74% |
| Learning rate used | 0.9, 0.016 | 0.9, 0.1 | 0.9, 0.1 |
| Number of hidden neurons | 11 | 11 | 11 |
Table VI. Testing results for stage II networks (NN2, NN3, and NN4)
| Stage II testing | NN2 | NN3 | NN4 |
|---|---|---|---|
| Size of data set | 20, 20, 20 | 20, 20, 20 | 20, 20, 20 |
| Number of good classifications achieved | 9, 11, 9 | 7, 12, 9 | 8, 10, 12 |
| Percentage of good classifications achieved | 45, 55, 45% | 35, 60, 45% | 40, 50, 60% |
| Testing tolerance | 0.5 | 0.5 | 0.5 |
representing the correct selection group gets a value of at least 0.5, while at the same time the other two output neurons (representing the remaining two groups) get values less than 0.5. Violation of either or both of these conditions results in a bad classification. We believe this criterion provides a reasonable level of accuracy in distinguishing a good classification from a bad one.
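The criterion above can be sketched as a small check. This is an illustration only: the function name and the list layout of the output neurons are our own, not from the paper.

```python
def is_good_classification(outputs, correct_index, tolerance=0.5):
    """Apply the good-classification criterion described above: the output
    neuron for the correct selection group must reach at least `tolerance`,
    while every other output neuron must stay below it."""
    correct_fires = outputs[correct_index] >= tolerance
    others_quiet = all(v < tolerance
                       for i, v in enumerate(outputs) if i != correct_index)
    return correct_fires and others_quiet

# Three output neurons; the correct selection group is index 0.
print(is_good_classification([0.7, 0.2, 0.4], 0))  # good classification
print(is_good_classification([0.7, 0.6, 0.1], 0))  # bad: a wrong neuron also exceeds 0.5
print(is_good_classification([0.4, 0.2, 0.1], 0))  # bad: correct neuron below 0.5
```

Note that both conditions must hold jointly, so an output pattern where no neuron reaches 0.5, or where two neurons do, counts as a bad classification.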
Table VI presents the testing results for the stage II networks. The average testing accuracies (at a 0.5 testing tolerance) are 48% for NN2, 47% for NN3, and 50% for NN4, giving a grand average test accuracy of 48%.
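The quoted averages follow directly from the per-group percentages in Table VI, as this short check shows:

```python
# Per-group testing accuracies (%) for each stage II network, from Table VI.
accuracies = {
    "NN2": [45, 55, 45],
    "NN3": [35, 60, 45],
    "NN4": [40, 50, 60],
}

# Average accuracy per network, then the grand average across networks.
per_network = {name: sum(v) / len(v) for name, v in accuracies.items()}
grand = sum(per_network.values()) / len(per_network)

for name, avg in per_network.items():
    print(f"{name}: {avg:.1f}%")
print(f"grand average: {grand:.1f}%")
```

Rounding 48.3%, 46.7%, and 50.0% to whole percentages reproduces the 48%, 47%, and 50% figures in the text, and the grand average of 48.3% rounds to the reported 48%.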
While the results shown in Table VI reflect the accuracy of the stage II networks, additional statistical tests are necessary to evaluate the performance of the neural network approach in the model selection process. To measure the accuracy of the neural networks statistically, this research uses a method similar to the one adopted by Hill, O'Connor and Remus (1996). Using MAPE as the measure of forecast accuracy, four randomly selected groups of data sets, each of size 20, are used to evaluate the performance of the stage I and stage II networks. For each data set in each group, the best forecasting method (out of the nine methods) is identified on the basis of MAPE. The results of the stage I and stage II networks are obtained for each data set. Paired t-tests are then conducted on the MAPE values for the best methods and the network-selected methods for the four groups (Iman and Conover, 1983).
To verify whether the paired t-test is appropriate for comparing the MAPE values, a test for normality is conducted on the differences in the MAPE values for each time series in the four data sets (Shapiro, 1990). The Shapiro-Wilk W tests suggest that three of the four data sets are normal (test statistics: 0.895, 0.937, and 0.930), confirming that paired t-tests are indeed appropriate for testing the equality of the MAPE values for the best methods and the network-selected methods for three of the four data sets.
Table VII gives the results of the paired t-tests conducted to test the equality of MAPE for the best forecasting methods and the neural network selected methods for the three groups of data. The results indicate that for two of the three test data sets there is no significant difference (at the 0.001 level) between the mean MAPE values for the best methods and the network-selected methods.
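The paired-comparison procedure above can be sketched in a few lines. The MAPE values below are made up for illustration and are not the paper's data; in practice `scipy.stats.ttest_rel` and `scipy.stats.shapiro` perform the paired t-test and the Shapiro-Wilk normality check directly, but the paired t statistic is simple enough to compute by hand:

```python
import math
from statistics import mean, stdev

def paired_t_statistic(x, y):
    """Paired t statistic on the per-series differences x_i - y_i
    (here: MAPE of the best method minus MAPE of the NN-selected method).
    t = mean(d) / (stdev(d) / sqrt(n)), with n - 1 degrees of freedom."""
    d = [a - b for a, b in zip(x, y)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))

# Hypothetical per-series MAPE values for one group (illustration only).
best_mape = [12.1, 9.8, 15.3, 11.0, 14.2, 10.5, 13.7, 9.9]
nn_mape   = [13.0, 10.1, 15.9, 11.4, 15.0, 10.9, 14.1, 10.3]

t = paired_t_statistic(best_mape, nn_mape)
print(f"paired t statistic on n={len(best_mape)} series: {t:.3f}")
```

The test is paired because each difference comes from the same time series forecast twice, once by the best method and once by the network-selected method; comparing the resulting t statistic against the critical value at the chosen level (0.001 in the paper) decides significance.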
Table VII. Means (and standard deviations) of MAPE for the best methods and the neural network selected methods

| Data set | Size (n) | Mean of MAPEs for best methods | Mean of MAPEs for NN-selected methods | Result of paired t-test |
|---|---|---|---|---|
| 1 | 20 | 13.69 (15.36) | 14.67 (15.80) | Significant at 0.05 |
| 2 | 20 | 15.48 (18.18) | 16.91 (18.59) | Not significant at 0.001 |
| 3 | 20 | 10.75 (11.96) | 11.69 (13.37) | Not significant at 0.001 |