
One of the goals of this study is to determine whether the combination of the preprocessing approach and the input data distribution has an impact on the ANN classification performance. At the same time, we are interested in whether the data distribution has any influence on the choice of training technique when ANNs are applied to financial classification problems. In other words, does the data distribution–training mechanism combination have any impact on the ANN classification performance? Consequently, data with different distributions have to be generated. Several researchers (e.g. Bhattacharyya and Pendharkar, 1998; Pendharkar, 2002; Pendharkar and Rodger, 2004) studied the impact of data distributions, through kurtosis and variance–covariance homogeneity, on the classification performance of ANNs. In those studies, fictive datasets drawn from uniform, normal, logistic and Laplace distributions were used, on the grounds that these roughly correspond to kurtosis values of −1, 0, 1 and 3 respectively. In line with this research, we study the implications of the input data distribution by using five datasets: the real data and four fictive datasets with uniform, normal, logistic and Laplace distributions. The descriptive statistics, including kurtosis and skewness values for the financial ratios of the real telecom dataset, are given in Section 5. We used the characteristics of the real data to derive the four fictive datasets.
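To make the derivation of the fictive datasets concrete, the sketch below (Python/NumPy) draws, for each financial ratio, a synthetic sample whose mean and standard deviation match the real data, while the chosen distribution family controls the kurtosis. The function name and the moment-matching scheme are our own illustrative assumptions; the excerpt does not specify the exact generation procedure used in the study.

```python
import numpy as np

def generate_fictive_dataset(real_data, family, seed=None):
    """Draw a synthetic dataset whose columns match the mean and standard
    deviation of the corresponding financial ratios in `real_data`.

    `family` is one of 'uniform', 'normal', 'logistic' or 'laplace';
    their (excess) kurtosis is roughly -1, 0, 1 and 3 respectively.
    """
    rng = np.random.default_rng(seed)
    n = real_data.shape[0]
    columns = []
    for ratio in real_data.T:
        mu, sigma = ratio.mean(), ratio.std(ddof=1)
        if family == "uniform":
            # Uniform on [mu - sqrt(3)*sigma, mu + sqrt(3)*sigma] has std sigma.
            half_width = np.sqrt(3.0) * sigma
            sample = rng.uniform(mu - half_width, mu + half_width, size=n)
        elif family == "normal":
            sample = rng.normal(mu, sigma, size=n)
        elif family == "logistic":
            # Logistic with scale s has std pi*s/sqrt(3).
            sample = rng.logistic(mu, sigma * np.sqrt(3.0) / np.pi, size=n)
        elif family == "laplace":
            # Laplace with scale b has std b*sqrt(2).
            sample = rng.laplace(mu, sigma / np.sqrt(2.0), size=n)
        else:
            raise ValueError(f"unknown distribution family: {family}")
        columns.append(sample)
    return np.column_stack(columns)
```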

In this study we analyse the effects of three different factors (preprocessing method, data distribution and training mechanism), and of their combinations, on the classification performance of neural networks.

We related our research questions to what was previously reported in the literature (e.g. Pendharkar and Rodger, 2004). However, the assumptions of our study differ from those of earlier studies in several important respects:

•  The main difference is that here the GA and gradient-descent methods are used to refine the classification accuracy of an already obtained solution to the classification problem. Both the GA-based and the RT-based ANNs start from the solution obtained when determining the ANN architecture and try to refine it. All other studies compared the GA and gradient-descent methods starting from random solutions. We expect the GA-based ANN to outperform the RT-based ANN in refining what the ANN has already learned, owing to the GA's superior search capabilities.

•  The second main difference is the type of the classification problem itself. Here, we are interested in separating the input space into more than two parts (e.g. seven financial performance classes), providing more insight into the data.

•  We are interested in whether the combination of preprocessing approach, data distribution and training technique has any impact on the classifiers' predictive performance.

•  Here, non-parametric statistical tests are used to validate the hypotheses. Other studies used only t-tests or analysis of variance (ANOVA), without providing evidence that the underlying assumptions were satisfied. We also performed a three-way ANOVA to strengthen the results of the non-parametric tests.

•  Also, four different crossover operators are used in order to determine whether this operator has an influence on the GA's predictive performance. In addition to the three crossover operators presented in Pendharkar and Rodger (2004), we introduce a multipoint crossover (a sketch of this operator is given after this list).
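To illustrate the additional operator, the sketch below shows one possible multipoint crossover over two parent chromosomes encoded as flat weight vectors. The encoding, the default number of cut points and the function signature are assumptions made for illustration, not the exact operator specification used in the study.

```python
import numpy as np

def multipoint_crossover(parent_a, parent_b, n_points=3, seed=None):
    """Swap alternating segments between two equal-length weight vectors.

    `n_points` random cut positions split the chromosomes into segments;
    every second segment is exchanged between the parents.
    """
    rng = np.random.default_rng(seed)
    length = len(parent_a)
    cuts = np.sort(rng.choice(np.arange(1, length), size=n_points, replace=False))
    child_a = np.array(parent_a, dtype=float)
    child_b = np.array(parent_b, dtype=float)
    swap, start = False, 0
    for cut in list(cuts) + [length]:
        if swap:
            child_a[start:cut] = parent_b[start:cut]
            child_b[start:cut] = parent_a[start:cut]
        swap = not swap
        start = cut
    return child_a, child_b
```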

The first difference has an impact on all the hypotheses formulated in this study, since the problem addressed here is different: the GA- and RT-based ANNs improve an already existing solution rather than constructing one from scratch, and their behaviour depends on how, i.e. with what kind of method, that solution was obtained.
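As an illustration of refining rather than constructing from scratch, the following sketch seeds the initial GA population with small perturbations of the weight vector obtained during architecture determination. The population size and noise scale are arbitrary illustrative choices, not values taken from the study.

```python
import numpy as np

def seed_population(initial_weights, pop_size=50, noise_scale=0.1, seed=None):
    """Build an initial GA population around an already obtained solution.

    The first individual is the existing weight vector itself; the rest are
    copies perturbed with Gaussian noise, so the search starts by refining
    the solution instead of exploring from random points.
    """
    rng = np.random.default_rng(seed)
    base = np.asarray(initial_weights, dtype=float)
    population = [base.copy()]
    for _ in range(pop_size - 1):
        population.append(base + rng.normal(0.0, noise_scale, size=base.shape))
    return np.stack(population)
```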