Intelligent Systems in Accounting, Finance and Management: Assessing Predictive Performance of ANN-Based Classifiers

Fogel et al. (1995, 1998) used ANNs trained with EAs to analyse interpreted radiographic features from film screen mammograms. The results show that even small ANNs (with two hidden nodes and a small number of important features) can achieve results comparable to those of much more complex networks. These small networks provide ‘an even greater chance of explaining the evolved decision rules that are captured by the ANNs, leading to a greater acceptance by physicians’ (Fogel et al., 1998). Chellapilla and Fogel (1999) combined ANNs and EAs to learn appropriate, and sometimes near-expert (e.g. in chequers), strategies in zero- and non-zero-sum games such as the iterated prisoner’s dilemma, tic-tac-toe and chequers.

Many researchers (e.g. Schaffer, 1994) found that GA-based ANNs are not as competitive as their gradient-descent-like counterparts. Sexton et al. (1998) argued that this difference has nothing to do with the GA’s ability to perform the task, but rather with the way it is implemented. The candidate solutions (the ANN weights) were encoded as binary strings, which is both unnecessary and unhelpful (Davis, 1991; Michalewicz, 1992) when the ANN has a complex structure. The tendency now is to encode the weights as non-binary (real) values.
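To make the real-valued encoding concrete, the following is a minimal sketch in Python and is not taken from any of the cited studies. It assumes that each layer's weights are stored as a list of rows; the function names (`flatten_weights`, `unflatten_weights`, `gaussian_mutation`) and the mutation parameters are hypothetical choices made for illustration only.

```python
import random

def flatten_weights(layers):
    """Concatenate per-layer weight matrices (lists of rows) into one
    real-valued chromosome: one gene per connection weight."""
    return [w for layer in layers for row in layer for w in row]

def unflatten_weights(chromosome, shapes):
    """Rebuild the layer matrices from a flat chromosome.
    `shapes` holds one (rows, cols) pair per layer (an assumed layout)."""
    layers, pos = [], 0
    for rows, cols in shapes:
        layer = [chromosome[pos + r * cols: pos + (r + 1) * cols]
                 for r in range(rows)]
        layers.append(layer)
        pos += rows * cols
    return layers

def gaussian_mutation(chromosome, rate=0.1, sigma=0.1, rng=random):
    """Perturb each gene with probability `rate` by zero-mean Gaussian noise.
    The default rate and sigma are illustrative, not recommended settings."""
    return [w + rng.gauss(0.0, sigma) if rng.random() < rate else w
            for w in chromosome]

# Hypothetical network: 3 inputs -> 2 hidden nodes -> 1 output.
shapes = [(2, 3), (1, 2)]
layers = [[[0.1, -0.2, 0.3], [0.4, 0.5, -0.6]], [[0.7, -0.8]]]
chromosome = flatten_weights(layers)
restored = unflatten_weights(chromosome, shapes)
```

With this representation a chromosome has exactly one gene per weight, whereas a binary encoding with b bits per weight would need a string b times as long and a decoding step before each fitness evaluation.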

Few research papers have studied the implications of data distributions for the predictive performance of ANNs. Bhattacharyya and Pendharkar (1998) studied the impact of input distribution kurtosis and variance heterogeneity on the classification performance of different machine learning and statistical techniques. Pendharkar (2002) studied the application of a non-binary GA for learning the connection weights of an ANN under various structural designs and data distributions, finding that additive noise, size and data distribution characteristics play an important role in the learning, reliability and predictive ability of ANNs. Pendharkar and Rodger (2004) studied the implications of data distributions, characterized through kurtosis and variance–covariance homogeneity (dispersion), for the predictive performance of GA-based and gradient-descent-based ANNs for classification. Pendharkar and Rodger (2004) also studied the effect of three types of crossover operator (one-point, arithmetic and uniform crossover) on the prediction performance of GA-based ANNs. No significant difference was found between the crossover operators; however, GAs based on uniform and arithmetic crossover performed differently at a significance level of 0.1, suggesting that a statistically significant difference might emerge for larger networks (Pendharkar and Rodger, 2004). In Section 4 we present how our study differs from the above-mentioned studies.
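The three crossover operators named above can be sketched for real-valued chromosomes as follows. This is a minimal illustration in Python using common textbook definitions, which may differ in detail from the operators used by Pendharkar and Rodger (2004). The function names, the single random mixing coefficient in the arithmetic operator and the 0.5 swap probability in the uniform operator are assumptions.

```python
import random

def one_point_crossover(p1, p2, rng=random):
    """Swap the tails of two parents after a random cut point.
    Assumes both parents have the same length, at least 2."""
    cut = rng.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def arithmetic_crossover(p1, p2, rng=random):
    """Children are complementary convex combinations of the parents,
    using one random mixing coefficient for the whole chromosome."""
    a = rng.random()
    c1 = [a * x + (1 - a) * y for x, y in zip(p1, p2)]
    c2 = [(1 - a) * x + a * y for x, y in zip(p1, p2)]
    return c1, c2

def uniform_crossover(p1, p2, rng=random):
    """Each gene pair is swapped independently with probability 0.5."""
    c1, c2 = [], []
    for x, y in zip(p1, p2):
        if rng.random() < 0.5:
            x, y = y, x
        c1.append(x)
        c2.append(y)
    return c1, c2

# Two hypothetical parent weight vectors.
rng = random.Random(0)
parent_a = [0.0] * 5
parent_b = [1.0] * 5
one_point_children = one_point_crossover(parent_a, parent_b, rng)
arithmetic_children = arithmetic_crossover(parent_a, parent_b, rng)
uniform_children = uniform_crossover(parent_a, parent_b, rng)
```

One-point and uniform crossover only rearrange existing gene values between the parents, whereas arithmetic crossover creates new values that lie between them.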

Neural network training can be made more efficient if certain preprocessing steps are performed on the network inputs and targets (Demuth and Beale, 2001). Zupan et al. (1998) proposed a classification technique (HINT) that is based on function decomposition for the transformation of the input feature space. The idea is to separate the input space into two less complex, disjoint feature spaces that, when recombined, yield the original input feature space. The original input feature space can be reduced if one of the two disjoint feature spaces has redundant features. As case studies, Zupan et al. (1998) used two well-known machine-learning problems (monk1, monk2) and a housing loan allocation problem. They compared their system (HINT) with Quinlan’s C4.5 decision-tree algorithm in terms of prediction accuracy, finding that for all of these problems the system based on function decomposition yielded significantly better results.
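As a simple illustration of the input preprocessing mentioned at the start of the preceding paragraph, the sketch below standardizes each input feature to zero mean and unit variance. The text does not say which preprocessing steps are meant, so standardization is an assumption here; the function names are hypothetical and the example is unrelated to the HINT decomposition method.

```python
from statistics import mean, pstdev

def fit_standardizer(rows):
    """Compute per-feature mean and population standard deviation.
    A constant feature gets a divisor of 1.0 to avoid division by zero."""
    cols = list(zip(*rows))
    means = [mean(c) for c in cols]
    stds = [pstdev(c) or 1.0 for c in cols]
    return means, stds

def standardize(rows, means, stds):
    """Rescale every feature value to (x - mean) / std."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)]
            for row in rows]

# Hypothetical inputs: the second feature is constant.
rows = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
means, stds = fit_standardizer(rows)
scaled = standardize(rows, means, stds)
```

The statistics would normally be estimated on the training set only and then reused unchanged for the test set, so that no information from the test data enters the model.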

In this study we discuss the effect of three factors (data distribution, preprocessing method and training mechanism) and their combinations on the prediction performance of ANN-based classification models. No study in the research literature (Alander, 1995) has examined the combined impact of these factors on ANN classification performance, and this study tries to fill that gap. We compare two different ANN training mechanisms for pattern classification: one based on traditional gradient-descent training algorithms (RT-based ANN) and the other based on natural selection and evolution (GA-based ANN). We also propose an empirical procedure to determine the ANN architecture, which is kept fixed for both training mechanisms. The starting solution (initial set of weights) for both training mechanisms is obtained when we determine the ANN architecture. We identify classes of financial performance for companies in the telecommunications sector based on profitability, liquidity, solvency and efficiency financial ratios. These ratios are suggested in Lehtinen’s (1996) study of the reliability and validity of financial ratios in international comparisons.