
Expert Systems with Applications xx (2002) xxx–xxx
www.elsevier.com/locate/eswa

A genetic algorithm application in bankruptcy prediction modeling


Kyung-Shik Shin*, Yong-Joo Lee1
College of Business Administration, Ewha Womans University, 11-1 Daehyun-dong, Seodaemun-gu, Seoul 120-750, South Korea

Abstract

Prediction of corporate failure using past financial data is a well-documented topic. Early studies of bankruptcy prediction used statistical techniques such as multiple discriminant analysis, logit and probit. Recently, however, numerous studies have demonstrated that artificial intelligence techniques such as neural networks (NNs) can be an alternative methodology for classification problems to which traditional statistical methods have long been applied. Although numerous theoretical and experimental studies have reported the usefulness of NNs in classification studies, there is a major drawback in building and using the model: the user cannot readily comprehend the final rules that the NN models acquire. We propose a genetic algorithms (GAs) approach in this study and illustrate how GAs can be applied to bankruptcy prediction modeling. An advantage of the present approach using GAs is that it is capable of extracting rules that are easy for users to understand, as in expert systems. The preliminary results show that the rule extraction approach using GAs for bankruptcy prediction modeling is promising. © 2002 Published by Elsevier Science Ltd.

1. Introduction

Today, Korean financial institutions are paying a heavy price for their indiscriminate practices. Corporate bankruptcies have put several institutions on the brink of insolvency. Many others have been merged with or acquired by other financial institutions. Surviving institutions are rushing to put in place a corporate credit rating system, but are facing difficulties due to a lack of accumulated data and scientific credit rating methods. Our research pertains to bankruptcy prediction modeling, which can provide a basis for a credit rating system.

Early studies of bankruptcy prediction used statistical techniques such as multiple discriminant analysis (MDA) (Altman, 1968, 1983), logit (Ohlson, 1980) and probit (Zmijewski, 1984). Recently, however, numerous studies have demonstrated that artificial intelligence techniques such as neural networks (NNs) can be an alternative methodology for classification problems to which traditional statistical methods have long been applied (Barniv, Agarwal, & Leach, 1997; Beaver, 1966; Bell, 1997; Boritz & Kennedy, 1995; Chung & Tam, 1992; Etheridge & Sriram, 1997; Fletcher & Goss, 1993; Jo, Han, & Lee, 1997; Odom & Sharda, 1990; Salchenberger, Cinar, & Lash, 1992; Shin & Han, 1998; Tam & Kiang, 1992; Wilson & Sharda, 1994).

Although numerous theoretical and experimental studies have reported the usefulness of NNs in classification studies, there are several drawbacks in building and using the model. First, finding an appropriate NN model that reflects the problem characteristics is an art, because there are numerous network architectures, learning methods, and parameters. Second, the user cannot readily comprehend the final rules that the NN models acquire. For this reason NNs are often referred to as black boxes.

We propose a genetic algorithms (GAs) approach in this study and illustrate how GAs can be applied to corporate failure prediction modeling. An advantage of this approach is that it is capable of extracting rules that are easy for users to understand, as in expert systems.

The remainder of this paper is organized as follows. Section 2 provides a description of artificial intelligence applications for bankruptcy prediction modeling, including a review of prior studies relevant to the research topic of this paper. Section 3 provides a brief description of GAs. Section 4 describes the rule extraction approach using genetic search. Section 5 reports the model development and the results of the experiments. Section 6 discusses the conclusions and future research issues.

* Corresponding author. Tel.: 82-2-3277-2799; fax: 82-2-3277-2776. E-mail addresses: ksshin@ewha.ac.kr (K.S. Shin), yjlee@ewha.ac.kr (Y.J. Lee).
1 Tel.: 82-2-3277-2795; fax: 82-2-3277-2776.
0957-4174/02/$ - see front matter © 2002 Published by Elsevier Science Ltd. PII: S0957-4174(02)00051-9


Keywords: Genetic algorithms; Rule extraction; Bankruptcy prediction

K.-S. Shin, Y.-J. Lee / Expert Systems with Applications xx (2002) xxxxxx

2. Artificial intelligence applications in bankruptcy prediction studies

Prediction of corporate failure using past financial data is a well-documented topic. Beaver (1966), one of the first researchers to study bankruptcy prediction, investigated the predictability of 14 financial ratios using 158 samples that consisted of failed and non-failed firms. Beaver's study was followed by Altman's (1968, 1983) model, based on MDA, to classify companies into known categories. According to Altman, bankruptcy could be explained quite completely by using a combination of five (selected from an original list of 22) financial ratios. Altman utilized a paired sample design, which incorporated 33 pairs of manufacturing companies. The pairing criteria were predicated upon size and industrial classification. Classification by Altman's model, based on the value obtained for the Z score, has a predictive power of 96% for prediction one year prior to bankruptcy.

These conventional statistical methods, however, have some restrictive assumptions such as linearity, normality and independence among predictor or input variables. Considering that violations of these assumptions frequently occur with financial data (Deakin, 1976), the methods can be limited in effectiveness and validity.

Recently, a number of studies have demonstrated that artificial intelligence approaches that are less vulnerable to these assumptions, such as inductive learning, NNs, GAs and case-based reasoning (CBR), can be alternative methodologies for classification problems to which traditional statistical methods have long been applied.

While traditional statistical methods assume certain data distributions and focus on optimizing the likelihood of correct classification (Liang, Chandler, & Han, 1990), inductive learning is a technology that automatically extracts knowledge from training samples, in which induction algorithms such as ID3 (Quinlan, 1986) and classification and regression trees (CART) generate a tree-type structure to organize cases in memory. Thus, the difference between a statistical approach and an inductive learning approach is that different assumptions and algorithms are used to generate knowledge structures.

Messier and Hansen (1988) extracted bankruptcy rules using a rule induction algorithm that classifies objects into specific groups based on observed characteristic ratios. They drew their data from two prior studies and began with 18 ratios. Their algorithm developed a bankruptcy prediction rule that employed five of these ratios. This method was able to correctly classify 87.5% of the holdout data set.

Shaw and Gentry (1990) applied inductive learning methods to risk classification applications and found that inductive learning's classification performance was better than probit or logit analysis. They concluded that this result can be attributed to the fact that inductive learning is free from the parametric and structural assumptions that underlie statistical methods.

Chung and Tam (1992) compared the performance of two inductive learning algorithms (ID3 and AQ) and NNs using two measures: predictive accuracy and representation capability. Results generated by ID3 and AQ are more explainable, yet they have less predictive accuracy than NNs. The predictive accuracy of ID3 and AQ is 79.5% while that of NNs is 85.3%.

Because NNs are capable of identifying and representing non-linear relationships in the data set, they have been studied extensively in the field of financial problems including bankruptcy prediction (Barniv et al., 1997; Bell, 1997; Boritz & Kennedy, 1995; Etheridge & Sriram, 1997; Fletcher & Goss, 1993; Lee, Han, & Kwon, 1996; Odom & Sharda, 1990; Salchenberger et al., 1992; Shin & Han, 1998; Tam & Kiang, 1992; Wilson & Sharda, 1994; Zhang, Hu, Patuwo, & Indro, 1999).

NNs fundamentally differ from parametric statistical models. Parametric statistical models require the developer to specify the nature of the functional relationship, such as linear or logistic, between the dependent and independent variables. Once an assumption is made about the functional form, optimization techniques are used to determine a set of parameters that minimizes the measure of error. In contrast, NNs with at least one hidden layer use data to develop an internal representation of the relationship between variables, so that a priori assumptions about underlying parameter distributions are not required. As a consequence, better results might be expected with NNs when the relationship between the variables does not fit the assumed model (Salchenberger et al., 1992).

The first attempt to use NNs for bankruptcy prediction was made by Odom and Sharda (1990). The model had five input variables, the same as the five financial ratios used in Altman's study (Altman, 1968), one hidden layer with five nodes, and one node for the output layer. They took a research sample of 65 firms that went bankrupt between 1975 and 1982 and 64 non-bankrupt firms, 129 firms overall. Among these, 74 firms (38 bankrupt and 36 non-bankrupt firms) were used to form the training set, while the remaining 55 firms (27 bankrupt and 28 non-bankrupt firms) were used as the holdout sample. An MDA was conducted on the same training set as a benchmark. As a result, NNs correctly classified 81.81% of the holdout sample while MDA achieved only 74.28%.

Tam and Kiang (1992) compared an NN model's performance with a linear discriminant model, a logit model, the ID3 algorithm, and the k-nearest neighbor (k-NN) approach using commercial bank failure data. The bank data were collected for the period between 1985 and 1987 and consisted of 59 failed and 59 non-failed banks. Among the evaluated models, NNs showed more accurate and robust results.

Fletcher and Goss (1993) compared an NN's performance with a logit regression model. Their data were drawn from an earlier study and were limited to 36 bankrupt and


non-bankrupt firms. Their model used three financial variables and, because of the very small sample size, they used a variation of the 18-fold cross-validation analysis. The NN models had higher prediction rates than the logit regression model for almost all risk index cutoff values.

Han, Jo, and Shin (1997) also compared an NN model's performance with an MDA and a logit model, and suggested a post-model integration method that finds an optimal combinational weight for the individual models' outputs. The sample data used in this experiment were composed of 1274 medium and small sized manufacturing companies that went bankrupt during the period between 1993 and 1995 and the same number of non-bankrupt companies. Among the total of 2548 companies, 2293 companies were used for training and 255 were used as the validation set. Among the models, the integrated model had the highest level of accuracy (79.1%) on the given data sets, followed by an NN model (78.7%) and a logit model (76.5%).

CBR has also been applied to bankruptcy prediction (Bryant, 1997; Elhadi, 2000; Jo et al., 1997; Lee, 1993) and corporate bond rating problems (Buta, 1994; Shin & Han, 1999, 2001). This technique is fundamentally different from other major AI approaches. Instead of relying on associations along generalized relationships between problem descriptors and conclusions, CBR benefits from utilizing case-specific knowledge of previously experienced problem situations. A new problem is solved by finding a similar past case and reusing it in the new problem situation (Kolodner, 1993).

Buta (1994) developed a CBR model that predicts corporate bond ratings using financial data and ratings information of 1000 companies from 1991 to 1992 in S&P's Compustat database. Although performance of the system varied considerably based on the specific rating class of the company, the system matched the S&P recommended ratings for unseen cases (100 cases) 90.4% of the time.

Shin and Han (1999) proposed a case-based approach to predict the credit rating of firms. They used nearest neighbor matching algorithms to retrieve similar past cases. To find an optimal or near-optimal importance weight vector for the attributes of cases in case retrieval, they utilized a machine learning approach, GAs. Experimental results show that the model has a higher prediction accuracy rate (75.5%) than the individual methods of MDA, ID3, and CBR models with equal importance weights.

3. Genetic algorithm technique

GAs are stochastic search techniques that can search large and complicated spaces based on ideas from natural genetics and evolutionary principles (Davis, 1991; Holland, 1975; Goldberg, 1989). They have been demonstrated to be effective and robust in searching very large spaces in a wide range of applications (Colin, 1994; Han et al., 1997; Koza, 1993; Shin & Han, 1999). GAs are particularly suitable for multi-parameter optimization problems with an objective function subject to numerous hard and soft constraints. The financial application of GAs has been successful, with a growing number of applications in trading systems (Colin, 1994; Deboeck, 1994), stock selection (Mahfoud & Mani, 1995), portfolio selection (Rutan, 1993), bankruptcy prediction (Kingdom & Feldman, 1995), credit evaluation (Shin & Han, 1999; Walker, Haasdijk, & Gerrets, 1995) and budget allocation (Packard, 1990).

GAs are distinct from many conventional search algorithms in the following ways (Karr, 1995):

1. GAs consider not a single point but many points in the search space simultaneously, reducing the chance of converging to local optima;
2. GAs work directly with strings of characters representing the parameter set, not the parameters themselves;
3. GAs use probabilistic rules, not deterministic rules, to guide their search.

GAs perform the search process in four stages: initialization, selection, crossover, and mutation (Davis, 1991; Wong & Tan, 1994). Fig. 1 shows the basic steps of GAs. In the initialization stage, a population of genetic structures, called chromosomes, that are randomly distributed in the solution space is selected as the starting point of the search. After the initialization stage, each chromosome is evaluated using a user-defined fitness function. The role of the fitness function is to numerically encode the performance of the chromosome. For real-world applications of optimization methods such as GAs, choosing the fitness function is the most critical step. The mating convention for reproduction is such that only


Fig. 1. Basic steps of GAs.
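The four stages named for Fig. 1 (initialization, selection, crossover, mutation) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `run_ga`, the roulette-wheel selection scheme, the one-point crossover, the default parameter values and the toy fitness function (the number of 1 bits) are all assumptions chosen for brevity.

```python
import random


def run_ga(fitness, n_bits, pop_size=20, generations=50,
           crossover_rate=0.6, mutation_rate=0.1, seed=0):
    """Toy GA over bit strings (hypothetical helper, illustration only).

    Assumes fitness(chromosome) >= 0, n_bits >= 2 and an even pop_size.
    """
    rng = random.Random(seed)
    # Initialization: chromosomes scattered at random over the search space.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)[:]
    for _ in range(generations):
        # Selection: fitter chromosomes are more likely to be copied,
        # possibly several times; weak ones may not be chosen at all.
        weights = [fitness(c) + 1e-9 for c in pop]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        children = []
        for a, b in zip(parents[0::2], parents[1::2]):
            a, b = a[:], b[:]
            # Crossover: with some probability, swap the tails of two parents.
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_bits - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            children += [a, b]
        # Mutation: occasionally flip one randomly chosen bit of a member.
        for child in children:
            if rng.random() < mutation_rate:
                child[rng.randrange(n_bits)] ^= 1
        pop = children
        gen_best = max(pop, key=fitness)
        if fitness(gen_best) > fitness(best):
            best = gen_best[:]
    return best


# Toy usage: maximize the number of 1 bits in a 16-bit string.
best = run_ga(sum, 16)
print(best, sum(best))
```

The best chromosome seen so far is tracked separately because plain generational replacement, as sketched here, does not guarantee that the best member survives into the next generation.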

Table 1 The general rule structure

Number (j)   Which data   Less than/greater than or equal to   Cutoff value
Cond 1       VAR1i        L/G1k                                C1
Cond 2       VAR2i        L/G2k                                C2
Cond 3       VAR3i        L/G3k                                C3
Cond 4       VAR4i        L/G4k                                C4
Cond 5       VAR5i        L/G5k                                C5

Description: VARji (i = variable number, j = condition number); L/Gjk (k = 1: less than; k = 2: greater than or equal to); cutoff Cj (j = condition number).

the high-scoring members will preserve and propagate their worthy characteristics from generation to generation and thereby help in continuing the search for an optimal solution. The chromosomes with high performance may be chosen for replication several times, whereas poor-performing structures may not be chosen at all. Such a selective process causes the best-performing chromosomes in the population to occupy an increasingly larger proportion of the population over time.

Crossover forms new offspring from two randomly selected good parents. It operates by swapping corresponding segments of a string representation of the parents and extends the search for new solutions in far-reaching directions. Crossover occurs only with some probability, which is called the crossover rate. There are many different types of crossover that can be performed: the one-point, the two-point, and the uniform type (Syswerda, 1989).

Mutation is a GA mechanism in which we randomly choose a member of the population and change one randomly chosen bit in its bit string representation. Although reproduction and crossover produce many new strings, they do not introduce any new information into the population at the bit level. If the mutant member is feasible, it replaces the member that was mutated in the population. The presence of mutation ensures that the probability of reaching any point in the search space is never zero.

4. Rule extraction using GAs

Although numerous experimental studies have reported the usefulness of NNs in classification studies, there is a major drawback in building and using such a model: the user cannot readily comprehend the final rules that NN models acquire. An advantage of the present approach using GAs is that it is capable of extracting rules that are easy for users to understand, as in expert systems.

In extracting bankruptcy rules, we use an approach similar to the one that Bauer (1994) and Mahfoud and Mani (1995) suggest in their stock selection applications. We apply GAs to find thresholds (cutoffs) for one or more variables, above or below which a company is considered dangerous. For instance, if the model's structure consists of two variables representing a particular company's quick ratio and debt ratio, the final rule the GA returns might look like the following:

IF Debt ratio > 1.50 AND Quick ratio < 0.35 THEN Dangerous.

In many cases, a simplistic rule like the above example is insufficient to model relationships among financial variables. Our rule structure contains five conditions combined with AND relations. The general form of the rule that GAs generate is as follows:

IF [VAR1 is GREATER THAN OR EQUAL TO (LESS THAN) C1, AND VAR2 is GREATER THAN OR EQUAL TO (LESS THAN) C2, AND ..., AND VAR5 is GREATER THAN OR EQUAL TO (LESS THAN) C5] THEN Prediction is Dangerous.

If all five conditions are satisfied, the model produces a dangerous signal for the evaluated company. C1 to C5 denote the cutoff values, which are found through the genetic search process. The cutoff values range from 0 to 1 and represent the percentage of the data source's range. This allows the rules to refer to any data source, regardless of the values it takes on. The above rule structure is summarized in Table 1. In the table, "which data" means the data source the rule refers to.

The model selects five variables among nine alternative financial ratios. The model is also allowed to choose one variable more than once, because the rule structure of bankruptcy prediction is often highly non-linear. For example, a 10% increase in sales may result in a good signal, while a 100% increase in that same variable could result in a bad signal. This means that using multiple cutoff points to extract knowledge from a financial variable is recommended where necessary. In addition, considering the interactions among conditions, this is essential for increasing the flexibility of financial modeling.

In setting up the genetic optimization problem, we need the parameters that have to be coded for the problem and an objective or fitness function to evaluate the performance of each string. The coded parameters are the cell values of Table 1. As mentioned above, they are the input variables, the direction of comparison (above or below), and the cutoff values.
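To make the rule structure concrete, the following Python sketch decodes a flat list of coded parameters into conditions and checks whether a rule fires for one company. It is a minimal sketch under stated assumptions: the names `Condition`, `decode`, `fires` and `to_range_fraction`, the flat gene layout (variable indices, then comparison codes, then cutoffs) and min-max rescaling as the meaning of "percentage of the data source's range" are illustrative choices, not the authors' code. The comparison codes follow Table 1 (1: less than, 2: greater than or equal to).

```python
from dataclasses import dataclass


@dataclass
class Condition:
    var: int        # index of the variable in a (rescaled) data row
    op: int         # 1: less than, 2: greater than or equal to (Table 1)
    cutoff: float   # fraction of the variable's range, between 0 and 1


def to_range_fraction(value, lo, hi):
    """Rescale a raw value to its position within [lo, hi] (assumed scheme)."""
    return (value - lo) / (hi - lo) if hi > lo else 0.0


def decode(genes):
    """Split a flat gene list [vars..., ops..., cutoffs...] into conditions."""
    n = len(genes) // 3
    return [Condition(genes[j], genes[n + j], genes[2 * n + j]) for j in range(n)]


def fires(conditions, scaled_row):
    """A rule fires only if every condition holds (AND relation)."""
    for c in conditions:
        x = scaled_row[c.var]
        holds = x < c.cutoff if c.op == 1 else x >= c.cutoff
        if not holds:
            return False
    return True


# Toy two-condition rule: variable 0 >= 0.5 AND variable 1 < 0.35.
rule = decode([0, 1, 2, 1, 0.5, 0.35])
print(fires(rule, [0.8, 0.2]))   # both conditions hold
print(fires(rule, [0.4, 0.2]))   # first condition fails
```

Because cutoffs are expressed as fractions of each variable's range, the same gene encoding works for ratios with very different scales; only the rescaling step depends on the data.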
The varying parameters generate a number of combinations of our general rules. The string encoded for the experiments is as


follows:

String = {VAR1i, VAR2i, VAR3i, VAR4i, VAR5i, L/G1k, L/G2k, L/G3k, L/G4k, L/G5k, C1, C2, C3, C4, C5}.

The GAs maintain a population of strings, which are initially chosen at random. This initialization allows the GAs to explore the range of all possible solutions, and the search then tends to favor the most likely solutions. Generally, the population size is determined according to the size of the problem, i.e. a bigger population for a larger problem. The common view is that a larger population takes longer to settle on a solution, but is more likely to find a global optimum because of its more diverse gene pool. We use 100 strings in the population.

The task of defining a fitness function is always application specific. In this study, the objective of the system is to find a rule that yields the highest hit ratio when the rule is fired across the companies. Thus, we define the fitness function to be the hit ratio of the rule.

The genetic operators described in Section 3, such as crossover and mutation, are used to search for the optimal solutions. Several parameters must be defined for these operators, and the values of these parameters can greatly influence the performance of the algorithm. The crossover rate ranges from 0.5 to 0.7 and the mutation rate ranges from 0.06 to 0.12 in our experiment. As a stopping condition, we use 3000 trials. These processes are carried out by the GA software package Evolvere 4.0, called from an Excel macro.

Table 2 Selected variables

Variables   Name
X1          Value added to total asset
X2          Net income to stockholders' equity
X3          Quick ratio
X4          Liquidity ratio
X5          Current liability to total assets
X6          Retained earnings to total assets
X7          Stockholders' equity to total assets
X8          Financial expenses to sales
X9          Operating income to operating expenses

5. Experiments and results

5.1. Data and variables

The data set contains 528 externally audited mid-sized manufacturing firms, 264 of which filed for bankruptcy and 264 of which did not, during the period 1995–1997. We apply a two-stage input variable selection process. In the first stage, we select 55 variables by factor analysis, independent-samples t-test (between input variable and output variable) and Mann-Whitney U test (for qualitative variables). In the second stage, we select nine financial variables using stepwise methods to reduce the dimensionality. The aim of this approach is to first select the input variables that satisfy the univariate tests, and then to select significant variables by a stepwise method for refinement. As mentioned above, these variables are not the final ones used to form a rule, but are provided as the alternative variables for the final selection. Table 2 lists the pre-selected variables for this study.

The data set is split into two subsets, a training set and a validation (holdout) set, of 90 and 10% of the entire data, respectively. The training data are used for learning rules, and the validation data, which have not been used to develop the system, are used to test the results.

5.2. Results

Our genetic search process finally extracts five bankruptcy prediction rules. The five rules generated and the corresponding descriptions are shown in Tables 3 and 4.

Table 3 The rules generated
(each cell gives: variable code, >/< code, cutoff)

Rule number   Cond 1        Cond 2        Cond 3        Cond 4        Cond 5
Rule 1        2 1 0.426     4 1 0.847     5 1 0.520     7 1 0.595     8 1 0.665
Rule 2        2 1 0.520     2 1 0.595     3 1 0.697     7 1 0.590     8 1 0.503
Rule 3        2 1 0.426     4 1 0.560     6 2 0.082     7 1 0.590     8 1 0.520
Rule 4        2 1 0.560     3 1 0.697     6 2 0.130     7 1 0.577     8 1 0.515
Rule 5        2 1 0.560     3 1 0.697     6 2 0.082     7 1 0.590     8 1 0.520

Table 4 The description of rules

Rule number   Description
Rule 1        IF Net income to stockholders' equity is less than 0.426 (a) AND Liquidity ratio is less than 0.847 AND Current liability to total assets is less than 0.520 AND Stockholders' equity to total assets is less than 0.595 AND Financial expenses to sales is less than 0.665, THEN Dangerous
Rule 2        IF Net income to stockholders' equity is less than 0.520 AND Quick ratio is less than 0.697 AND Stockholders' equity to total assets is less than 0.590 AND Financial expenses to sales is less than 0.503, THEN Dangerous
Rule 3        IF Net income to stockholders' equity is less than 0.426 AND Liquidity ratio is less than 0.560 AND Retained earnings to total assets is greater than or equal to 0.082 AND Stockholders' equity to total assets is less than 0.590 AND Financial expenses to sales is less than 0.590, THEN Dangerous
Rule 4        IF Net income to stockholders' equity is less than 0.560 AND Quick ratio is less than 0.697 AND Retained earnings to total assets is greater than or equal to 0.130 AND Stockholders' equity to total assets is less than 0.577 AND Financial expenses to sales is less than 0.515, THEN Dangerous
Rule 5        IF Net income to stockholders' equity is less than 0.560 AND Quick ratio is less than 0.697 AND Retained earnings to total assets is greater than or equal to 0.082 AND Stockholders' equity to total assets is less than 0.590 AND Financial expenses to sales is less than 0.520, THEN Dangerous

(a) Represents the percentage of the data source's range.
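The correspondence between the coded rules of Table 3 and the descriptions of Table 4 can be illustrated with a short Python sketch. The variable names come from Table 2 and the comparison codes from Table 1; the dictionary and function names (`VARIABLE_NAMES`, `OPERATORS`, `describe_rule`) are hypothetical and introduced only for this illustration.

```python
# Variable codes 1-9 map to the pre-selected ratios X1-X9 of Table 2.
VARIABLE_NAMES = {
    1: "Value added to total asset",
    2: "Net income to stockholders' equity",
    3: "Quick ratio",
    4: "Liquidity ratio",
    5: "Current liability to total assets",
    6: "Retained earnings to total assets",
    7: "Stockholders' equity to total assets",
    8: "Financial expenses to sales",
    9: "Operating income to operating expenses",
}

# Comparison codes as defined in Table 1.
OPERATORS = {1: "is less than", 2: "is greater than or equal to"}


def describe_rule(conditions):
    """Turn (variable code, comparison code, cutoff) triples into rule text."""
    parts = [
        f"{VARIABLE_NAMES[var]} {OPERATORS[op]} {cutoff:.3f}"
        for var, op, cutoff in conditions
    ]
    return "IF " + " AND ".join(parts) + ", THEN Dangerous"


# Rule 1 as coded in Table 3.
RULE_1 = [(2, 1, 0.426), (4, 1, 0.847), (5, 1, 0.520), (7, 1, 0.595), (8, 1, 0.665)]
print(describe_rule(RULE_1))
```

Applied to Rule 1, this reproduces the wording of the first row of Table 4 (without the footnote marker).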

Table 5 The performance of derived rules (%)

              Train (476 cases)               Validation (52 cases)
Rules         Hit ratio (A)   Hit ratio (B)   Hit ratio (A)   Hit ratio (B)   Number of cases fired
Rule 1        79.0            78.8            84.6            80.0            30
Rule 2        80.7            80.0            76.9            75.0            28
Rule 3        82.6            80.0            84.6            82.1            28
Rule 4        81.6            79.6            78.9            77.8            27
Rule 5        80.2            80.0            78.9            77.8            27
Average       80.8            79.7            80.8            78.5            28

The general goal in optimization is to find the best solution to a problem. Since GAs try to find the optimal or near-optimal combination of the above search parameters, the final solution is a single one. However, in most real-world problems, one does not usually know the best possible solution. Therefore, a more realistic objective is to find alternative good solutions. We generate multiple rules by choosing multiple strings in the converged population. Since the fitness function of GAs measures the quality of a particular solution, we select the strings with high fitness values. The derived rules are therefore alternative good rules that show a high hit ratio, although there are minor differences in simulated performance.

The hit ratios calculated from the simulation results are summarized in Table 5. In Table 5, hit ratio (A) denotes the rate of correct classification when the rule is fired, while hit ratio (B) represents the overall classification accuracy on the set. The average hit ratio when the rules are fired is 80.8% for both the training and validation sets. This means that if the financial variables of a company are within the feature ranges of the derived rules, the probability of bankruptcy is about 80%.

The preliminary results above demonstrate that GAs are effective methods for extracting rules for bankruptcy prediction. Their success is due to their ability to learn non-linear relationships among the input variables. A drawback of this approach is that the model produces predictions only when the rules are fired, while NNs make predictions on every case except when explicitly restricted. The average number of cases fired by a specific rule is 28 among 52 cases (53.8%). This problem, however, can be reduced by integrating the multiple rules derived. There are many ways to integrate these rules. For example, if one of the five rules produces a Danger signal, the model may produce a Danger signal to the users.
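The two hit ratios and the simple "any rule fires" integration mentioned above can be sketched as follows. This is a hedged illustration rather than the authors' procedure: the function names are hypothetical, the data are toy values, and hit ratio (B) is computed under the assumption that a case on which the rule does not fire counts as a "not dangerous" prediction, which the text does not spell out.

```python
def hit_ratios(fired, bankrupt):
    """Return (hit ratio A, hit ratio B, number of cases fired).

    fired[i]    -- True if the rule fired on case i
    bankrupt[i] -- True if case i actually went bankrupt
    """
    n_fired = sum(fired)
    hits_when_fired = sum(1 for f, b in zip(fired, bankrupt) if f and b)
    # (A): share of fired cases that really were bankrupt.
    ratio_a = hits_when_fired / n_fired if n_fired else 0.0
    # (B): overall accuracy, ASSUMING "not fired" means "predicted safe".
    correct = sum(1 for f, b in zip(fired, bankrupt) if f == b)
    ratio_b = correct / len(fired)
    return ratio_a, ratio_b, n_fired


def any_rule_fires(fired_by_rule):
    """Integrate several rules: signal Danger if at least one rule fires."""
    return [any(case) for case in zip(*fired_by_rule)]


# Toy data: six cases, one rule.
fired = [True, True, True, False, False, False]
bankrupt = [True, True, False, True, False, False]
print(hit_ratios(fired, bankrupt))

# Toy integration of two rules over three cases.
print(any_rule_fires([[True, False, False], [False, False, True]]))
```

Combining rules with a logical OR can only increase the number of cases that receive a signal, which addresses the coverage drawback; its effect on hit ratio (A) would have to be checked on the data.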

6. Concluding remarks

We applied GAs to extract rules that can predict corporate failure. This paper is a first attempt to explore the potential of genetic-based systems to handle bankruptcy prediction problems systematically. The results show that the rule extraction approach using GAs for bankruptcy prediction modeling is promising.

This paper, however, has several limitations. First, although we derived multiple rules using traditional GAs, it is necessary to extend the GAs through the use of a niching method (Mahfoud & Mani, 1995). Unlike traditional GAs, which make the population eventually converge around a single point in the solution space, a GA that uses a niching method converges on multiple solutions or niches. Second, the current rule structure is quite limited. As a next research step, this structure will be considerably extended by incorporating additional features. More informative features are likely to lead to improved results, although we should consider the efficiency problem. Third, further improvements may be obtained by incorporating qualitative factors as well as quantitative ones. In our next research, we plan to include qualitative variables in extracting the prediction rules.

References
Altman, E. (1968). Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. The Journal of Finance, 23, 589–609.
Altman, E. (1983). Corporate financial distress: A complete guide to predicting, avoiding and dealing with bankruptcy. New York: Wiley.
Barniv, R., Agarwal, A., & Leach, R. (1997). Predicting the outcome following bankruptcy filing: A three-state classification using neural networks. Intelligent Systems in Accounting, Finance and Management, 6, 177–194.
Bauer, R. J. (1994). Genetic algorithms and investment strategies. New York: Wiley.
Beaver, W. (1966). Financial ratios as predictors of failure. Empirical research in accounting: Selected studies. Journal of Accounting Research, 4, 71–111.
Bell, T. (1997). Neural nets or the logit model? A comparison of each model's ability to predict commercial bank failures. Intelligent Systems in Accounting, Finance and Management, 6, 249–264.
Boritz, J., & Kennedy, D. (1995). Effectiveness of neural networks types for prediction of business failure. Expert Systems with Applications, 9, 503–512.
Bryant, M. S. (1997). A case-based reasoning approach to bankruptcy prediction modeling. Intelligent Systems in Accounting, Finance and Management, 6, 195–214.
Buta, P. (1994). Mining for financial knowledge with CBR. AI EXPERT, 9(2), 34–41.
Chung, H., & Tam, K. (1992). A comparative analysis of inductive learning algorithm. Intelligent Systems in Accounting, Finance and Management, 2, 3–18.
Colin, A. M. (1994). Genetic algorithms for financial modeling. In G. J. Deboeck (Ed.), Trading on the edge (pp. 148–173). New York: Wiley.
Davis, L. (1991). Handbook of genetic algorithms. New York: Van Nostrand Reinhold.
Deakin, B. E. (1976). A discriminant analysis of predictors of business failure. Journal of Accounting Research, 167–179.
Deboeck, G. J. (1994). Using GAs to optimize a trading system. In G. J. Deboeck (Ed.), Trading on the edge (pp. 174–188). New York: Wiley.
Elhadi, M. T. (2000). Bankruptcy support system; taking advantage of information retrieval case based reasoning. Expert Systems with Applications, 18, 215–219.
Etheridge, H., & Sriram, R. (1997). A comparison of the relative costs of financial distress models: Artificial neural networks, logit and multivariate discriminant analysis. Intelligent Systems in Accounting, Finance and Management, 6, 235–248.
Fletcher, D., & Goss, E. (1993). Forecasting with neural networks: An application using bankruptcy data. Information and Management, 24(3), 159–167.
Goldberg, D. E. (1989). Genetic algorithms in search, optimization and machine learning. Reading, MA: Addison-Wesley.
Han, I., Jo, H., & Shin, K. S. (1997). The hybrid systems for credit rating. Journal of the Korean Operations Research and Management Science Society, 22(3), 163–173.
Holland, J. H. (1975). Adaptation in natural and artificial systems. Ann Arbor: The University of Michigan Press.
Jo, H., Han, I., & Lee, H. (1997). Bankruptcy prediction using case-based reasoning, neural networks, and discriminant analysis. Expert Systems with Applications, 13(2), 97–108.
Karr, C. (1995). Adaptive control of an exothermic chemical reaction system using fuzzy logic and genetic algorithms. In L. R. Medsker (Ed.), Hybrid intelligent systems. New York: Kluwer.
Kingdom, J., & Feldman, K. (1995). Genetic algorithms for bankruptcy prediction. Search Space Research Report No. 01-95. London: Search Space Ltd.
Kolodner, J. (1993). Case-based reasoning. San Mateo, CA: Morgan Kaufmann.
Koza, J. (1993). Genetic programming. Cambridge, MA: The MIT Press.
Lee, H. Y. (1993). Predictive insights through analogical reasoning: Application to screening new financial service concepts. PhD thesis, The Wharton School, University of Pennsylvania.
Lee, K. C., Han, I. G., & Kwon, Y. (1996). Hybrid neural network models for bankruptcy predictions. Decision Support Systems, 18, 63–72.
Liang, T. P., Chandler, J. S., & Han, I. (1990). Integrating statistical and inductive learning methods for knowledge acquisition. Expert Systems with Applications, 1, 391–401.
Mahfoud, S., & Mani, G. (1995). Genetic algorithms for predicting individual stock performance. Proceedings of the Third International Conference on Artificial Intelligence Applications on Wall Street, 174–181.
Messier, W., & Hansen, J. (1988). Inducing rules for expert system development: An example using default and bankruptcy data. Management Science, 34(12), 1403–1415.
Odom, M., & Sharda, R. (1990). A neural networks model for bankruptcy prediction. Proceedings of the IEEE International Conference on Neural Network, 2, 163–168.
Ohlson, J. (1980). Financial ratios and the probabilistic prediction of bankruptcy. Journal of Accounting Research, 18(1), 109–131.
Packard, N. (1990). A genetic learning algorithm for the analysis of complex data. Complex Systems, 4, 543–572.
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1, 81–106.
Rutan, E. (1993). Experiments with optimal stock screens. Proceedings of the Third International Conference on Artificial Intelligence Applications on Wall Street, 269–273.
Salchenberger, L., Cinar, E., & Lash, N. (1992). Neural networks: A new tool for predicting thrift failures. Decision Sciences, 23, 899–916.
Shaw, M., & Gentry, J. (1990). Inductive learning for risk classification. IEEE Expert, 47–53.
Shin, K. S., & Han, I. (1998). Bankruptcy prediction modeling using multiple neural networks models. Proceedings of Korea Management Science Institute Conference.
Shin, K. S., & Han, I. (1999). Case-based reasoning supported by genetic algorithms for corporate bond rating. Expert Systems with Applications, 16(2), 85–95.
Shin, K. S., & Han, I. (2001). A case-based approach using inductive indexing for corporate bond rating. Decision Support Systems, 32(1), 41–52.
Syswerda, G. (1989). Uniform crossover in genetic algorithms. In J. D. Schaffer (Ed.), Proceedings of Third International Conference of Genetic Algorithms. San Mateo: Morgan Kaufmann.
Tam, K., & Kiang, M. (1992). Managerial applications of neural networks: The case of bank failure predictions. Management Science, 38(7), 926–947.
Walker, R., Haasdijk, E., & Gerrets, M. (1995). Credit evaluation using a genetic algorithm. In S. Goonatilake, & P. Treleaven (Eds.), Intelligent systems for finance and business (pp. 39–59). New York: Wiley.
Wilson, R., & Sharda, R. (1994). Bankruptcy prediction using neural networks. Decision Support Systems, 11(5), 545–557.
Wong, F., & Tan, C. (1994). Hybrid neural, genetic and fuzzy systems. In G. J. Deboeck (Ed.), Trading on the edge (pp. 245–247). New York: Wiley.
Zhang, G., Hu, Y. M., Patuwo, E. B., & Indro, C. D. (1999). Artificial neural networks in bankruptcy prediction: General framework and cross-validation analysis. European Journal of Operational Research, 116, 16–32.
Zmijewski, M. E. (1984). Methodological issues related to the estimation of financial distress prediction models. Journal of Accounting Research, 22(1), 59–82.



Kyung-Shik Shin is an Assistant Professor of Management Information Systems in the College of Business Administration at Ewha Womans University in Korea. He received his MBA from the George Washington University and his PhD from the Korea Advanced Institute of Science and Technology in 1998. His research interests include decision support systems, intelligent systems, data mining, and artificial intelligence applications for business and electronic commerce.

Yong-Joo Lee is an Associate Professor of Management Science/Operations Research in the College of Business Administration at Ewha Womans University. He received his MS degree from the Korea Advanced Institute of Science and Technology in 1982 and his PhD degree in 1991 from the Graduate School of Business, Columbia University in New York. His primary research interests are applications of linear and integer programming methods in portfolio selection, scheduling and resource allocation problems, and the modeling and analysis of queueing network problems. Some of his papers have been published in Operations Research and in the International Journal of Production Economics.
