
Artificial Intelligence based Classification in data mining for heart disease detection.

Presented by Vijaylaxmi Inamdar
Under the Guidance of Ms. Shahida, Asst. Prof., Dept. of CSE, TOCE

Agenda

Introduction
ANN Architecture
Gradient descent algorithm
Genetic algorithm
How GA is more accurate than GD
Conclusion
Future work

Introduction

According to the World Health Organization, heart disease and stroke kill some 17 million people every year, which is almost one-third of all deaths globally.
This project describes a new system for the detection of heart disease, based on a feed-forward neural network architecture and a genetic algorithm.
The term heart disease encompasses the diverse diseases that affect the heart.
With this design, a diagnosis system for heart disease detection is intended to be easy to use, cost effective, reliable and efficient.

ANN Architecture

Input Layer

Hidden Layer

Output Layer

Preprocessing

Normalization:
1. Xi = Xi / Xmax
2. Xi = (Xi - Xmin) / (Xmax - Xmin)

Fig: Preprocessing of data
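The two normalization formulas above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample ages are made up, and the slides do not say which of the two formulas was applied to which patient parameter.

```python
def max_normalize(values):
    """Formula 1: divide every value by the maximum, Xi / Xmax."""
    x_max = max(values)
    return [x / x_max for x in values]


def min_max_normalize(values):
    """Formula 2: rescale to [0, 1], (Xi - Xmin) / (Xmax - Xmin)."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min  # zero if all values are equal; not handled in this sketch
    return [(x - x_min) / span for x in values]


# Hypothetical ages of three patients, used only to exercise the functions.
ages = [40.0, 50.0, 60.0]
print(max_normalize(ages))
print(min_max_normalize(ages))
```

Formula 2 maps the smallest value to 0 and the largest to 1, whereas formula 1 only guarantees that the largest value becomes 1.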

Gradient Descent

Gradient descent is an optimization algorithm.

∇ (the gradient) is a mathematical operator. When it is applied to a scalar function, the result is a vector quantity.
This vector has a magnitude and a direction. The magnitude gives the maximum rate of increase, and the direction is the one in which that rate is achieved. The purpose of learning is to find the set of weights that minimizes the error, ideally making it equal to zero. Because the gradient of the error function points in the direction of steepest increase, the weight adjustment should be along the negative of the gradient.
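Written out, the rule described above takes the following form. The squared-error function E and the learning rate η are assumptions made for illustration; the slides only define the error as target minus output.

```latex
E(w) = \tfrac{1}{2}\,(T - O)^2, \qquad
\nabla E = \left( \frac{\partial E}{\partial w_1}, \dots, \frac{\partial E}{\partial w_n} \right), \qquad
w \leftarrow w - \eta \, \nabla E
```

Here T is the target, O is the network output, and w is the vector of weights.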

Functionality

Example (Error = Target - Output)
T = 1

Output = 0.4, Error = 1 - 0.4 = 0.6

Output = 0.6, Error = 1 - 0.6 = 0.4
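As a rough sketch of how the error shrinks from one step to the next, the following Python trains a single sigmoid neuron by gradient descent toward a target of 1. The sigmoid activation, the squared-error function, the learning rate and the starting weights are all assumptions; the slides do not give the network's actual settings, so the printed errors will not match the 0.6 and 0.4 of the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(x, target, w, b, learning_rate=0.5, steps=5):
    """Run gradient descent on one sigmoid neuron; return the error seen at each step."""
    errors = []
    for _ in range(steps):
        output = sigmoid(w * x + b)
        error = target - output  # Error = Target - Output
        errors.append(error)
        # Gradient of E = 0.5 * error**2 with respect to the pre-activation.
        delta = -error * output * (1.0 - output)
        # Adjust the weights along the negative of the gradient.
        w -= learning_rate * delta * x
        b -= learning_rate * delta
    return errors


errors = train_neuron(x=1.0, target=1.0, w=0.0, b=0.0)
for step, error in enumerate(errors):
    print(f"step {step}: error = {error:.3f}")
```

Each step moves the output closer to the target, so the error gets smaller at every step.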

Result
Training Performance

CR: Correct Results, WR: Wrong Results

The performance of the ANN during training is shown in the figure; the overall performance is 97%. Fig 7.3 shows the performance of the ANN in detecting the disease. The performance is 93% in finding the patients with heart disease (true +ve) and 100% in detecting the patients who had no heart disease (true -ve).

Testing Performance

CR: Correct Results, WR: Wrong Results

The performance of the ANN during testing is shown in the figures; the overall performance is 80%. The figure shows the performance of the ANN in detecting the disease. The performance is 71% in finding the patients with heart disease (true +ve) and 87% in detecting the patients who had no heart disease (true -ve).
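The three percentages quoted for each data set (overall, true +ve and true -ve) can be computed from the target labels and the network's predictions. The sketch below is one way to do this; the function name and the labels are made up and are not the study's data.

```python
def performance(targets, predictions):
    """Return overall, true-positive and true-negative rates as percentages."""
    pairs = list(zip(targets, predictions))
    positives = [p for t, p in pairs if t == 1]  # patients with heart disease
    negatives = [p for t, p in pairs if t == 0]  # patients without heart disease
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "overall": 100.0 * correct / len(pairs),
        "true_positive": 100.0 * sum(1 for p in positives if p == 1) / len(positives),
        "true_negative": 100.0 * sum(1 for p in negatives if p == 0) / len(negatives),
    }


# Made-up labels for illustration only (1 = disease, 0 = no disease).
targets = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
result = performance(targets, predictions)
print(result)
```

With these labels one diseased patient is missed, so the overall rate is 90%, the true +ve rate is 75% and the true -ve rate is 100%.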

Genetic Algorithm

GA is a programming technique that mimics biological evolution as a problem solving strategy. Genetic algorithms are a class of search algorithms modeled on the process of natural evolution. With regard solely to the problem of weight selection, genetic algorithms are particularly good at efficiently searching large and complex spaces to find nearly global optima. As the complexity of the search space increases, genetic algorithms present an increasingly attractive alternative to gradient-based techniques such as back propagation. A second advantage of genetic algorithms is their generality.

Fig : Flow chart of basic genetic algorithm iteration

Genetic operators & Reproductive Operator


Two types of genetic operator:

Crossover: selects genes from the parent chromosomes and creates a new offspring.
Mutation: randomly changes the new offspring.

Tournament selection is one type of reproductive operator.

In tournament selection, a number of individuals are chosen randomly from the population, and the best individual from this group is selected as a parent.
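A minimal Python sketch of the three operators described above follows. The tournament size, single-point crossover, Gaussian mutation, mutation rate and toy fitness function are all assumptions chosen for illustration; the slides do not specify which variants or settings were used.

```python
import random


def tournament_select(population, fitness, k=3, rng=random):
    """Choose k individuals at random and return the fittest of them."""
    contenders = rng.sample(range(len(population)), k)
    best = max(contenders, key=lambda i: fitness[i])
    return population[best]


def crossover(parent_a, parent_b, rng=random):
    """Single-point crossover: a prefix from one parent, the suffix from the other."""
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]


def mutate(chromosome, rate=0.1, scale=0.5, rng=random):
    """Add Gaussian noise to each gene with probability `rate`."""
    return [g + rng.gauss(0.0, scale) if rng.random() < rate else g
            for g in chromosome]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(6)]
fitness = [-sum(g * g for g in c) for c in population]  # toy fitness, higher is better

parent_a = tournament_select(population, fitness, rng=rng)
parent_b = tournament_select(population, fitness, rng=rng)
child = mutate(crossover(parent_a, parent_b, rng=rng), rng=rng)
print(child)
```

A larger tournament size k puts more pressure toward the fittest individuals, while a smaller k keeps more diversity in the population.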

Functionality:

Crossover

Mutation

Chromosomes: these directly represent the weights of the neural network.

One chromosome is simply one set of weights of the ANN.
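To make the encoding concrete, the sketch below treats a chromosome as a flat list of weights, decodes it into a small feed-forward network, and scores it with a fitness value. The layer sizes, the sigmoid activation, the bias terms and the use of negative mean squared error as fitness are all assumptions; the slides do not state the network dimensions or the fitness function.

```python
import math

N_INPUTS, N_HIDDEN = 3, 2  # hypothetical layer sizes; a single output neuron
# Each hidden neuron has N_INPUTS weights plus a bias; the output neuron
# has N_HIDDEN weights plus a bias.
CHROMOSOME_LENGTH = N_HIDDEN * (N_INPUTS + 1) + (N_HIDDEN + 1)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(chromosome, inputs):
    """Decode a flat chromosome into weights and run one forward pass."""
    genes = iter(chromosome)
    hidden = []
    for _ in range(N_HIDDEN):
        weights = [next(genes) for _ in range(N_INPUTS)]
        bias = next(genes)
        hidden.append(sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias))
    out_weights = [next(genes) for _ in range(N_HIDDEN)]
    out_bias = next(genes)
    return sigmoid(sum(w * h for w, h in zip(out_weights, hidden)) + out_bias)


def fitness(chromosome, samples):
    """Negative mean squared error over (inputs, target) pairs; higher is better."""
    errors = [(target - forward(chromosome, inputs)) ** 2 for inputs, target in samples]
    return -sum(errors) / len(errors)


zero_chromosome = [0.0] * CHROMOSOME_LENGTH
samples = [([0.2, 0.5, 0.9], 1.0), ([0.1, 0.3, 0.4], 0.0)]  # made-up patient data
print(forward(zero_chromosome, samples[0][0]))  # all-zero weights give 0.5
print(fitness(zero_chromosome, samples))
```

Because the genetic algorithm only needs a fitness value for each chromosome, it can search the weight space without computing gradients.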

Result
Training Performance

Fig: Performance in training data set (CR: Correct Results, WR: Wrong Results). Overall: 98% CR, 2% WR. True +: 95% CR, 5% WR. True -: 100% CR.

Fig: Distribution of values generated by machine (y-axis: density of solution; x-axis ticks from 0.1 to 0.9)
Testing Performance

Fig: Performance in test data set (CR: Correct Results, WR: Wrong Results; panels for the overall result, True + and True -). The CR/WR splits shown are 79%/21%, 76%/24% and 74%/26%.

Fig: Distribution of values generated by machine (y-axis: density of solution; x-axis ticks from 0.1 to 0.9)

How GA is more Accurate than GD

The gradient descent algorithm is first used as the learning rule to train the weights. The train set and test set patient parameters are applied and the outputs verified. The result with the train set data is 100% and the result with the test set data is 88.89%. To improve accuracy, the genetic algorithm is then used as the learning rule. The result with both the train set and the test set data is 100%. The trained ANN predicts whether heart disease is present or absent based on the patient parameters.

Conclusion

The conclusion is that the genetic algorithm is more efficient than gradient descent. Because the error surface is very irregular, the gradient descent algorithm can get stuck in local minima, which may produce a suboptimal solution (not always; this depends on the application). To overcome this problem, a heuristic search method such as the genetic algorithm can be applied to look for a global solution.

Future work

The ANN can be trained with a larger training set of patient parameters. The number of input parameters can also be increased to support a more accurate decision, and the system can be further enhanced to detect other related diseases.
