Predictor variables: Account Length, VMail Message, Day Mins, Eve Mins, Night Mins,
Intl Mins, CustServ Calls, Day Calls, Day Charge, Eve Calls, Eve Charge, Night Calls,
Night Charge, Intl Calls, Intl Charge, State, Area Code
Dependent variable: CHURN
Descriptive Statistics
Variable            N     Minimum  Maximum  Mean     Std. Deviation
Account Length      3333           243      101.06   39.822
VMail Message       3333           51       8.10     13.688
Day Mins            3333  .0       350.8    179.775  54.4674
Eve Mins            3333  .0       363.7    200.980  50.7138
Night Mins          3333  23.2     395.0    200.872  50.5738
Intl Mins           3333           20       10.24    2.792
Day Calls           3333           165      100.44   20.069
Day Charge          3333  .00      59.64    30.5623  9.25943
Eve Calls           3333           170      100.11   19.923
Eve Charge          3333  .00      30.91    17.0835  4.31067
Night Calls         3333  33       175      100.11   19.569
Night Charge        3333  1.04     17.77    9.0393   2.27587
Intl Charge         3333  .0       5.4      2.765    .7538
Area Code           3333  408      510      437.18   42.371
Valid N (listwise)  3333
From the above table, the VIF for every variable is less than 10, which indicates no
serious multicollinearity, so no variable needs to be dropped
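The VIF rule of thumb used here can be illustrated with a small sketch. It assumes NumPy is available and uses synthetic columns `a`, `b`, `c` (hypothetical names, not the churn data) purely to show how the statistic behaves; it relies on the identity that each predictor's VIF equals the matching diagonal element of the inverse correlation matrix.

```python
import numpy as np

def vif(X):
    """Return the VIF of each column of X (rows = cases, columns = predictors).

    VIF_j is the j-th diagonal element of the inverse of the
    predictors' correlation matrix.
    """
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Synthetic illustration only: c is almost a copy of a, b is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.05 * rng.normal(size=500)
X = np.column_stack([a, b, c])

vifs = vif(X)
for name, value in zip(["a", "b", "c"], vifs):
    verdict = "candidate to drop" if value >= 10 else "keep"
    print(f"{name}: VIF = {value:.1f} ({verdict})")
```

In this toy data the near-duplicate pair `a` and `c` exceeds the threshold of 10 while `b` stays close to 1, which is the pattern the rule is meant to detect.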
KMO and Bartlett's Test
KMO value           .510
Approx. Chi-Square  7421.300
df                  120
Sig.                .000
Decision tree
Sample    Observed            Predicted 0  Predicted 1  Percent Correct
Training  0                   1966         48           97.6%
          1                   246          83           25.2%
          Overall Percentage  94.4%        5.6%         87.5%
Test      0                   805          31           96.3%
          1                   129          25           16.2%
          Overall Percentage  94.3%        5.7%         83.8%
Growing Method: CHAID
Dependent Variable: Churn
Training accuracy is 87.5% and test accuracy is 83.8%, so the model can be expected to
classify a new sample correctly about 83.8% of the time
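The percentages in the CHAID classification output follow directly from the cell counts. The sketch below recomputes them; the function name `summarize` and the nested-list layout are illustrative choices, while the counts are the ones reported for the training and test samples.

```python
def summarize(confusion):
    """Return (per-class percent correct, overall percent correct).

    confusion[i][j] is the number of cases observed in class i
    and predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    per_class = [100 * row[i] / sum(row) for i, row in enumerate(confusion)]
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return per_class, 100 * correct / total

# Counts from the CHAID classification output (rows: observed 0, 1).
training = [[1966, 48], [246, 83]]
test = [[805, 31], [129, 25]]

for name, matrix in [("Training", training), ("Test", test)]:
    per_class, overall = summarize(matrix)
    print(f"{name}: class 0 = {per_class[0]:.1f}%, "
          f"class 1 = {per_class[1]:.1f}%, overall = {overall:.1f}%")
```

The overall figure is dominated by the non-churn class: on the test sample only 25 of 154 churners are identified, even though overall accuracy is 83.8%.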
Split 70:30 Method CRT
Classification
Sample    Observed            Predicted 0  Predicted 1  Percent Correct
Training  0                   1983         69           96.6%
          1                   172          177          50.7%
          Overall Percentage  89.8%        10.2%        90.0%
Test      0                   767          31           96.1%
          1                   62           72           53.7%
          Overall Percentage  88.9%        11.1%        90.0%
Growing Method: CRT
Dependent Variable: Churn
Training accuracy is 90.0% and test accuracy is 90.0%, so the model can be expected to
classify a new sample correctly about 90% of the time
Classification
Sample    Observed            Predicted 0  Predicted 1  Percent Correct
Training  0                   1703         32           98.2%
          1                   193          86           30.8%
          Overall Percentage  94.1%        5.9%         88.8%
Test      0                   1095         20           98.2%
          1                   147          57           27.9%
          Overall Percentage  94.2%        5.8%         87.3%
Growing Method: CHAID
Dependent Variable: Churn
Classification
Sample    Observed            Predicted 0  Predicted 1  Percent Correct
Training  0                   1542         142          91.6%
          1                   117          172          59.5%
          Overall Percentage  84.1%        15.9%        86.9%
Test      0                   1078         88           92.5%
          1                   92           102          52.6%
          Overall Percentage  86.0%        14.0%        86.8%
Growing Method: CRT
Dependent Variable: Churn
Observed            Predicted 0  Predicted 1  Percent Correct
0                   2750         100          96.5%
1                   234          249          51.6%
Overall Percentage  89.5%        10.5%        90.0%
Observed            Predicted 0  Predicted 1  Percent Correct
0                   2698         152          94.7%
1                   241          242          50.1%
Overall Percentage  88.2%        11.8%        88.2%
Growing Method: CHAID
Dependent Variable: Churn
CRT with a 70:30 split gives a training accuracy of 90.0% and a test accuracy of 90.0%
The best method is CRT with a 70:30 split
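The comparison behind this choice can be sketched by recomputing test accuracy for each tree that reports a held-out sample. The counts are the test rows of the classification outputs; the dictionary labels are illustrative, since only the CRT 70:30 run is named with its split in the text. Recall for the churn class is included as an additional view, not as a criterion the analysis itself used.

```python
# Test-sample confusion matrices (rows: observed 0, 1; columns: predicted 0, 1).
test_tables = {
    "CHAID, first table": [[805, 31], [129, 25]],
    "CRT, 70:30 split": [[767, 31], [62, 72]],
    "CHAID, third table": [[1095, 20], [147, 57]],
    "CRT, fourth table": [[1078, 88], [92, 102]],
}

def accuracy(m):
    """Share of all test cases that were classified correctly."""
    return (m[0][0] + m[1][1]) / (sum(m[0]) + sum(m[1]))

def churn_recall(m):
    """Share of observed churners (class 1) that were predicted as churners."""
    return m[1][1] / sum(m[1])

results = {name: (accuracy(m), churn_recall(m)) for name, m in test_tables.items()}
for name, (acc, rec) in results.items():
    print(f"{name}: test accuracy = {acc:.1%}, churn recall = {rec:.1%}")

# Pick the configuration with the highest test accuracy.
best = max(results, key=lambda name: results[name][0])
print("Highest test accuracy:", best)
```

On these counts the CRT 70:30 tree has both the highest test accuracy and the highest share of churners correctly identified.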
Logistic regression
Classification Table
        Observed            Predicted Churn 0  Predicted Churn 1  Percentage Correct
Step 1  Churn 0             2759               30                 98.9
        Churn 1             436                42                 8.8
        Overall Percentage                                        85.7
               Chi-square  df  Sig.
Step 1  Step   357.955     15  .000
        Block  357.955     15  .000
        Model  357.955     15  .000
Sig. is less than 0.05, which means the predictor variables jointly have a significant impact on churn
Model Summary
Step  -2 Log likelihood  Cox & Snell R Square  Nagelkerke R Square
1     2361.871           .104                  .184
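As a consistency check, the two pseudo R-square values can be reproduced from the reported -2 log likelihood and model chi-square using the standard Cox & Snell and Nagelkerke definitions. The sketch assumes the number of cases in the fit equals the total of the logistic classification table (2759 + 30 + 436 + 42 = 3267); that sample size is inferred, not stated in the output.

```python
import math

neg2ll_model = 2361.871   # -2 Log likelihood of the fitted model
model_chi2 = 357.955      # model chi-square (drop in -2LL from the null model)
n = 2759 + 30 + 436 + 42  # assumption: cases in the classification table = cases fitted

# -2LL of the intercept-only (null) model.
neg2ll_null = neg2ll_model + model_chi2

# Cox & Snell: 1 - (L_null / L_model)^(2/n)
cox_snell = 1 - math.exp(-model_chi2 / n)

# Nagelkerke rescales Cox & Snell by its maximum attainable value,
# 1 - L_null^(2/n), so that the statistic can reach 1.
max_cox_snell = 1 - math.exp(-neg2ll_null / n)
nagelkerke = cox_snell / max_cox_snell

print(f"Cox & Snell R Square: {cox_snell:.3f}")
print(f"Nagelkerke R Square:  {nagelkerke:.3f}")
```

Under that assumption the computed values round to .104 and .184, matching the Model Summary.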
Neural network
Classification
Sample    Observed         Predicted 0  Predicted 1  Percent Correct
Training  0                1933         53           97.3%
          1                251          77           23.5%
          Overall Percent  94.4%        5.6%         86.9%
Testing   0                774          29           96.4%
          1                108          41           27.5%
          Overall Percent  92.6%        7.4%         85.6%
Dependent Variable: Churn
.788
.788
Performance evaluation
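Conclusion:
From the above analysis, the decision tree grown with CRT on a 70:30 split is the best
technique: its 90.0% test accuracy is higher than the 85.7% overall accuracy of logistic
regression and the 85.6% testing accuracy of the neural network
This model can help predict future churn from the predictor variables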