PREDICTING CUSTOMER CHURN AT QWE INC.


Group 10: Richard Ely, Yuchen Luo, Xinyu (Frank) Meng, Yijia He, Simeng Yin

Agenda
Executive Summary
Methodology
Multiple-variable Logistic Regression (MLR)
Decision Tree
Recommendation

Executive Summary
Problem: how to estimate the probability that a given customer will leave, and how to identify the drivers that contributed most to that customer's decision
Decisions to make:
- choose a methodology
- identify the 3 variables most influential on the probability of churn
Recommendation:
- CHI (Customer Happiness Index) Score in December, change in login recency, and change in login frequency are the top three predictors
- Decision Tree is the better model
- QWE Inc. must analyze the cost of losing a customer and the cost of retaining a customer to determine the best predictive model

Relationship between Age and Churn does not align with Mr. Wall's belief

- Ages 6 and 14 months are not good cutoff points
- Only customers with age > 35 months are less likely to leave

Mr. Wall's Belief of Age vs. Churn

Customer Age (in months)    Likelihood to Churn
<6                          Less Likely
6-14                        Most Likely
>14                         Least Likely

[Chart: Percentage Churn by Age]
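A churn-by-age comparison like this one can be sketched by bucketing customers into Mr. Wall's three age segments and computing the share that churned in each. This is a minimal sketch: the function names and the sample records are hypothetical and are not QWE data.

```python
from collections import defaultdict

def age_bucket(age_months):
    """Map customer age in months to the segments <6, 6-14, >14."""
    if age_months < 6:
        return "<6"
    if age_months <= 14:
        return "6-14"
    return ">14"

def churn_rate_by_bucket(customers):
    """customers: iterable of (age_months, churned) pairs.

    Returns a dict mapping each observed bucket to its churn rate.
    """
    totals = defaultdict(int)
    churned = defaultdict(int)
    for age, did_churn in customers:
        bucket = age_bucket(age)
        totals[bucket] += 1
        if did_churn:
            churned[bucket] += 1
    return {bucket: churned[bucket] / totals[bucket] for bucket in totals}

# Hypothetical records for illustration only (not taken from the QWE data set).
sample = [(3, False), (5, True), (8, True), (12, False), (20, False), (40, False)]
rates = churn_rate_by_bucket(sample)
for bucket, rate in rates.items():
    print(f"{bucket}: {rate:.0%} churn")
```

Running the same computation with finer buckets (for example, one per month) is one way to check whether 6 and 14 are sensible cutoff points.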

Top 3 factors in Multiple-variable Logistic Regression - CHI Score in Dec, Change in Login Recency, Change in CHI Score

Best factors because:
- Smaller p-value: statistically significant
- Larger standardized coefficient magnitude: more weight in predicting churn probability

Business insights:
- Be aware of current satisfaction level

Variable                           Standardized Coefficient    P-value
CHI Score in Dec.                  -0.37                       1.87e-07 ***
Days Since Last Login (Dec-Nov)     0.31                       6.30e-05 ***
CHI Score (Dec-Nov)                -0.29                       2.80e-05 ***
Customer Age                        0.17                       0.00403 **
Views (Dec-Nov)                    -0.36                       0.00467 **
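One way to read the standardized coefficients is to convert them to odds multipliers. The sketch below assumes each coefficient is the change in the log-odds of churn per one standard deviation of its variable, which is the usual meaning for a logistic regression fit on standardized inputs; the slides do not state this explicitly.

```python
import math

# Standardized coefficients copied from the regression table.
std_coefs = {
    "CHI Score in Dec.": -0.37,
    "Days Since Last Login (Dec-Nov)": 0.31,
    "CHI Score (Dec-Nov)": -0.29,
    "Customer Age": 0.17,
    "Views (Dec-Nov)": -0.36,
}

# Assumption: coefficients are log-odds per +1 standard deviation,
# so exp(coefficient) is the factor by which the odds of churn change.
odds_multipliers = {name: math.exp(b) for name, b in std_coefs.items()}

for name, mult in sorted(odds_multipliers.items(), key=lambda kv: kv[1]):
    print(f"{name}: odds of churn x {mult:.2f} per +1 SD")
```

Under that assumption, a multiplier below 1 (such as the one for CHI Score in Dec.) means a higher value goes with lower odds of churn, and a multiplier above 1 means the opposite.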

MLR with Five Variables Is Not Good at Predicting Churn Customers

Methodology: five variables with statistically significant coefficients
- CHI Score in Dec
- Days Since Last Login (Dec-Nov)
- CHI Score (Dec-Nov)
- Customer Age
- Views (Dec-Nov)

Conclusion: MLR is more sensitive than single-variable logistic regression (SLR), but neither gives an accurate prediction
Slight Improvement
- Smaller AIC and residual deviance
Doubtful Accuracy
- Huge error
- Predicts only 4.0% of churn customers (TPR = 4%)

Logistic Regression Model    AIC       Residual Deviance
Single-Variable              2510.6    2506.6
Multiple-Variable            2459.4    2447.4
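The two columns of the AIC table are linked: for a logistic regression fit on individual 0/1 outcomes, the residual deviance equals -2 times the log-likelihood, so AIC = residual deviance + 2k, where k is the number of estimated parameters. The short check below assumes k counts the intercept plus one coefficient per variable; the helper name is illustrative.

```python
def aic_from_deviance(residual_deviance, n_params):
    """AIC = residual deviance + 2 * (number of estimated parameters)."""
    return residual_deviance + 2 * n_params

# Assumption: parameter count = intercept + number of variables.
slr_aic = aic_from_deviance(2506.6, n_params=2)  # intercept + 1 variable
mlr_aic = aic_from_deviance(2447.4, n_params=6)  # intercept + 5 variables

print(f"SLR AIC: {slr_aic:.1f}")
print(f"MLR AIC: {mlr_aic:.1f}")
```

Both results agree with the AIC column (2510.6 and 2459.4), so the improvement of the multiple-variable model holds even after the penalty for its four extra parameters.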

Reasons:
- K-Nearest Neighbor is not ideal
- Hard to visualize with more than 3 variables
- Difficult to create actionable insights
- Comparatively, Decision Tree is clearer

Top 3 Predictors in Decision Tree - Change in Login Recency, Change in Login Frequency, Customer Age

Business insight:
- Be aware of changes in customer activity
- Age can be used to segment customers

[Figure: decision tree with branches marked "Condition met" / "Condition unmet". Split conditions: Change in Login Recency < 18; Change in Login Frequency >= 2.5; Customer Age >= 22; Change in Views >= -140; Change in Login Frequency >= 1; Age < 11.5; Age > 12; Change in Views >= 4. Leaves are labeled "Predict: Stay" or "Predict: Churn" with customer counts.]

Decision Tree - An Extract of Predicted Churn Customers

ID     Actual State    Prediction    Logins (Dec-Nov)    Customer Age    Days Since Last Login (Dec-Nov)    Correct?
257    Churn           Churn                             12              31                                 Yes
266    Churn           Churn                             12              30                                 Yes
279    No Churn        Churn                             12              31                                 No
317    Churn           Churn                             12              31                                 Yes
335    Churn           Churn         -7                  12              19                                 Yes

Good correct-prediction rate
- Change in Login Recency > 18
- Customer Age = 12
- Change in Login Frequency < 1
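The three conditions listed for this extract can be written as a single predicate. This is a sketch of that one path only, not of the full decision tree. It assumes that "Change in Login Recency" corresponds to Days Since Last Login (Dec-Nov) and "Change in Login Frequency" to Logins (Dec-Nov); the function name is hypothetical.

```python
def on_extract_churn_path(recency_change, customer_age, login_change):
    """Return True when a customer satisfies all three listed conditions.

    recency_change: Days Since Last Login (Dec-Nov), assumed to be the
                    "Change in Login Recency" variable
    customer_age:   customer age in months
    login_change:   Logins (Dec-Nov), assumed to be the
                    "Change in Login Frequency" variable
    """
    return recency_change > 18 and customer_age == 12 and login_change < 1

# Customer 335 from the extract: logins -7, age 12, days since last login 19.
print(on_extract_churn_path(recency_change=19, customer_age=12, login_change=-7))
```

Writing a path out this way makes it easy to apply the same rule to a customer list and count how many accounts it flags.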

Decision Tree Excellent at Avoiding False Classification

Strengths
- High precision (84.4%)
- Low False Positive Rate (0.1%)
- Business insight: better allocation of resources to help retention
Weaknesses
- Low True Positive Rate (8.7%)
- Business insight: unable to identify all potential churn customers, so no action is taken to retain those who are missed

Trade-off between level of accuracy and number of predicted churns
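The three rates quoted on this slide all come from the same confusion matrix. The sketch below shows the standard definitions; the counts passed in are hypothetical and chosen only to illustrate the pattern of high precision with a low true positive rate. They are not the counts behind the slide's figures.

```python
def classification_rates(tp, fp, fn, tn):
    """Compute precision, true positive rate and false positive rate.

    tp: churners predicted as churn     fp: stayers predicted as churn
    fn: churners predicted as stay      tn: stayers predicted as stay
    """
    return {
        "precision": tp / (tp + fp),  # share of predicted churners who churn
        "tpr": tp / (tp + fn),        # share of actual churners who are caught
        "fpr": fp / (fp + tn),        # share of actual stayers flagged in error
    }

# Hypothetical counts for illustration only.
rates = classification_rates(tp=30, fp=5, fn=270, tn=5695)
for name, value in rates.items():
    print(f"{name}: {value:.1%}")
```

With counts like these, almost every flagged customer is a real churner, yet most churners are never flagged, which is the trade-off the slide describes.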

Customer 672, 354, 5203 Churn Probability Prediction by Models - Decision Tree Is Clearer

Customer ID    Probability of Churn (SLR)    Probability of Churn (MLR)    Prediction of Decision Tree    Actual State
672            3.3%                          3.4%                          No Churn                       No Churn
354            3.5%                          3.6%                          No Churn                       No Churn
5203           6.4%                          4.1%                          No Churn                       No Churn

- Correct prediction generated by all models
- MLR is more accurate than SLR
- Decision tree generates a clearer answer

Recommendation
- CHI Score in December, change in login recency, and change in login frequency are the top three predictors
- QWE Inc. must analyze the cost of losing a customer and the cost of retaining a customer to determine an acceptable accuracy measure
  - if cost of losing > cost of retaining: adjust the decision tree to identify more churn customers
  - if cost of losing < cost of retaining: use the current decision tree, which has a high precision rate
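The cost comparison above can be sketched as a break-even rule. This is a simplified illustration, not an analysis from the deck. It assumes that a retention action always succeeds and that both costs are known per customer. The function names and the cost figures are hypothetical.

```python
def break_even_probability(cost_of_retaining, cost_of_losing):
    """Churn probability above which a retention action pays off in expectation.

    Assumes the retention action always prevents the churn.
    """
    if cost_of_losing <= 0:
        raise ValueError("cost_of_losing must be positive")
    return min(1.0, cost_of_retaining / cost_of_losing)

def should_intervene(churn_probability, cost_of_retaining, cost_of_losing):
    """Intervene when the expected loss from churn exceeds the retention cost."""
    return churn_probability * cost_of_losing > cost_of_retaining

# Hypothetical costs: 50 to retain a customer, 1000 lost if the customer churns.
threshold = break_even_probability(cost_of_retaining=50, cost_of_losing=1000)
print(f"break-even churn probability: {threshold:.1%}")
print(should_intervene(0.064, cost_of_retaining=50, cost_of_losing=1000))
print(should_intervene(0.033, cost_of_retaining=50, cost_of_losing=1000))
```

The larger the cost of losing a customer is relative to the cost of retaining one, the lower the break-even probability, so more customers are worth flagging even at the price of lower precision.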

THANK YOU!
