
Data Science and Operations

Marshall School of Business


University of Southern California
Case 3 LAB Smart Partyware
BUAD 425 - 2 units
DATA ANALYSIS FOR DECISION MAKING
Scenario
Applichem is interested in diversifying its portfolio, and one of the companies it is interested in is Smart Partyware Company. Smart Partyware Company (SPW) is in the niche party ware business, currently has a fixed customer base, and sells innovative plastic party ware to its members. Since it sells a plastic product, it may be a good vertical acquisition target.
The Smart Partyware Company's business model is direct-to-consumer marketing. Over the years it has gained dedicated upscale customers and currently has 500,000 members in its database.
In the direct-marketing industry, the response rate is measured as the percentage of customers who buy the directly mailed product. Smart Partyware's historical response rate for direct mail to selected members is approximately 10%, far above the industry average. SPW was using RFM (Recency-Frequency-Monetary) analysis to target customers. Smart Partyware wants to increase the response rate well beyond the 10% rate.
SPW designs new party ware for every campaign, gives a new name to its party ware, and broadly classifies the party ware under one of its many party themes. Most of the designs cut across many themes but are classified into a particular category based on the main design theme in the party ware. The most recent product to be marketed is Celebrating American Arts. It has famous American art works printed on the party ware, and even though it falls under the Art Party theme, the party ware can be used just as well for a pool party, a barbeque, or one of the other parties. For analysis purposes, if the member bought the American Arts package, the value of the Art Party variable increases by one.
Exhibit 1: Partial List of Variables in SPW Database
Variable Name               Description
Seq#                        Sequence number in the partition
ID#                         Identification number in the full (partitioned) market test data set
Gender                      0 = Male, 1 = Female
M                           Monetary: total money spent on Partyware
R                           Recency: months since last purchase
F                           Frequency: total number of purchases
FirstPurch                  Months since first purchase
Sports Party                Number of purchases from the category: Sports Party
Pool Party                  Number of purchases from the category: Pool Party
Barbeque Party              Number of purchases from the category: Barbeque Party
Birthday Party              Number of purchases from the category: Birthday Party
End-of-School-Term Party    Number of purchases from the category: End-of-School-Term Party
Art Party                   Number of purchases from the category: Art Party
Block Party                 Number of purchases from the category: Block Party
Cooking Party               Number of purchases from the category: Cooking Party
Get Together                Number of purchases from the category: Get Together
Movie Night                 Number of purchases from the category: Movie Night
Success                     = 1 if Celebrating American Arts was bought, = 0 if not
500k = population; 2k = sample size
Each marketing campaign starts with a trial marketing to 2,000 members: the newly designed party ware is sent to 2,000 randomly selected members from the database, and they have one week to respond. The packages come with paid return postage; if the member likes the package, he or she can keep it, otherwise it has to be returned within one week. After two weeks, SPW has all the data it needs to go for mass marketing. The current company policy is not to send packages to more than 100,000 members, so that the members do not become tired of repeated marketing campaigns. The members always have the opportunity to visit the SPW Website and buy current and old packages. Most of the old packages are returned packages from marketing campaigns and are sold at discounted values. After analyzing the recent Celebrating American Arts trial marketing data, it is found that 11.2% of the members have bought the new package.
The selling price for the package is $60, the mailing cost is $4.50, and the return mail cost is the same. The total cost of producing the package is $10. If the package is returned, it can be sold at a discounted rate or destroyed; historically the expected salvage value has been $15. Based on these assumptions, it is calculated that if the package is mailed to 100,000 randomly selected members then the profit from the marketing campaign will be $154,000, and if they can mine the data perfectly and send only to the members interested in the package they will make $2,548,000 (the maximum profit that the company can get from the campaign if they know who is going to be the buyer, but essentially this is impossible). The range is extremely wide. Currently, SPW is making an average profit of $700,000 per marketing campaign and a yearly profit of $8.4 million.
Selling price per Product        60
Cost per Product                 10
Salvage Value per Product        15
Cost of Mailing the Product      4.5
Cost of Returning the Product    4.5

Based on the Training Data Set,
Level        Count    Prob
Non Buyer    888      0.88800
Buyer        112      0.11200
Total        1000     1.00000
Buyers = 11.2% buyers for this Product. If we assume there are 500,000 potential members, then the total number of Buyers among the 500,000 members is 500,000 * 0.112 = 56,000 Buyers.
Profit per Product after mailing cost = 60 - 10 - 4.5 = 45.5
Cost of mailing the Product to a non-buyer = -10 - 4.5 - 4.5 + 15 = -4
The maximum profit that can be made is 56,000 * 45.5 = $2,548,000, if we mail the Product only to the buyers.
If we mail more Products, then some of the Products will be returned and it will cost us money. This cost is $4 = (Product cost - Salvage + postage both ways) = (10 - 15 + 9).
The Marketing department has suggested it is prudent to mail the Products to a maximum of 100,000 members so that the Product club members do not become tired of repeated marketing campaigns.
Let us calculate the Baseline profit if we mail the Product randomly,
= 100,000 * 0.112 * 45.5 + 100,000 * 0.888 * (-4) = 154,400 (approximately $154,000)
The low-case scenario is about $154,000 and the best-case scenario is $2,548,000.
There are two ways to increase our baseline profit: increase the percentage of buyers identified and reduce the number of Products shipped (the range will be between 56,000 and 100,000).
Our objective is to beat the average profit of $700,000 by using the decision tree method or the logistic regression method.
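The per-package economics above can be checked with a few lines of Python. This is a minimal sketch using only the figures quoted in the case; the constant and function names are illustrative and not part of the JMP workflow. It reproduces the 45.5 and -4 unit values, the $2,548,000 maximum, and a random-mailing baseline of 154,400, which the handout quotes rounded as $0.154 million.

```python
# Figures quoted in the case text.
SELLING_PRICE = 60.0
PRODUCT_COST = 10.0
SALVAGE_VALUE = 15.0
MAILING_COST = 4.5
RETURN_COST = 4.5

POPULATION = 500_000    # members in the database
MAILING_CAP = 100_000   # company policy: mail at most this many members
RESPONSE_RATE = 0.112   # 112 buyers in the 1,000 training rows

# Profit when a mailed member keeps the package.
profit_per_buyer = SELLING_PRICE - PRODUCT_COST - MAILING_COST
# Profit (negative) when the package is returned and salvaged.
profit_per_non_buyer = -PRODUCT_COST - MAILING_COST - RETURN_COST + SALVAGE_VALUE


def campaign_profit(n_mailed, buy_rate):
    """Expected profit when n_mailed packages go out and buy_rate of them are kept."""
    buyers = n_mailed * buy_rate
    non_buyers = n_mailed - buyers
    return buyers * profit_per_buyer + non_buyers * profit_per_non_buyer


# Perfect targeting: mail only the expected buyers, all of whom keep the package.
max_profit = campaign_profit(POPULATION * RESPONSE_RATE, 1.0)
# Random mailing up to the cap: 11.2% of the recipients buy.
baseline_profit = campaign_profit(MAILING_CAP, RESPONSE_RATE)

print(profit_per_buyer, profit_per_non_buyer)
print(round(max_profit), round(baseline_profit))
```

Any targeting model is then judged by where its profit falls between these two bounds.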
Plan of action
Use the recent Celebrating American Arts trial marketing data to prove we can do a better job than RFM analysis.
1. Provide calculations to show that the maximum profit based on the training data is $2.548 Million.
2. Provide calculations to show that the profit based on the training data is $0.154 Million, if 100,000 packages are mailed randomly to members.
3. Build the best Decision Tree Model using JMP (Go option) on the following conditions,
   Y = Success
   X = All predictors
   Cutoff Probability for mailing = 0.15
   a. Interpret the decision tree.
   b. Interpret R² and state how many splits you had in the model.
   c. Examine each of the split variables to explain whether they make business sense.
   d. Create the confusion matrix for the testing data set (cutoff Prob = 0.15).
   e. What is the expected profit based on the confusion matrix?
4. Build the best Logistic Regression Model (stepwise) using JMP on the following conditions,
   Y = Success
   X = All predictors
   Cutoff Probability for mailing = 0.15
   a. What is the estimated logistic regression equation?
   b. Is this logistic regression model useful? Provide statistical evidence to support your answer and, where appropriate, use a significance level of 5%.
   c. Interpret the summary values of R² and state how many variables there are in the model.
   d. Explain the coefficients of the variables and state whether they make business sense. (Hint: use the profiler.)
   e. Has the fit (R²) improved compared to the Decision Tree? Why?
   f. Create the confusion matrix for the testing data set (cutoff Prob = 0.15).
   g. What is the expected profit based on the confusion matrix?
5. Build your own best models to predict who will buy the “Celebrating American Arts” party ware using Logistic Regression and Decision Tree.
6. Find the estimated profit based on the best models. Use the Profit Calculator - Excel Sheet to calculate profit.
PART 1 Decision Tree Model(s)
Note 1: The data has been colored based on buyer and non-buyer and divided into training and testing datasets. To create the testing and training data sets from the raw data set, refer to Appendix 1.
Note 2: The process of building a good model is long; it involves the following steps,
a. Build a decision tree model on the training data set.
b. Use the decision tree to predict the propensity (probability) of a member buying the product and store it in JMP as columns (for both training & testing data sets).
c. Use the propensity to decide who will be mailed the product.
d. Switch to the testing dataset to get the confusion matrix.
e. Get the confusion matrix.
f. Use the confusion matrix to find out how many members were sent the product and how many bought the product.
g. Use the confusion matrix to get the profit estimate.
Step A: Build a decision tree model on the training data set (for the first 1000 rows of data)
1. Open the SmartPartyWare_Case2.jmp file in JMP; you should see the following file in JMP.
2. You should get a screen like this.
3. Click Analyze menu > Modeling > Partition.
4. For Y, Response, select Success; for X columns, select Gender, M, R, F, through Movie Night (select all the predictors). Click OK.
5. The following screen will show up.
6. Click on the red triangle at the upper left corner > Display Options > Show Split Prob.
7. The following screen will show up; note that the split probabilities are shown in the decision tree.
Based on the above printout, the percentage of buyers in the 1000-row training dataset is 0.112 or 11.2%.
Now we can build the decision tree using the in-built JMP algorithm or manually. If you click “Split” repeatedly you will be building the tree manually, and if you click on Go then JMP will build the decision tree for you.
8. Click on Go and you will get the following decision tree. The JMP algorithm finds the best decision tree that will do a good job on the training data set and testing data set based on the “R-square” KPI. The decision tree algorithm may not be the best choice for our objective of maximizing the profit. To maximize the profit you have to send as many products as possible (at most 100,000) and at the same time select members with high propensity.
i) Let us understand the first split of the decision tree,
The first split states that if you mail packages to members with
a. Recency of less than 16, then the propensity to buy the product is 0.1469 (14.69%)
b. Recency of more than or equal to 16, then the propensity to buy the product is 0.0443 (4.43%)
ii) Now if we split the R < 16 group further, then we get the following groups,
a. If you mail the packages to members with R < 16 and Art Party >= 1, then the propensity to buy the product is 0.2867 (28.67%)
b. If you mail the packages to members with R < 16 and Art Party < 1, then the propensity to buy the product is 0.1083 (10.83%)
Note: R < 16 is a profitable group; within the profitable group we were able to find an unprofitable segment (Art Party < 1).
iii) Now if we split the R < 16 & Art Party < 1 group further, then we get the following groups,
a. If you mail the packages to members with R < 16 and Art Party < 1 & Recency < 8, then the propensity to buy the product is 0.1693 (16.93%)
b. If you mail the packages to members with R < 16 and Art Party < 1 & Recency >= 8, then the propensity to buy the product is 0.0733 (7.33%)
Note: R < 16 & Art Party < 1 is an unprofitable subgroup; within the unprofitable subgroup we were able to find a profitable segment (Recency < 8).
iv) Let us understand the bottom blocks of the decision tree,
There are 5 groups and we can get additional information about the groups from the leaf diagram,
v) Click on the red triangle at the upper left corner > Leaf Report.
You will get the following screen,
The above Leaf Report gives you the propensity to buy for the various groups (5 groups for this decision tree). Now you have to decide which groups you will select to mail the product.
vi) We know the basic response rate is 11.2%. If you select 15% as the cutoff, then these groups will be selected for mailing,

Note: The number of mailings will be (36 + 107 + 189) = 332 per thousand members, so approximately 33.2%, which is 0.332 * 500,000 = 166,000. This is more than 100,000, so we will select 100,000 out of the 166,000 to mail.
vii) We know the basic response rate is 11.2%. If you select 19% as the cutoff, then these groups will be selected for mailing,

Note: The number of mailings will be (36 + 107) = 143 per thousand members, so approximately 14.3%, which is 0.143 * 500,000 = 71,500. This is less than 100,000.
As you can see, the higher the cutoff, the lower the number of members who will be sent the product.
Our objective is to find the maximum number of members (close to 100,000) with a high propensity to buy.
So, you can split further and find the groups with high propensity and/or play with the cutoff probabilities to select groups.
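The effect of the cutoff on mailing volume can be sketched in Python. The leaf counts (36, 107, 189) are the ones quoted in the notes above; note that 36 + 107 + 189 sums to 332 per thousand. The function name and the cap handling are illustrative assumptions, not something JMP produces.

```python
TRAINING_ROWS = 1_000
POPULATION = 500_000
MAILING_CAP = 100_000


def projected_mailing(selected_leaf_counts):
    """Scale the selected training-set leaves up to the full member database."""
    share = sum(selected_leaf_counts) / TRAINING_ROWS   # fraction of members selected
    projected = share * POPULATION                      # members the rule would select
    mailed = min(projected, MAILING_CAP)                # company policy caps the mailing
    return share, projected, mailed


# Leaves that clear each cutoff, as listed in the notes above.
for cutoff, leaves in [(0.15, [36, 107, 189]), (0.19, [36, 107])]:
    share, projected, mailed = projected_mailing(leaves)
    print(f"cutoff {cutoff:.2f}: {share:.1%} selected, "
          f"{projected:,.0f} projected, {mailed:,.0f} mailed")
```

A cutoff that projects well above 100,000 leaves the choice of which selected members to drop; one that projects below the cap leaves mailing capacity unused.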
Step B: Use the decision tree to predict the propensity (probability) of a member buying the product and store it in JMP as columns (for both training & testing data sets)
1. Click the red triangle at the upper left corner > Save Columns > Save Prediction Formula.
Note: if you do Save Predicteds, it only saves values for the first 1000 rows. If you do Save Prediction Formula, it saves values for all the rows.
2. The following columns will be created in the JMP file. (Note: they will not show up in the decision tree window.)
The main column is the Prob(Success == 1) column; it estimates the member's propensity to buy the product.
Step C: Use the propensity to decide who will be mailed the product.
1) Create a new column named DecisionBuy1 (any name is OK; I selected DecisionBuy1 to indicate that it is the first algorithm I built to predict the buyer using the Decision Tree).
Right click the empty column space > New Column > DecisionBuy1 (give a name for the new column) > click on Modeling Type and change it to Nominal > click OK.
A new column “DecisionBuy1” is created.
2) Right click the DecisionBuy1 column > Formula.
3) A new window opens,
Set the Functions as Conditional > If.
4) The following will show in the formula window,
Note: JMP restricts the formulas; you have to select the functions given in the formula window to create the desired formulas.
Our objective is to create the following formula,
If Prob(Success == 1) >= 0.15, then DecisionBuy1 = 1
Else, DecisionBuy1 = 0
This step involves a steep learning curve, so practice it.
5) Now select Prob(Success == 1) from the table columns and click it. The following window will show up.
6) The next step is to compare Prob(Success == 1) with the cutoff. Select Comparison in the function group shown above and select the a >= b option. The following will show in the formula window,
The red rectangle is the active field in the formula window; whatever you type will be entered here.
7) Type in 0.15 (the propensity you have selected); you will see the following on the formula screen,
8) Click on the “then clause” field and type in 1, then click on the “else clause” field and type in 0. You will see the following on the formula screen,
9) Now click OK and the “DecisionBuy1” column will be updated.
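For readers who find the JMP formula editor hard to follow, the same If/Else rule can be written as a short Python sketch. The function name is illustrative; the four propensities are leaf values quoted in Step A, and the rule uses the a >= b comparison selected in the formula editor.

```python
CUTOFF = 0.15  # cutoff probability for mailing


def decision_buy(prob_success, cutoff=CUTOFF):
    """Return 1 (mail the package) when the propensity reaches the cutoff, else 0."""
    return 1 if prob_success >= cutoff else 0


# Leaf propensities quoted in Step A of the decision tree walkthrough.
leaf_propensities = [0.2867, 0.1693, 0.0733, 0.0443]
flags = [decision_buy(p) for p in leaf_propensities]
print(flags)
```

Raising the cutoff to 0.19 would drop the 16.93% leaf from the mailing list, which is the trade-off discussed at the end of Step A.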
Step D: Switch to testing dataset
1) Currently the first 1000 rows form the “Training data” for analysis. We have to study the effectiveness of the Decision Tree algorithm on the “Testing data”, which is the bottom 1000 rows. We will switch the dataset as follows,
2) Highlight the rows from 1001 to 2000 by left clicking on row 1001 and scrolling to row 2000, then right click to get the menu given below, and select the Exclude/Unexclude option as shown below,
3) The highlighted rows from 1001 to 2000 will now be Unexcluded as shown below,
4) Now highlight the rows from 1 to 1000 by left clicking on row 1 and scrolling to row 1000, then right click to get the menu given below, and select the Exclude/Unexclude option as shown below,
5) The highlighted rows from 1 to 1000 will now be Excluded as shown below,
Now we have switched from the “Training Dataset” to the “Testing Dataset”.
Step E: Get Confusion matrix
1) Go to the Analyze menu and select Fit Y by X as shown below,
2) A new screen will open up as follows,
Now select the DecisionBuy1 column and click Y, Response.
Now select the Success column and click X, Factor.
Then click OK.
3) You should have the following screen,
4) You can now make the confusion matrix simpler by clicking on the red triangle and unselecting the following: Total %, Col % and Row %.
5) Now the result will look like this,
Note: you now have the confusion matrix for the testing dataset.
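The Fit Y by X table is a cross-tabulation of the actual Success column against the DecisionBuy1 column. The Python sketch below shows the same counting logic; the eight toy rows are hypothetical and are not taken from the SmartPartyWare data.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count rows for every (actual, predicted) combination of 0/1 labels."""
    counts = Counter(zip(actual, predicted))
    return {(a, p): counts.get((a, p), 0) for a in (0, 1) for p in (0, 1)}


# Hypothetical toy rows: Success is the actual outcome, DecisionBuy1 the mail/no-mail flag.
success = [1, 0, 0, 1, 0, 1, 0, 0]
decision_buy1 = [1, 0, 1, 0, 0, 1, 0, 1]

cm = confusion_matrix(success, decision_buy1)
mailed = cm[(0, 1)] + cm[(1, 1)]   # everyone flagged for mailing
buyers_mailed = cm[(1, 1)]         # flagged members who actually bought
print(cm)
print(mailed, buyers_mailed)
```

Only the DecisionBuy1 = 1 column matters for the profit estimate: its total is the number of packages mailed, and its Success = 1 cell is the number of buyers among them.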
Step F: Use the confusion matrix to find out how many members were sent the product and how many bought the product
1) From the above Confusion Matrix, we get the following information,
Based on the Testing Data Set,
Level                                                   Count                             Percentage
Members Selected for Mailing                            324                               324/1000 = 32.4%
Total Members Mailed based on Algorithm                 (324/1000) * 500,000 = 162,000    32.4%
Actual number of Members Mailed based on Restriction    100,000                           100000
Probability of Buying for the mailed members            = (66/324) = 20.3704%
Probability of Non-Buying for the mailed members        = (258/324) = 79.6296%
The Marketing department has suggested it is prudent to mail the Products to a maximum of 100,000 members so that the Product club members do not become tired of repeated marketing campaigns.
Let us calculate the Profit for the selected Decision Tree,
= 100,000 * 0.203704 * 45.5 + 100,000 * 0.796296 * (-4) = 608,333
We did not beat the benchmark of $700,000; maybe we need to split further to reduce the mailing percentage and increase the propensity to buy.
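The profit estimate follows directly from two confusion-matrix counts: the number of members flagged for mailing and the number of buyers among them. The sketch below is one way to package the calculation in Python; the function and parameter names are illustrative, and it assumes the hit rate observed in the testing rows carries over to whichever members are actually mailed.

```python
PROFIT_PER_BUYER = 45.5       # 60 - 10 - 4.5
PROFIT_PER_NON_BUYER = -4.0   # -10 - 4.5 - 4.5 + 15
POPULATION = 500_000
MAILING_CAP = 100_000


def expected_profit(mailed_in_test, buyers_in_test, test_rows=1_000):
    """Project testing-set counts to the member base and apply the mailing cap."""
    hit_rate = buyers_in_test / mailed_in_test
    projected = mailed_in_test / test_rows * POPULATION
    n_mailed = min(projected, MAILING_CAP)
    return n_mailed * (hit_rate * PROFIT_PER_BUYER
                       + (1 - hit_rate) * PROFIT_PER_NON_BUYER)


# Decision tree at the 0.15 cutoff: 324 mailed, 66 of them buyers.
tree_profit = expected_profit(324, 66)
print(round(tree_profit))
```

The same function can be reused for any other model by swapping in that model's two counts.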
PART 2 Logistic Regression Model(s)
a. Build a Logistic Regression model on the training data set.
b. Use the Logistic Regression to predict the propensity (probability) of a member buying the product and store it in JMP as columns (for both training & testing data sets). These two steps are similar to regression.
The following steps are similar to Part 1.
c. Use the propensity to decide who will be mailed the product.
d. Switch to the testing dataset to get the confusion matrix.
e. Get the confusion matrix.
f. Use the confusion matrix to find out how many members were sent the product and how many bought the product.
g. Use the confusion matrix to get the profit estimate.
1) Open the SmartPartyWare_Case2.jmp file in JMP; you should see the following file in JMP (I am starting from the beginning with the original data set).
2) Let us do the Logistic Regression analysis: go to Analyze and click on Fit Model.
3) The following screen will show up. Select “Success” in Select Columns and then click on “Y” under Pick Role Variables. Select the X variables “Gender - Movie Night” and click on Add under Construct Model Effects, then under Personality select the “Stepwise” option. You will see the following screen with the information filled in as shown below; then click Run.
4) The following screen will show up. Stepwise gives you three options: forward, backward or mixed. We will use the Forward option. If you are using the forward option then none of the parameters should be entered; if you are using the backward option then all of the parameters should be entered.
5) Click on the Go button and you will get the following window. According to “Stepwise Regression” the most important variable is “Art Party”, followed by “R (Recency)”, etc. The best model consists of 6 parameters (5 variables plus the intercept). Now click on Make Model to select the best model.
6) The following screen will show up. According to “Stepwise Regression” the most important variable is “Art Party”, followed by “R (Recency)”, etc. Now click on Run.
7) The following screen will show up; this is the multiple (logistic) regression model. I have clicked on Prediction Profiler to show the propensity relationship.
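What the Prediction Profiler plots is the logistic function applied to a linear score of the selected variables. The Python sketch below illustrates that relationship only: the intercept and coefficients are hypothetical placeholders, not the estimates JMP reports, and the score is assumed to be oriented so that larger values mean a higher probability of Success = 1.

```python
import math


def logistic_propensity(intercept, coefficients, member):
    """Probability from a linear score: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    score = intercept + sum(coefficients[name] * member[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical values for illustration only; read the real ones from the JMP output.
HYPOTHETICAL_INTERCEPT = -2.0
HYPOTHETICAL_COEFFICIENTS = {"Art Party": 0.8, "R": -0.05}

member = {"Art Party": 1, "R": 6}
p = logistic_propensity(HYPOTHETICAL_INTERCEPT, HYPOTHETICAL_COEFFICIENTS, member)

# One more Art Party purchase, with recency unchanged.
more_art = {"Art Party": 2, "R": 6}
p_more_art = logistic_propensity(HYPOTHETICAL_INTERCEPT, HYPOTHETICAL_COEFFICIENTS, more_art)
print(p, p_more_art)
```

With a positive coefficient the propensity rises as the variable increases, and with a negative coefficient it falls, which is the business-sense check asked for in the plan of action.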
Step B: Use the Logistic Regression to predict the propensity (probability) of a member buying the product and store it in JMP as columns (for both training & testing data sets)
1. Click the red triangle at the upper left corner > Save Probability Formula.
2. The following columns will be created in the JMP file. (Note: they will not show up in the logistic regression window.)
The main column is the Prob[1] column; it estimates the member's propensity to buy the product.
Step C: Use the propensity to decide who will be mailed the product. (Similar to Part 1)
1) Create a new column named LogisticBuy1 (any name is OK; I selected LogisticBuy1 to indicate that it is the first algorithm I built to predict the buyer using Logistic Regression).
2) Right click the empty column space > New Column > LogisticBuy1 (give a name for the new column) > click on Modeling Type and change it to Nominal > click OK.
A new column “LogisticBuy1” is created.
3) Right click the LogisticBuy1 column > Formula.
4) A new window opens,
Set the Functions as Conditional > If.
5) The following will show in the formula window,
Note: JMP restricts the formulas; you have to select the functions given in the formula window to create the desired formulas.
Our objective is to create the following formula,
If Prob[1] >= 0.15, then LogisticBuy1 = 1
Else, LogisticBuy1 = 0
This step involves a steep learning curve, so practice it.
6) Now select Prob[1] from the table columns and click it.
7) The next step is to compare Prob[1] with the cutoff. Select Comparison in the function group shown above and select the a >= b option. The following will show in the formula window,
The red rectangle is the active field in the formula window; whatever you type will be entered here.
8) Type in 0.15 (the propensity you have selected); you will see the following on the formula screen,
9) Click on the “then clause” field and type in 1, then click on the “else clause” field and type in 0. You will see the following on the formula screen,
10) Now click OK and the “LogisticBuy1” column will be updated.
Step D: Switch to testing dataset - Same as Part 1
Step E: Get Confusion matrix - Same as Part 1
Step F: Use the confusion matrix to find out how many members were sent the product and how many bought the product
Let us calculate the Profit for the selected Logistic Regression,
= 100,000 * (66/250) * 45.5 + 100,000 * (184/250) * (-4) = $906,800
We did beat the benchmark of $700,000. Can we do better?
Try other variables in the logistic regression or use a higher cutoff value for sending packages, maybe 0.2 instead of 0.15.
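One way to see why the logistic model clears the benchmark while the decision tree does not is to solve the profit formula for the hit rate (buyers divided by packages mailed) needed to reach a target. The Python sketch below does this algebra with the case's unit values; the function names are illustrative and the derivation is an addition, not part of the original handout.

```python
PROFIT_PER_BUYER = 45.5    # 60 - 10 - 4.5
LOSS_PER_NON_BUYER = 4.0   # 10 + 4.5 + 4.5 - 15
MAILING_CAP = 100_000
BENCHMARK = 700_000


def profit(hit_rate, n_mailed=MAILING_CAP):
    """Campaign profit when hit_rate of the mailed members keep the package."""
    return n_mailed * (hit_rate * PROFIT_PER_BUYER - (1 - hit_rate) * LOSS_PER_NON_BUYER)


def required_hit_rate(target_profit, n_mailed=MAILING_CAP):
    """Solve n * (p * 45.5 - (1 - p) * 4) = target for p."""
    return (target_profit / n_mailed + LOSS_PER_NON_BUYER) / (PROFIT_PER_BUYER + LOSS_PER_NON_BUYER)


needed = required_hit_rate(BENCHMARK)   # hit rate needed for $700,000
break_even = required_hit_rate(0)       # hit rate at which mailing stops losing money
tree_rate, logistic_rate = 66 / 324, 66 / 250

print(f"needed {needed:.4f}, break-even {break_even:.4f}")
print(f"tree {tree_rate:.4f} -> {profit(tree_rate):,.0f}")
print(f"logistic {logistic_rate:.4f} -> {profit(logistic_rate):,.0f}")
```

Under these assumptions any model whose mailed group buys at a rate above roughly 22% beats the $700,000 benchmark at 100,000 mailings, which gives a concrete target when experimenting with variables and cutoffs.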
Appendix 1
1. To open a file in JMP
“JMP” > “File” menu > “Open” > locate “JMP” and click “Open” > SmartPartyWare_Case2.jmp
2. To color the rows
Click the triangular part below the diagonal of the corner cell of the data table > “Color or Mark by Column” > select “Success” > OK
3. To create the training data set and test data set
1) Create a new column named “Random”.
Here is how to create a new column: a) Right click the empty column space; b) “New Column”; c) Give a name to the column, “Random”, and click OK.
2) Right click the “Random” column > select “Formula” > set “Functions” as “Random” > “Random Uniform” > OK
3) “Tables” menu > “Sort” > By “Random” > “Sort” > OK
Note: a new sorted file is now created. We can treat the first 1000 data points in the sorted file as the training data set.
4) Go back to the sorted file. Select rows 1001 to 2000. Right click the selected rows and choose “Exclude/Unexclude”.
5) Save the file as SmartPartyWare_Case2.jmp.
6) We now have the training and testing data sets in one single file. We can switch between the two by excluding the first 1000 rows and unexcluding the bottom 1000.
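The Appendix procedure (add a Random Uniform column, sort by it, take the first 1000 rows as training data) is a random shuffle followed by a split. A minimal Python sketch of the same idea follows; the member IDs are stand-ins for the 2,000 trial rows, and the seed is an arbitrary value chosen only so the example is repeatable.

```python
import random


def split_train_test(rows, n_train, seed=None):
    """Attach a uniform random key to each row, sort by it, and split."""
    rng = random.Random(seed)
    keyed = [(rng.random(), row) for row in rows]   # the "Random" column
    keyed.sort(key=lambda pair: pair[0])            # Tables > Sort by "Random"
    shuffled = [row for _, row in keyed]
    return shuffled[:n_train], shuffled[n_train:]   # first n_train rows = training set


member_ids = list(range(1, 2001))   # stand-in for the 2,000 trial-marketing rows
train, test = split_train_test(member_ids, 1000, seed=425)
print(len(train), len(test))
```

Keeping both halves in one list mirrors the single JMP file, where excluding one block of rows switches between the training and testing sets.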
