Charts and examples in these slides came from Charles M. Friel, Ph.D., Criminal Justice Center, Sam Houston State University
Topics covered
Discriminant Analysis
Discriminant Analysis Technique and Its Assumptions
Discriminant Analysis Model
How Are the Parameters of the Model Estimated?
How to Predict the Group in the DV?
Test the Significance of the IVs
Interpretation of the Coefficients
Compare the Relative Impact of the Different IVs on the DV
Goodness of Fit of the Model
Dependent Variable with Three Groups
Logistic Regression
Logistic Regression
A dependency technique
The dependent variable (Y) is binary
The independent variables (Xk) can be metric and/or
nonmetric
Technique used to predict a binary dependent
variable from one or more metric and/or nonmetric
independent variables
Assumptions
More robust and requires fewer assumptions than
multiple regression and discriminant analysis
Linearity and multivariate normality among Xk are not
necessary but they will increase power
Requires a substantial number of cases relative to
the number of Xk, particularly nonmetric Xk
(suggested number is 30 to 1 or more)
No multicollinearity
No outliers
The Logistical Model
$$\text{Prob(event)} = \frac{1}{1 + e^{-Z}}$$
If bk is positive (+), e^(bk) will be greater than 1, and the odds of the
event will increase
If bk is negative (-), e^(bk) will be less than 1, and the odds of the event
will decrease
If bk is zero (0.0), e^(bk) will equal 1, and the odds of the event will
remain unchanged
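The three cases above can be checked numerically. This is a minimal sketch for a single predictor, assuming Z = a + b*X; the intercept a = -1.0, the starting value X = 2.0, and the three values of b are hypothetical and do not come from the slides. It compares e^b with the ratio of the odds at X + 1 to the odds at X.

```python
import math

def prob_event(z):
    """Logistic model: Prob(event) = 1 / (1 + e^(-Z))."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds of the event: P / (1 - P)."""
    return p / (1.0 - p)

def odds_multiplier(a, b, x):
    """Ratio of the odds at x + 1 to the odds at x, for Z = a + b*x."""
    return odds(prob_event(a + b * (x + 1))) / odds(prob_event(a + b * x))

# Hypothetical intercept and starting value; not from the slides.
a = -1.0
for b in (0.5, -0.5, 0.0):
    print(f"b = {b:+.1f}  e^b = {math.exp(b):.3f}  "
          f"odds multiplier = {odds_multiplier(a, b, x=2.0):.3f}")
```

The two printed columns should agree for each b, and the multiplier should not depend on the starting value of X.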
Questions Answered by Logistical Regression
To what extent does each predictor variable
contribute to the probability of a case being in one
group or the other of the dependent variable?
How well does the model predict or explain group
membership in the binary dependent variable?
What is the probability that a particular case is in one
group or the other of the dependent variable?
The Way to Estimate Parameters of a Logistical
Regression Equation
Maximum likelihood estimation is used to estimate
the parameters a, b1, b2, ..., bk
$$\text{Prob(event)} = \frac{1}{1 + e^{-Z}}$$
$$Z = a + b_1 X_1 + b_2 X_2 + \dots + b_k X_k = \log\left[\frac{P}{1 - P}\right]$$
Prob (event) = the probability that a case is a member of the category of the
dependent variable that is coded 1 (retained counsel)
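To make the estimation step concrete, here is a rough sketch of maximum likelihood for one predictor, using Newton-Raphson iterations on the log-likelihood. The slides do not say which numerical method is used, so Newton-Raphson is an assumption, and the data are simulated with hypothetical true values a = -0.5 and b = 1.2. The function names are illustrative only.

```python
import math
import random

def prob_event(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(a, b, xs, ys):
    """Sum over cases of y*log(P) + (1 - y)*log(1 - P)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = prob_event(a + b * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

def fit_logistic(xs, ys, n_iter=25):
    """Newton-Raphson maximization of the log-likelihood for Z = a + b*X."""
    a = b = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = prob_event(a + b * x)
            w = p * (1 - p)
            g0 += y - p          # derivative of log-likelihood w.r.t. a
            g1 += (y - p) * x    # derivative of log-likelihood w.r.t. b
            h00 += w             # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det   # solve the 2x2 Newton step
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Simulated data (hypothetical): true a = -0.5, true b = 1.2
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(500)]
ys = [1 if random.random() < prob_event(-0.5 + 1.2 * x) else 0 for x in xs]

a_hat, b_hat = fit_logistic(xs, ys)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

The estimates should land near the simulated values, and no other (a, b) pair should give a higher log-likelihood on the same data.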
Data for Analysis
$$\text{Prob(event)} = \frac{1}{1 + e^{-Z}}$$
$$\text{Prob(event)} = \frac{1}{1 + e^{-0.732}}$$
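As a worked check of the arithmetic, the snippet below evaluates the logistic expression with an exponent of -0.732. It assumes the slide's exponent is read correctly as -0.732, that is, Z = 0.732 for this case.

```python
import math

z = 0.732  # assumed from the slide: the exponent -Z equals -0.732

prob = 1.0 / (1.0 + math.exp(-z))   # Prob(event)
event_odds = prob / (1.0 - prob)    # odds of the event, equal to e^Z

print(f"Prob(event) = {prob:.3f}")
print(f"Odds        = {event_odds:.3f}")
```

Under that reading the predicted probability comes out at roughly 0.68, so the case would be assigned to the group coded 1 at a 0.5 cutoff.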
bk = logistical coefficient
SEbk = standard error of the logistical coefficient
Null hypothesis: βk in the population = 0.0
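The test statistic itself did not survive in the extracted slide text. A common choice for this null hypothesis is the Wald statistic, (bk / SEbk)^2, referred to a chi-square distribution with 1 degree of freedom; the sketch below assumes that choice. The coefficient and standard error are hypothetical values, not results from the slides.

```python
import math

def wald_test(b, se):
    """Return the Wald chi-square (1 df) and its two-sided p-value."""
    z = b / se
    wald = z * z
    # For 1 df, P(chi-square > z^2) equals P(|N(0,1)| > |z|).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return wald, p_value

# Hypothetical coefficient and standard error; not taken from the slides.
wald, p = wald_test(b=0.80, se=0.30)
print(f"Wald = {wald:.2f}, p = {p:.4f}")
```

A small p-value would lead to rejecting the null hypothesis that the population coefficient is zero.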
Expected Change in the Odds Ratio
When Xk increases by 1 unit, the odds of the event are multiplied by Exp(bk).
$$\text{Prob(event)} = \frac{1}{1 + e^{-Z}}$$
The Cox & Snell R²cs cannot equal 1.0, even if the model
fits the data perfectly.
Nagelkerke R²n is a modification of R²cs that can equal
1.0 if the model is a perfect fit.
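The ceiling on the Cox & Snell measure can be seen from the usual log-likelihood formulas, which are assumed here because the slides do not print them: R²cs = 1 - exp(2(LL0 - LLm)/N), and R²n = R²cs divided by its maximum, 1 - exp(2·LL0/N). The example uses a hypothetical sample of 100 cases split 50/50 and a perfectly fitting model (log-likelihood of 0).

```python
import math

def cox_snell_r2(ll_null, ll_model, n):
    """Cox & Snell R^2 from the null and fitted log-likelihoods."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)

def nagelkerke_r2(ll_null, ll_model, n):
    """Cox & Snell R^2 rescaled by its maximum attainable value."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)  # value when ll_model = 0
    return cox_snell_r2(ll_null, ll_model, n) / max_r2

# Hypothetical sample: 100 cases, half coded 1, so LL0 = 100 * ln(0.5).
n = 100
ll_null = n * math.log(0.5)
ll_perfect = 0.0  # a model that predicts every case with probability 1

print(f"Cox & Snell R^2 = {cox_snell_r2(ll_null, ll_perfect, n):.3f}")
print(f"Nagelkerke R^2  = {nagelkerke_r2(ll_null, ll_perfect, n):.3f}")
```

Even with a perfect fit, the Cox & Snell value stops at 0.75 for this sample, while the Nagelkerke value reaches 1.0.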
Classification Table
What percent of the cases were predicted correctly?
What percent incorrectly?
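A classification table can be built by comparing each case's observed group with the group predicted from its estimated probability. The sketch below assumes a cutoff of 0.5, which the slides do not state, and uses eight made-up cases rather than the deck's data.

```python
def classification_table(y_true, probs, cutoff=0.5):
    """Cross-tabulate observed group (0/1) against predicted group (0/1)."""
    table = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for y, p in zip(y_true, probs):
        predicted = 1 if p >= cutoff else 0
        table[(y, predicted)] += 1
    correct = table[(0, 0)] + table[(1, 1)]
    pct_correct = 100.0 * correct / len(y_true)
    return table, pct_correct

# Hypothetical observed groups and predicted probabilities.
observed = [1, 1, 1, 0, 0, 0, 1, 0]
predicted_prob = [0.9, 0.6, 0.4, 0.2, 0.7, 0.1, 0.8, 0.3]

table, pct_correct = classification_table(observed, predicted_prob)
for (obs, pred), count in sorted(table.items()):
    print(f"observed {obs}, predicted {pred}: {count}")
print(f"Percent correct:   {pct_correct:.1f}")
print(f"Percent incorrect: {100.0 - pct_correct:.1f}")
```

Keys of the returned dictionary are (observed, predicted) pairs, so the two diagonal cells hold the correctly classified cases.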
              Function
              1
DR_SCORE      .235
SER_INDX      .564
(Constant)    .706
Unstandardized coefficients
Step  Variable   Tolerance  Sig. of F to Remove  Wilks' Lambda
1     SER_INDX   1.000      .000
2     SER_INDX   .864       .000                 .983
      DR_SCORE   .864       .019                 .832
$$C_k = W_k \sqrt{\frac{\sum (X_k - \bar{X}_k)^2}{N - g}}$$
Ck = the standardized discriminant coefficient of variable k
Wk = the unstandardized discriminant coefficient of variable k
Σ(Xk - X̄k)² = SS of the predictor variable
N = total sample size
g = number of DV groups
Calculating Standardized Discriminant Coefficients
Canonical Discriminant Function Coefficients
              Function
              1
DR_SCORE      .235
SER_INDX      .564
(Constant)    .706
Unstandardized coefficients
Standardized Canonical Discriminant Function Coefficients
              Function
              1
DR_SCORE      .625
SER_INDX      1.044
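The two coefficient tables can be related with a short calculation. The sketch assumes the standardized coefficient is the unstandardized coefficient multiplied by the pooled within-group standard deviation, sqrt(SS / (N - g)); under that assumption, dividing each standardized coefficient by its unstandardized counterpart recovers the implied standard deviation. The function name and the SS, N, and g values in the last lines are hypothetical.

```python
import math

def standardized_coefficient(w_k, ss_k, n, g):
    """C_k = W_k * sqrt(SS_k / (N - g)), with SS_k the sum of squares."""
    return w_k * math.sqrt(ss_k / (n - g))

# Coefficients as printed in the two tables.
unstandardized = {"DR_SCORE": 0.235, "SER_INDX": 0.564}
standardized = {"DR_SCORE": 0.625, "SER_INDX": 1.044}

# Implied pooled within-group standard deviation = C_k / W_k.
implied_sd = {name: standardized[name] / unstandardized[name]
              for name in unstandardized}
for name, sd in implied_sd.items():
    print(f"{name}: implied pooled SD = {sd:.2f}")

# Hypothetical check of the formula: SS = 400, N = 102, g = 2 gives SD = 2.
print(standardized_coefficient(0.5, 400.0, 102, 2))
```

Because SER_INDX has the larger standardized coefficient, it would be read as having the greater relative impact on the discriminant score.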
Eigenvalues (λ)
Eigenvalues
Function  Eigenvalue  % of Variance  Cumulative %  Canonical Correlation
1         .305 (a)    100.0          100.0         .483
a. First 1 canonical discriminant functions were used in the analysis.
λ = 0.305
Test of Function(s)  Wilks' Lambda  Chi-square  df  Sig.
1                    .766           17.837      2   .000
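With a single discriminant function, the reported Wilks' Lambda and canonical correlation both follow from the eigenvalue: Lambda = 1 / (1 + λ) and canonical correlation = sqrt(λ / (1 + λ)). The short check below applies these standard identities to λ = 0.305; the chi-square is not recomputed because it depends on the sample size, which is not given in the extracted text.

```python
import math

eigenvalue = 0.305  # lambda reported in the Eigenvalues table

wilks_lambda = 1.0 / (1.0 + eigenvalue)
canonical_corr = math.sqrt(eigenvalue / (1.0 + eigenvalue))

print(f"Wilks' Lambda         = {wilks_lambda:.3f}")
print(f"Canonical correlation = {canonical_corr:.3f}")
```

Both values agree with the tables to three decimals (.766 and .483).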
              Function
              1        2
AGE           .146     .253
COUNSEL       1.946    1.682
(Constant)    2.375    6.655
              Function
              1        2
COUNSEL       .867*    .499
AGE_FIRS (a)  .291*    .067
PR_ARRST (a)  .219*    .190
AGE           .654     .757*
DR_SCORE (a)  .109     .195*
Pooled within-groups correlations between discriminating
variables and standardized canonical discriminant functions.
Variables ordered by absolute size of correlation within function.
*. Largest absolute correlation between each variable and
any discriminant function.
a. This variable not used in the analysis.