Slide 1
Outliers
Multicollinearity
Validation
Sample problem
Slide 2
We will use the script to test for normality, and we will substitute the
log, square root, or inverse transformation when it induces normality
in a variable that fails to satisfy the criteria for normality.
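The substitution rule can be sketched in Python. This is only an illustration, not the course script: the criterion used here (skewness and excess kurtosis both between -1 and +1), the simple moment formulas, the function names, and the sample values are all assumptions, and SPSS applies small-sample corrections that give slightly different numbers.

```python
import math

def moments(xs):
    """Return (skewness, excess kurtosis) from simple population moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def looks_normal(xs, limit=1.0):
    """Assumed criterion: |skewness| and |excess kurtosis| both within the limit."""
    skew, kurt = moments(xs)
    return abs(skew) <= limit and abs(kurt) <= limit

# Transformations tried in the order named on the slide (positive values only).
TRANSFORMS = [
    ("log", math.log),
    ("square root", math.sqrt),
    ("inverse", lambda x: 1.0 / x),
]

def pick_transformation(xs):
    """Return the first transformation that induces normality, if any."""
    if looks_normal(xs):
        return "none needed"
    for name, fn in TRANSFORMS:
        if looks_normal([fn(x) for x in xs]):
            return name
    return None  # no transformation helped; keep the original variable

# Hypothetical right-skewed values, not taken from GSS2000.sav.
sample = [1, 2, 2, 3, 3, 4, 5, 8, 15, 40]
print(pick_transformation(sample))
```

If no transformation satisfies the criterion, the sketch returns None, and the original variable would be kept.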
Slide 3
Slide 4
If we reject the null hypothesis and conclude that the variances are
heterogeneous, we substitute separate covariance matrices in the
classification and evaluate whether our classification accuracy
improves.
SW388R7
Slide 5
Slide 6
According to the SPSS Base 10.0 Applications Guide, page 259, "cases
with large values of Mahalanobis distance from their group mean can
be identified as outliers."
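To make the quoted rule concrete, here is a minimal Python sketch of the squared Mahalanobis distance of a case from its group mean. It is restricted to two predictors so that the covariance matrix can be inverted by hand, and it uses the group's own sample covariance matrix. The function names and the data are hypothetical, and the critical value is left as a parameter because the deck obtains it separately in SPSS.

```python
def mahalanobis_d2(case, group):
    """Squared Mahalanobis distance of a two-variable case from its group mean."""
    n = len(group)
    mean_x = sum(row[0] for row in group) / n
    mean_y = sum(row[1] for row in group) / n
    # Sample variances and covariance (n - 1 in the denominator).
    sxx = sum((row[0] - mean_x) ** 2 for row in group) / (n - 1)
    syy = sum((row[1] - mean_y) ** 2 for row in group) / (n - 1)
    sxy = sum((row[0] - mean_x) * (row[1] - mean_y) for row in group) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = case[0] - mean_x, case[1] - mean_y
    # d' S^-1 d, written out for the 2 x 2 inverse.
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def flag_outliers(group, critical):
    """Return the cases whose D squared exceeds the supplied critical value."""
    return [case for case in group if mahalanobis_d2(case, group) > critical]

# Hypothetical (hours worked, years of school) pairs for one group.
group = [(40, 12), (45, 16), (38, 14), (50, 12), (42, 13), (90, 20)]
for case in group:
    print(case, round(mahalanobis_d2(case, group), 3))
```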
Slide 7
Slide 8
Slide 9
Multicollinearity
Data Analysis &
Computers II
Slide 10
Validation
Slide 11
Slide 12
Problem 1
Slide 13
In the dataset GSS2000.sav, is the following statement true, false, or an incorrect application of
a statistic? Assume that there is no problem with missing data. Use a level of significance of 0.01
for evaluating assumptions. Use a level of significance of 0.05 for evaluating the statistical
relationship.
From the list of variables "number of hours worked in the past week" [hrs1], "self-employment"
[wrkslf], "highest year of school completed" [educ], and "income" [rincom98], the most useful
predictors for distinguishing among groups based on responses to "opinion about spending on
welfare" [natfare] are "number of hours worked in the past week" [hrs1], "self-employment"
[wrkslf], and "highest year of school completed" [educ]. These predictors differentiate survey
respondents who thought we spend too much money on welfare from survey respondents who
thought we spend about the right amount of money on welfare who, in turn, are differentiated
from survey respondents who thought we spend too little money on welfare.
The most important predictor of groups based on responses to opinion about spending on welfare
was number of hours worked in the past week. The second most important predictor of groups
based on responses to opinion about spending on welfare was self-employment. The third most
important predictor of groups based on responses to opinion about spending on welfare was
highest year of school completed.
Survey respondents who thought we spend about the right amount of money on welfare worked
fewer hours in the past week than survey respondents who thought we spend too much or little
money on welfare. Survey respondents who thought we spend about the right amount of money
on welfare had completed more years of school than survey respondents who thought we spend
too much or little money on welfare. Survey respondents who thought we spend too much money
on welfare were more likely to be self-employed than survey respondents who thought we spend
too little money on welfare.
1. True
2. True with caution
3. False
4. Inappropriate application of a statistic
Dissecting problem 1 - 1
Slide 14
Dissecting problem 1 - 2
Slide 15
The variable used to define groups is the dependent variable (DV): "opinion about spending
on welfare" [natfare].
When a problem asks us to identify the best or most useful predictors from a list of
independent variables, we do stepwise discriminant analysis.
Dissecting problem 1 - 3
Slide 16
Dissecting problem 1 - 4
Slide 17
In a stepwise analysis, we only interpret the independent variables that are entered in
the stepwise analysis.
The importance of individual predictors is based on order of entry in the analysis.
Dissecting problem 1 - 5
Slide 18
The specific relationships listed in the problem indicate how the independent variable
relates to groups of the dependent variable, e.g., the mean for hours worked in the past
week will be lower for respondents who think we spend the right amount of money versus
respondents who think we spend too much or too little.
LEVEL OF MEASUREMENT - 1
Slide 19
Discriminant analysis requires that the dependent variable be non-metric and the
independent variables be metric or dichotomous. "Opinion about spending on welfare"
[natfare] is an ordinal level variable, which satisfies the level of measurement
requirement.
LEVEL OF MEASUREMENT - 2
Slide 20
"Number of hours worked in the past week" [hrs1] and "highest year of school completed"
[educ] are interval level variables, which satisfies the level of measurement
requirements for discriminant analysis.
"Income" [rincom98] is an ordinal level variable. If we follow the convention of
treating ordinal level variables as metric variables, the level of measurement
requirement for discriminant analysis is satisfied. Since some data analysts do not
agree with this convention, a note of caution should be included in our interpretation.
"Self-employment" [wrkslf] is a dichotomous or dummy-coded nominal variable which may be
included in discriminant analysis.
Slide 21
Slide 22
Slide 23
Slide 24
Second, type in 3 in the Maximum text box.
Third, click on the Continue button to close the dialog box.
Slide 25
Slide 26
Slide 27
Slide 28
Slide 29
Slide 30
Slide 31
Slide 32
Slide 33
Slide 34
Click on the OK button to request the output for the discriminant analysis.
Slide 35
Slide 36
Slide 37
Classification Results
ASSUMPTION OF NORMALITY
Slide 38
Second, click on the OK button to produce the output.
Normality of independent variable: highest year of school completed
Slide 39
Descriptives
Slide 40
Slide 41
Descriptives
Slide 42
Descriptives
OUTLIERS
Slide 43
Slide 44
Slide 45
Min. D Squared
Slide 46
Slide 47
Second, we scroll down the list of SPSS functions to highlight the one we need:
IDF.CHISQ(p, df)
Third, we click on the up arrow button to move the function to the Numeric Expression
textbox.
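IDF.CHISQ(p, df) returns the chi-square value below which a proportion p of the distribution with df degrees of freedom falls. For readers working outside SPSS, the following is a minimal standard-library Python sketch of the same quantity. The names chisq_cdf and idf_chisq are hypothetical, and the choice of p = 0.99 (matching the 0.01 level used for assumptions) and of df equal to the number of predictors is an assumption about how the function is applied here, not something stated on this slide.

```python
import math

def chisq_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term, total, n = 1.0, 1.0, 1
    # Sum z^n / ((a+1)(a+2)...(a+n)) until the terms become negligible.
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a + 1.0))

def idf_chisq(p, df):
    """Inverse chi-square CDF for 0 < p < 1, found by bisection."""
    lo, hi = 0.0, 1.0
    while chisq_cdf(hi, df) < p:  # widen the bracket until it contains p
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chisq_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Assumed usage: critical D squared at the 0.01 level for three predictors.
print(round(idf_chisq(0.99, 3), 3))
```

For df = 2 the result can be checked against the closed form -2 ln(1 - p), which is about 9.21 at p = 0.99.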
Slide 48
Slide 49
Slide 50
Identifying outliers
Slide 51
Slide 52
Slide 53
Slide 54
Slide 55
Slide 56
To complete the request, we click on the OK button.
Slide 57
Slide 58
Classification Results
Slide 59
Classification Results
SAMPLE SIZE - 1
Slide 60
SAMPLE SIZE - 2
Slide 61
Slide 62
Slide 63
MULTICOLLINEARITY
Slide 64
Slide 65
Functions at Group Centroids
WELFARE   Function 1   Function 2
1           -.220         .235
2            .446        -.031
3           -.311        -.362
Unstandardized canonical discriminant functions evaluated at group means
Function 1 separates survey respondents who thought we spend about the right amount of
money on welfare (the positive value of 0.446) from survey respondents who thought we
spend too much (negative value of -0.311) or little money (negative value of -0.220) on
welfare.
Function 2 separates survey respondents who thought we spend too little money on welfare
(positive value of 0.235) from survey respondents who thought we spend too much money
(negative value of -0.362) on welfare. We ignore the second group (-0.031) in this
comparison because it was distinguished from the other two groups by function 1.
Independent variables and group membership: which predictors to interpret
Slide 66
Variables Entered/Removed
Step  Entered                            Min. D Squared  Between Groups  Exact F  df1  df2      Sig.
1     NUMBER OF HOURS WORKED LAST WEEK   .023            1 and 3         .475     1    135.000  .492
2     R SELF-EMP OR WORKS FOR SOMEBODY   .251            1 and 2         3.289    2    134.000  .040
3     HIGHEST YEAR OF SCHOOL COMPLETED   .364            1 and 3         2.433    3    133.000  .068
At each step, the variable that maximizes the Mahalanobis distance between the two closest
groups is entered.
a. Maximum number of steps is 8.
b. Maximum significance of F to enter is .05.
c.
When we use the stepwise method of variable inclusion, we limit our interpretation of
independent variable predictors to those listed as statistically significant in the table
of Variables Entered/Removed.
We will interpret the impact on membership in groups defined by the dependent variable
by the independent variables:
• number of hours worked in the past week
• self-employment
• highest year of school completed
Had we used simultaneous entry of all variables, we would not have imposed this
limitation.
Independent variables and group membership: predictor loadings on functions
Slide 67
Slide 68
Group Statistics
                                                                    Valid N (listwise)
WELFARE        Variable                           Mean   Std. Deviation  Unweighted  Weighted
1 TOO LITTLE   NUMBER OF HOURS WORKED LAST WEEK   43.96  13.240          56          56.000
               HIGHEST YEAR OF SCHOOL COMPLETED   13.73   2.401          56          56.000
               R SELF-EMP OR WORKS FOR SOMEBODY    1.93    .260          56          56.000
               RESPONDENTS INCOME                 13.70   5.034          56          56.000
2 ABOUT RIGHT  NUMBER OF HOURS WORKED LAST WEEK   37.90  13.235          50          50.000
               HIGHEST YEAR OF SCHOOL COMPLETED   14.78   2.558          50          50.000
               R SELF-EMP OR WORKS FOR SOMEBODY    1.90    .303          50          50.000
               RESPONDENTS INCOME                 14.00   5.503          50          50.000
3 TOO MUCH     NUMBER OF HOURS WORKED LAST WEEK   42.03  10.456          32          32.000
               HIGHEST YEAR OF SCHOOL COMPLETED   13.38   2.524          32          32.000
               R SELF-EMP OR WORKS FOR SOMEBODY    1.75    .440          32          32.000
               RESPONDENTS INCOME                 14.75   5.304          32          32.000
Total          NUMBER OF HOURS WORKED LAST WEEK   41.32  12.846          138         138.000
               HIGHEST YEAR OF SCHOOL COMPLETED   14.03   2.537          138         138.000
               R SELF-EMP OR WORKS
The average number of hours worked in the past week for survey respondents who thought we
spend about the right amount of money on welfare (mean=37.90) was lower than the average
number of hours worked in the past week for survey respondents who thought we spend too
little money on welfare (mean=43.96) and survey respondents who thought we spend too much
money on welfare (mean=42.03).
This supports the relationship that "survey respondents who thought we spend about the
right amount of money on welfare worked fewer hours in the past week than survey
respondents who thought we spend too little or much money on welfare."
Independent variables and group membership: predictors associated with first function - 2
Slide 69
Group Statistics (rows for HIGHEST YEAR OF SCHOOL COMPLETED)
WELFARE        Mean   Std. Deviation  Unweighted N  Weighted N
1 TOO LITTLE   13.73  2.401           56            56.000
2 ABOUT RIGHT  14.78  2.558           50            50.000
3 TOO MUCH     13.38  2.524           32            32.000
Total          14.03  2.537           138           138.000
The average highest year of school completed for survey respondents who thought we spend
about the right amount of money on welfare (mean=14.78) was higher than the average
highest year of school completed for survey respondents who thought we spend too little
money on welfare (mean=13.73) and survey respondents who thought we spend too much money
on welfare (mean=13.38).
This supports the relationship that "survey respondents who thought we spend about the
right amount of money on welfare had completed more years of school than survey
respondents who thought we spend too little or much money on welfare."
Independent variables and group membership: predictors associated with second function
Slide 70
Group Statistics (rows for R SELF-EMP OR WORKS FOR SOMEBODY)
WELFARE        Mean  Std. Deviation  Unweighted N  Weighted N
1 TOO LITTLE   1.93  .260            56            56.000
2 ABOUT RIGHT  1.90  .303            50            50.000
3 TOO MUCH     1.75  .440            32            32.000
Since self-employment is a dichotomous variable, the mean is not directly interpretable.
Its interpretation must take into account the coding by which 1 corresponds to
self-employed and 2 corresponds to working for someone else. The lower mean for survey
respondents who thought we spend too much money on welfare (mean=1.75), when compared to
the mean for survey respondents who thought we spend too little money on welfare
(mean=1.93), implies that the group contained more survey respondents who were
self-employed and fewer survey respondents who were working for someone else.
This supports the relationship that "survey respondents who thought we spend too much
money on welfare were more likely to be self-employed than survey respondents who thought
we spend too little money on welfare."
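The coding argument can be made explicit with a little arithmetic. If a proportion p of a group is coded 1 (self-employed) and the rest are coded 2, the group mean is 1p + 2(1 - p) = 2 - p, so p = 2 - mean. The short Python sketch below applies this to the self-employment means reported in the Group Statistics table. The function name is hypothetical, and because the reported means are rounded, the resulting proportions are approximate.

```python
def proportion_coded_one(mean_of_1_2_variable):
    """Share of cases coded 1 when a dichotomy is coded 1 and 2."""
    return 2.0 - mean_of_1_2_variable

# Self-employment means from the Group Statistics table.
group_means = {"too little": 1.93, "about right": 1.90, "too much": 1.75}
for label, mean in group_means.items():
    share = proportion_coded_one(mean)
    print(f"{label}: about {share:.0%} self-employed")
```

On these means, roughly 25% of the "too much" group was self-employed, compared with roughly 7% of the "too little" group.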
ASSUMPTION OF EQUAL DISPERSION FOR DEPENDENT VARIABLE GROUPS
Slide 71
Slide 72
Slide 73
Slide 74
Slide 75
Classification Results
Slide 76
The stepwise discriminant analysis included the three variables identified as the most
useful predictors.
Slide 77
We found two statistically significant discriminant functions, making it possible to
distinguish among the three groups defined by the dependent variable.
Moreover, the cross-validated classification accuracy surpassed the by chance accuracy
criteria, supporting the utility of the model.
Slide 78
The order of importance matched the order of entry in the table of "Variables
Entered/Removed."
Slide 79
We verified that each statement about the relationship between predictors and groups was
correct.
1. True
2. True with caution
3. False
4. Inappropriate application of a statistic
The answer to the question is true with caution. A caution is added because of the
inclusion of ordinal level variables. A caution is added because of a violation of
discriminant analysis assumptions.
Steps in discriminant analysis: level of measurement and initial sample size
Slide 80
Yes
Yes
Number of cases in smallest group greater than number of independent variables?
No: Inappropriate application of a statistic
Yes
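The initial sample size question above amounts to a single comparison. A minimal Python sketch follows, assuming the group sizes shown in the Group Statistics table (56, 50, and 32) and the four candidate predictors named in the problem. The function name is hypothetical.

```python
def sample_size_ok(group_sizes, n_predictors):
    """True when the smallest group has more cases than there are predictors."""
    return min(group_sizes) > n_predictors

group_sizes = [56, 50, 32]  # too little, about right, too much
n_predictors = 4            # hrs1, wrkslf, educ, rincom98
print(sample_size_ok(group_sizes, n_predictors))
```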
Steps in discriminant analysis: running the baseline model
Slide 81
Metric IVs normally distributed?
No: Try:
1. Logarithmic transformation
2. Square root transformation
3. Inverse transformation
Yes
No
Steps in discriminant analysis: picking discriminant model for interpretation
Slide 82
Yes
Cross-validated accuracy for second discriminant analysis greater than accuracy of
baseline by 2% or more?
Yes
No
Slide 83
Sufficient statistically significant functions to distinguish DV groups?
No: False
Yes
Yes
Steps in discriminant analysis: assumption of equal dispersion
Slide 84
No
Accuracy rate at least 2% higher using separate-groups covariance matrices?
No
Yes
Slide 85
No
Entry order of variables interpreted correctly?
No: False
Yes
Relationships between individual IVs and DV groups interpreted correctly?
No: False
Yes
Steps in discriminant analysis: classification accuracy
Slide 86
Cross-validated accuracy is 25% higher than proportional by chance accuracy rate?
No: False
Yes
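The classification accuracy criterion can be worked through numerically. The proportional by chance accuracy rate is the sum of the squared proportions of cases in each group, and the criterion is 1.25 times that rate. The Python sketch below assumes the group sizes from the Group Statistics table (56, 50, and 32). The function names are hypothetical, and the cross-validated accuracy is left as an input to be read from the SPSS Classification Results table.

```python
def proportional_chance_accuracy(group_sizes):
    """Sum of squared group proportions."""
    total = sum(group_sizes)
    return sum((n / total) ** 2 for n in group_sizes)

def meets_accuracy_criterion(cross_validated_accuracy, group_sizes, margin=0.25):
    """True when accuracy is at least 25% above the by chance rate."""
    criterion = (1.0 + margin) * proportional_chance_accuracy(group_sizes)
    return cross_validated_accuracy >= criterion

group_sizes = [56, 50, 32]
chance = proportional_chance_accuracy(group_sizes)
print(f"by chance rate: {chance:.3f}, criterion: {1.25 * chance:.3f}")
```

With these group sizes the by chance rate is about 35.0%, so the cross-validated accuracy would need to reach about 43.7% to satisfy the criterion.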
Steps in discriminant analysis: adding cautions to solution
Slide 87
Yes
Yes
Yes
True