Você está na página 1de 134

A Step-by-Step Guide to Analysis and Interpretotion

Brian C. Cronk

I
ll

:,-

Choosing the Appropriafe Sfafistical lesf


Ytrh.t b Yq I*l QraJoi?

Dtfbsr h ProportdE

Mo.s Tha 1 lndFndont Varidl6

lldr Tho 2 L6Eb d li(bsxlq* Vdidb

lhre Thsn 2 Lwls of Indopnddtt Varisd

f'bre Tha 'l Indopadqrl Vdbue

'|
Ind.Fddrt Vri*b

fro.! Itn I l.doFfihnt Vdi.bb

NOTE: Relevant numbers section are givenin parentheses. instance, For '(6.9)" you refers to Section in 6.9 Chapter 6.

Notice Inc. images@ by SPSS, Inc. Screen SPSSis a registered trademark SPSS, of Usedwith permission. and MicrosoftCorporation. by or This book is not approved sponsored SPSS.

Corporation. "Pyrczak A Publisher, California Publishing" an imprintof FredPyrczak, is and the Althoughtheauthor publisher and havemade everyeffortto ensure accuracy for no responsibility completeness information of contained thisbook,we assume in Any slightsof people, herein. inaccuracies, errors, omissions, anyinconsistency or places, organizations unintentional. or are Director: Project MonicaLopez. M. MatthewGiblin,Deborah Oh, Consulting Editors: Bumrss, L. George Jose Galvan, Rasor. JackPetit.andRichard provided CherylAlcorn,Randall Bruce,KarenM. Disner, R. Editdrialassistance by Brenda Koplin,EricaSimmons, Sharon and Young. Kibler andLarryNichols. Coverdesign Robert by in Printed theUnitedStates America Malloy,Inc. of by All Publisher. rights Copyright 2008,2006,2004,2002,1999 FredPyrczak, @ by in or reserved. portionof thisbookmaybe reproduced transmitted anyform or by any No means withouttheprior writtenpermission thepublisher. of r s B N l -8 8 4 s8 5 -79 -5

Tableof Contents
Introduction theFifth Edition to What'sNew? Audience Organization SPSS Versions Availability SPSS of Conventions Screenshots Practice Exercises Acknowledgments '/ Chapter I Ll t.2 1.3 1.4 1.5 1.6 1.7 Chapter 2 Getting Started Starting SPSS Entering Data DefiningVariables Loading Saving and DataFiles Running Your FirstAnalysis Examining Printing and OutputFiles Modi$ing DataFiles Entering ModifyingData and Variables DataRepresentation and Transformation Selection Data and of Descriptive Statistics Frequency Distributions percentile and Ranks a singlevariable for Frequency Distributions percentile and Ranks Multille variables for Measures CentralTendency Measures Dispersion of and of for a Single Group Measures Central of Tendency Measures Dispersion and of for MultipleGroups Standard Scores Graphing Data Graphing Basics TheNew SPSS ChartBuilder Bar Charts, Charts, Histograms Pie and Scatterplots Advanced Charts Bar EditingSPSS Graphs
Predictionand Association PearsonCorrelation Coeffi cient SpearmanCorrelation Coeffi cient Simple Linear Regression Multiple Linear Regression
v v v v vi vi vi vi vii vii

I I I 2 5 6 8 ll ll t2

2.1 ') ')


Chapter 3

l7 t7 20
2l 24
)7

3.1 3.2 3.3 3.4 3.5


Chapter 4 4.1 4.2 4.3 4.4 4.5 4.6
Chapter5

29 29 29 3l 33 36 39 4l 4l 43 45 49

5.1 5.2 5.3 5.4

u,

Chapter 6 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 6.10 Chapter 7 7.1 7.2 7.3 7.4 7.5 7.6 Chapter 8 8.1 8.2 8.3 8.4 Appendix A Appendix B

Parametric Inferential Statistics Review BasicHypothesis of Testing Single-Sample t Test Independent-Samples I Test Paired-Samples t Test One-Way ANOVA Factorial ANOVA Repeated-Measures ANOVA Mixed-Design ANOVA Analysis Covariance of MultivariateAnalysisof Variance (MANOVA) Nonparametric Inferential Statistics Chi-Square Goodness Fit of Chi-Square Testof Independence Mann-Whitney UTest WilcoxonTest Kruskal-Wallis Test ,F/ Friedman Test TestConstruction Item-Total Analysis Cronbach's Alpha Test-Retest Reliability Criterion-Related Validiw Effect Size Practice Exercise DataSets Practice DataSet I Practice DataSet2 Practice DataSet3

53 53 )) 58 6l 65 69 72 75 79 8l 85 85 87 .90 93 95 97 99 99 100 l0l t02 103

r 09
109 ll0 ll0 lt3

Appendix C Appendix D

Glossary Sample DataFilesUsedin Text COINS.sav GRADES.sav HEIGHT.sav QUESTIONS.sav RACE.sav SAMPLE.sav SAT.sav OtherFiles

tt7 n7
l l7 l l7

n7
l18 l18 lt8 lt8 l19 t2l

AppendixE Appendix F

Information Users Earlier for Versions SPSS of of Graphing Datawith SPSS 13.0 and 14.0

tv

I Chapter

GettingStarted
1.1 StartingSPSS Section
Startup proceduresfor SPSS will differ of on slightly, depending the exact configuration the machine on which it is installed.On most computers,you can start SPSS by clicking on Start, then clicking on Programs, then on SPSS. there will be an SPSSicon On many installations, on the desktopthat you can double-click to start the program. When SPSS is started,you may be presentedwith the dialog box to the left, depending on the optionsyour systemadministratorselected for your version of the program. If you have the dialog box, click Type in data and OK, which will present blank data window.' a If you were not presented with the dialog box to the left, SPSSshould open automatically with a blank data window. The data window and the output window provide the basic interface for SPSS. A blank data window is shownbelow.

ffi$ t't****
ffi c rrnoitllttt
(- lhoari{irgqrory r,Crcrt*rsrcq.,y urhgDd.b6.Wbrd (i lpanrnaridirgdataura

f- Dml*ro* fe tf*E h lholifrra

1.2 EnteringData Section


One of the keys to success with SPSSis knowing how it stores gxr__}rry".** .:g H*n-g:fH" r! *9*_r1_*9lt and uses your data. To illustrate the rtlxlel&l *'.1 ale| lgj'Slfil Hl*lml sl el*l I rtl basics of data entry with SPSS,we 1.2.1. will useExample Example 1.2.1 A surveywas given to several students from four different classes (Tues/Thurs momings, Tues/Thurs afternoons, Mon/Wed/Fri mornings, and afternoons). Mon/Wed/Fri The students were asked
in ' Items that appearin the glossaryare presented bold. Italics areusedto indicatemenu items.

ChapterI GeningStarted

whether or not they were "morning people" and whether or not they worked. This survey also asked for their final grade in the class (100% being the highest gade possible). below: are The response sheets from two students presented

Response Sheet I ID: Dayof class: Class time: person? Are you a morning Finalgrade class: in Do youwork outside school?

4593 MWF Morning Yes X TTh X Aftemoon X No


Part{ime

8s%
Full-time

XNo Response 2 Sheet ID: Dayof class: Class time:


Are you a morningperson?

l90l x MwF X Morning


X Yes

_ -

TTh Afternoon
No

Finalgrade class: in Do vou work outside school?

83% Full-time No

X Part-time

into SPSSfor use in future Our goal is to enterthe data from the two students Any informaanalyses. first stepis to determine variables needto be entered. The that the Example participants a variable to that needs be considered. tion that can vary among is 1.2.2 liststhevariables will use. we Example 1.2.2 ID Dayof class Class time Morningperson Finalgrade Whether not the student worksoutside school or particivariables rowsrepresent and In the SPSS represent data window,columns (variables) two rows and pants. Therefore, will be creating datafile with six columns we a (students/participants). Section1.3 Defining Variables about Beforewe can enterany data,we must first entersomebasicinformation that: variable into SPSS. instance, For mustfirst be givennames each variables o beginwith a letter; o do not contain space. a

Chapter I Getting Started

Thus, the variable name "Q7" is acceptable, while the variable name "7Q" is not. Similarly, the variable name "PRE_TEST" is acceptable, but the variable name "PRE TEST" is not. Capitalizationdoes not matter, but variable namesare capitalizedin this text to make it clear when we are referring to a variable name, even if the variable name is not necessarily capitalizedin screenshots. To define a variable.click on the Variable View tab at thebottomofthema inscre e n .Th is wills h o wy o u t h e V a ri-@ able View window. To return to the Data View window. click on the Data View tab.
Fb m u9* o*.*Trqll

,'lul*lEll r"l*l ulhl **l {,lrl EiliEltfil_sJ elrl

t!-.G q".E u?x !!p_Ip

.lt-*l*lr"$,c"x.l

From the Variable View screen,SPSSallows you to createand edit all of the variables in your data file. Each column represents some property of a variable, and each row represents variable. All variablesmust be given a name. To do that, click on the first a empty cell in the Name column and type a valid SPSSvariable name. The program will then fill in default valuesfor most of the other properties. One usefulfunctionof SPSSis the ability to definevariableand value labels.Variable labels allow you to associate descriptionwith each variable.Thesedescriptionscan a describethe variablesthemselves the valuesof the variables. or Value labelsallow you to associate description a with eachvalue of a variable.For example,for most procedures, SPSSrequiresnumerical values.Thus, for data such as the day of the class (i.e., Mon/Wed/Fri and Tues/Thurs),we need to first code the values as numbers.We can assignthe number I to Mon/Wed/Friand the number2to Tues/Thurs. To help us keep track of the numberswe have assigned the values,we use value labels. to To assignvalue labels,click in the cell you want to assignvaluesto in the Values column. This will bring up a small gray button (seeanow, below at left). Click on that button to bring up the Value Labelsdialog box. --When you enter a iv*rl** v& 12 -Jil value label, you must click L.b.f ll6rhl| s*l !!+ | Add after eachentry. This will
mOVe the value and itS associated label into the bottom section of J::::*.-,.Tl

the window. When all labels have been added, click OK to return to the Variable View window.

ChapterI Gening Starred

In additionto namingand labelingthe variable, you havethe option of definingthe variabletype. To do so, simply click on the Type, Width,or Decimals columns in the Variable View window. The default value is a numeric field that is eight digits wide with two decimalplacesdisplayed. your dataare more than eight digits to the left of the decimal If place,they will be displayedin scientificnotation(e.g.,the number2,000,000,000 be will displayed 2.00E+09).'SPSSmaintains as accuracy beyondtwo decimalplaces, all outbut put will be roundedto two decimal placesunlessotherwiseindicatedin the Decimals column. In our example, will be usingnumericvariables we with all of the defaultvalues. Practice Exercise Createa data file for the six variablesand two samplestudents presented Examin ple 1.2.1.Name your variables: DAY, TIME, MORNING, GRADE, and WORK. You ID, should code DAY as I : Mon/Wed/Fri,2 = Tues/Thurs.Code TIME as I : morning, 2 : afternoon. CodeMORNING as 0 = No, I : Yes. Code WORK as 0: No, I : Part-Time, 2 : Full-Time. Be sure you enter value labels for the different variables.Note that because value labelsare not appropriatefor ID and GRADE, theseare not coded.When done,your Variable View window should look like the screenshot below: J -rtrr,d

{!,_q, ru.g

r9"o' ldq${:ilpt"?- "*- .? --

Click on the Data View tab to open the data-entryscreen.Enter data horizontally, beginningwith the first student'sID number.Enter the code for eachvariable in the appropriate column; to enterthe GRADE variablevalue,enterthe student'sclassgrade.

F.E*UaUar Qgtr Irrddn

An hna gnphr Ufrrs

Hhdow E*

ot otttr *lgl dJl ulFId't lr*lEl&lr6lgl slglglqjglej blbl Al 'r i-l-Etetmt olrt'

Depending upon your versionof SPSS, may be displayed 2.08 + 009. it as

Chapter I Getting Started

Theprevious data window canbe changed look instead the screenshot to like beclickingon the Value Labelsicon(seeanow).In this case, cellsdisplay the l*.bv value labelsratherthanthe corresponding codes. datais entered this mode,it is not necesIf in saryto entercodes, clickingthebuttonwhichappears eachcell asthe as in cell is selected will present drop-down of thepredefined a list lablis. You may useeithermethod, according to yourpreference.

: [[o|vrwl

vrkQ!9try /

*rn*to*u*J----.-tu l{il Ddr lrm#m

)1

Instead of clicking the Value Labels icon, you may optionallytogglebetween views by clicking valueLaiels under the Viewmenu. Section1.4 Loading and SavingData Files Onceyou haveentered your data,you will need to saveit with a uniquenamefor later useso that you canretrieve whennecessary. it Loadingand savingSpSSdatafiles worksin the sameway as most Windows-based software. Underthe File menu, there are Open, Save, and Save As commands. SPSSdata files have a .,.sav" extension. which is addedby defaultto the end of the filename. ThistellsWindows thefile is anSpSS that datafile. Save Your Data

Anrfrrr Cr6l!

r ti
il
'i. I

,t1

,t

r lii

|:

H-

When you save your data file (by clicking File, then clicking Save or SaveAs to specify a unique name),pay specialattentionto where you saveit. trrtist systemsdefault to the.location<c:\programfiles\spss>.You will probably want to saveyour data on a floppy disk, cD-R, or removableUSB drive so that you can taie the file withvou.

Load YourData
When you load your data (by clicking File, then clicking Open, thenData, or by clicking the open file folder icon), you get a similar window. This window lists all files with the ".sav" extension.If you have trouble locating your saved file, make sure you are D{l lriifqffi looking in the right directory.

ChapterI GeningStarted

Practice Exercise To be surethat you havemastered saving and openingdata files, nameyour sample datafile "SAMPLE" andsaveit to a removable
FilE Edt $ew Data Transform Annhze @al

storagemedium. Once it is saved,SPSSwill display the name of the file at the top of the data window. It is wise to save your work frequently,in caseof computer crashes. Note that filenamesmay be upper- or lowercase. this text, uppercase usedfor clarity. is In After you have saved your data, exit SPSS (by clicking File, then Exit). Restart SPSS and load your databy selecting "SAMPLE.sav"file you just created. the

Section 1.5 RunningYour First Analysis


Any time you open a data window, you can mn any of the analyses To available. get started,we will calculatethe students'averagegrade.(With only two students, you can easily checkyour answerby hand,but imaginea data file with 10,000studentrecords.) The majority of the available statistical tests are under the Analyze menu. This menu displaysall the optionsavailablefor your versionof the SPSSprogram (the menusin this book were created may haveslightly with SPSSStudent Version 15.0).Otherversions different setsof options.

j
File Edlt Vbw Data Transformnnafzc Gretrs UUtias gdFrdov*Help I
(llnl El tlorl rl

rttrtJJ

r ktlml lff
Cottpsr Milns )

al ol lVisible:6

GanoralHnnar ) i f&dd ,) Corr*lrtr ) Re$$r$on 't901.00 ir l. Classfy ,. ), . OdrRrdrrtMr ) Scab Norparimetrlc lcrtt l ) Tirna 5arl6t ) Q.rlty Corfrd I tj\g*r*qgudrr,*ts"ussRff(trve,.,

Eipbrc,,. CrogstSr,.. Rdio,., P-P flok,., Q Phs.,,

To calculatea mean (average), are asking the computerto summarizeour data we set. Therefore,we run the commandby clicking Analyze, then Descriptive Statistics, then Descriptives. This brings up the Descriptives dialog box. Note that the left side of the box containsa OAY list of all the variablesin our data file. On the right .Sr ql is an area labeled Variable(s), where we can 3s,l specifythe variableswe would like to use in this particular analysis. A*r*.. I f- 9mloddrov*p*vri*lq

Chapter I Getting Started

l:rt.Ij We want to compute the mean for the variable called GRADE. Thus, we need to select the variable name in the left window (by clicking ;F* | on it). To transfer it to the right window, click on -t:g.J the right arrow between the two windows. The -!tJ arrow always points to the window opposite the f- Smdadr{rdvdarvai& PR:l highlighted item and can be used to transfer selectedvariablesin either direction.Note that double-clickingon the variable name will also transfer the variable to the opposite window. Standard Windows conventions of "Shift" clicking or "Ctrl" clicking to selectmultiplevariables be usedas well. can When we click on the OK button, the analysiswill be conducted,and we will be readyto examineour output.

in

Section 1.6 Examiningand PrintingOutput Files


After an analysis is performed, the output is placed in the output window, and the output window becomesthe active window. If this is the first analysis you have conducted since starting SPSS, then a new output window will be created.If you have run previous analyses saved and them,your

outputis added theendof yourprevious to output. To switchbackandforthbetween data window andtheoutput window,select the thedesired windowfrom the Window menubar(seearrow,below). The output window is split into two sections. left section an outlineof the The is (SPSS "outlineview").Theright section theoutputitself. output refersto this asthe is
H. Ee lbw A*t lra'dorm Craphr ,Ufr!3 Uhdo'N Udp

I -d * lnl-Xj irllliliirrillliirrr

slsl*gl l elsl*letssJ rl sl#_# + l *l + l - l &hj


ornt El Pccc**tvs* r'fi Trb

-qg*g!r*!e!|ro_

:l ql el , * Descrlptlves
f]aiagarll

6r**

lS Adi\D*ard ffi Dcscrtfhcsdkdics

l: \ lrrs

datc\ra&ple.lav

lle*crhlurr
N ufinuc
I

Sl.*liilca Xsrn

Mlnlmum Hadmum 83.00 85.00

81,0000

Std. Dwiation 1.41421

valldN (|lstrylsa)

ffiffi?iffi

rr---*.*

r*4

The sectionon the left of the output window provides an outline of the entire output window. All of the analyses listed in the order in which they were conducted. are Note that this outline can be used to quickly locate a sectionof the output. Simply click on the sectionyou would like to see,and the right window will jump to the appropriate place.

ChapterI GeningStarted

Clicking on a statistical procedure all also selects of the output for that command. By pressingtheDeletekey, that outputcan be deletedfrom the output window. This is a quick way to be sure that the output window containsonly the desiredoutput. Output can also be selectedand pastedinto a word processorby clicking Edit, then Copy Objecls to copy the output.You can then switch to your word processor and click Edit, thenPaste. To print your output, simply click File, then Print, or click on the printer icon on the toolbar. You will have the option of printing all of your output or just the currently selected section. Be careful when printing! Each time you mn a command, the output is addedto the end of your previous output. Thus, you could be printing a very large output file containinginformation you may not want or need. One way to ensurethat your output window containsonly the resultsof the current commandis to createa new output window just before running the command.To do this, click File, then New, then Outpul. All your subsequent commandswill go into your new output window. Practice Exercise Load the sampledata file you createdearlier (SAMPLE.sav). Run the Descriptives command for the variable GRADE and print the output. Your output should look like the exampleon page7. Next, selectthe data window and print it.

Section 1.7 ModifyingData Files


Once you have createda data file, it is really quite simple to add additional cases (rows/participants) additional (columns). or Example1.7.1. variables Consider

Example 1.7.1 Two morestudents provide with surveys. you Theirinformation is:
Response Sheet3 ID: Day of class: Classtime: Are you a morningperson? Final gradein class: Do you work outsideschool?

8734
MWF Morning Yes

X TTh Afternoon XNo Part-time

80%
Full-time No

Response Sheet4 ID: Day of class: Classtime: Are you a morning person? Final gradein class: Do you work outsideschool?

1909 X MWF X Morning X Yes 73% Full+ime No

TTH Afternoon No
X Part-time

Chapter I Getting Started

To add thesedata, simply place two additionalrows in the Data View window (after loading your sampledata).Notice that as new participantsare added,the row numbers becomebold. when done,the screenshouldlook like the screenshot here.
j '.., .l lrrl vl

nh E*__$*'_P$f_I'Sgr

&1{1zc Omhr t$*ues $ilndonHug_

Tffiffi
1

2 3
)

ID DAY TIME MORNING GRADE WORK 4593.00 Tueffhu aternoon No 85.00 No 1gnl.B0 MonMed/ m0rnrng Yes 83.00 Part-Time 8734.00 Tue/Thu mornrng No 80,00 No 1909.00MonAfVed/ mornrng Yeg 73.00 Part-Time

var

\^

. mfuUiewffi

rb$ Vbw /

l{ l
Procus*r ready ls 15P55 I
'.-

I
-

rll
-,,,---Jd*

,4

New variables can also be added.For example, if the first two participantswere given specialtraining on time management, the two new participantswere not, the data and file can be changedto reflect this additionalinformation.The new variable could be called TRAINING (whether or not the participant receivedtraining), and it would be coded so that 0 : No and I : Yes. Thus, the first two participantswould be assigneda "1" and the Iast two participantsa "0." To do this, switch to the Variable View window, then add the TRAINING variable to the bottom of the list. Then switch back to the Data View window to updatethe data.
f+rilf,t tt Inl vl

Sa E& Uew Qpta lransform &rpFzc gaphs Lffitcs t/itFdd^,SE__--

14:TRAINING l0
1
I

i lvGbtr of
1r

3 4
I
(l)

t0 NAY TIME MORNING GRADE woRK I mruruwe 4593.0f1 Tueffhu aterncon No 85.0u No l Yes yes 1901.OCI ManA/Ved/ m0rnrng Yes ffi.0n iiart?mel8734"00 Tueffhu momtng No 80.n0 Noi No 1909.00 onrlVed/ morning M Yes 73.00 Part-Time No ' I
.r View { Vari$c Vlew

isPssW

l-.1 =J "

rll'l
,i

Adding data and adding variablesare just logical extensionsof the procedures we used to originally createthe data file. Save this new data file. We will be using it again later in the book.

Chapter I Getting Started

Practice Exercise Follow the exampleabove(whereTRAINING is the new variable).Make the modifications your SAMPLE.sav file andsaveit. to data

l0

2 Chapter

EnteringandModifying Data
In Chapter 1, we learnedhow to createa simple data file, save it, perform a basic analysis,and examine the output. In this section,we will go into more detail about variablesand data.

2.1 Variablesand DataRepresentation Section


as In SPSS,variablesare represented columns in the data file. Participantsare represented rows. Thus, if we collect 4 piecesof information from 100 participants,we will as have a data file with 4 columnsand 100 rows. Measurement Scales scales:nominal, ordinal, interval, and ratio. There are four types of measurement for scalewill determinewhich statisticaltechniqueis appropriate a While the measurement not discriminate.Thus, we start this section with given set of data, SPSSgenerally does this warning: If you ask it to, SPSSmay conduct an analysis that is not appropriatefor scales,consultyour your data.For a more completedescriptionof thesefour measurement statistics text or the glossaryin Appendix C. Newer versionsof SPSS allow you to indicate which types of Measure data you have when you define your variable. You do this using the Measurecolumn. You can indicateNominal, Ordinal, or Scale(SPSS @Nv doesnot distinguishbetweeninterval and ratio scales). f $cale .sriltr Look at the sampledata file we createdin Chapter l. We calcur Nominal on lated a mean for the variable GRADE. GRADE was measured a ra-

(assuming the distribution statistic that summary tio scale,andthe mean is an acceptable is normal). a We could have had SPSScalculate mean for the variableTIME insteadof here. gettheoutputpresented GRADE.If we did, we would that TIME was 1.25.Remember TIME was that The outputindicates the average
coded as an ordinal variable ( I = m or ni ngcla ss,2-a fte rnoon g trlllql eilr $lclass).Thus, the mean is not an *lq]eH"N-ql*l appropriatestatisticfor an ordinal :* Sl astts .l.:D gtb it scale,but SPSScalculated any:$sh way. The importance of considering the type of data cannot be overemphasized. Just because ht6x0tMn SPSS will compute a statistic for you doesnot mean that you should
.6M6.ffi

$arlrba"t S#(|
LS a 2.qg
Lt@

ll

Chapter Enteringand Modifying Data 2

use it. Later in the text, when specificstatistical procedures discussed, conditions the are underwhich they are appropriate will be addressed. Missing Data you may have Often, participantsdo not provide completedata.For some students, a pretestscore but not a posttestscore.Perhapsone studentleft one question blank on a survey,or perhapsshe did not stateher age.Missing data can weakenany analysis.Often, can eliminatea suba singlemissingquestion ject from all analyses. ql total If you have missing data in your data 2.00 2.Bn 4.00 set, leave that cell blank. In the example to 3.00 1.0 0 4.00 the left, the fourth subject did not complete Question2. Note that the total score(which is 4.00 3.00 7.00 calculatedfrom both questions)is also blank becauseof the missing data for Question 2. 2.00 missing data in the data SPSS represents 1 .0 0 2.UB 3.00 window with a period (althoughyou should not enter a period-just leave it blank).

Section 2.2 Transformation and Selection Data of


We oftenhavemoredatain a data thanwe wantto includein a specific analyfile sis.For example, sample our two datafile contains datafrom four participants, of whom received special trainingand two of whom did not. If we wantedto conductan analysis usingonly the two participants did not receive training, wouldneedto specify we who the theappropriate subset. Selectinga Subset
F|! Ed vl6{ , O*. lr{lrfum An*/& e+hr O*fFV{ldrr PrS!tU6.,. CoptO.tafropc,tir3,.. (

t'llitl&JE

il :id

l,j.l,/r,:irrlrr! lif l ll:L*s,,.

Hh.o*rr,., Dsfti fi*blc Rc*pon$ 5ct5,,,

We can use the SelectCasescommandto specify a subset of our data. The Select Cases command is located under the Data menu. When you select this command,the dialog box below will appear.
q*d-:-"-- "-"""-*--*--**-""*-^*l
6 Alce a llgdinlctidod ,r l

ConyD*S

r irCmu*dcaa

i*np* |
sd.rt Csat
{^ lccdotincoarrpr

i :

;.,* |

-:--J llaffrvci*lc

You can specify which cases(participants) you want to select by using the selection criteria, which appearon the right side of the Select Cases dialog box.
C6ttSldrDonoan!.ffi

l0&t

foKl

aar I c-"rl x* |

t2

Chapter2 Enteringand Modifying Data

By default,All caseswill be selected. The most common way to selecta subsetis to click If condition is satisfied, then click on the button labeledfi This will bring up a new dialog box that allowsyou to indicate which cases you would like to use. You can enter the logic used to select the subset in the upper section. If the logical statement is true for a given case, then that case will be U;J;J:.1-glL1 E{''di',*tI , 'J-e.l-,'J lJ.!J-El [aasi"-Eo,t----i selected.If the logical statement is false. that case will not be 0 U IAFTAN(r"nasl ,Jl _!JlJ selected.For example, you can sl"J=tx -s*t"lBi!?Blt1trb :r select all casesthat were coded ?Ais"I c'-t I Ht I as Mon/Wed/Fri by entering the formula DAY = I in the upperright part of the window. If DAY is l, thenthe statement will be true,and SPSSwill select the case.If DAY is anything other than l, the statement will be false, and the casewill not be selected.Once you have enteredthe logical statement,click Continue to return to the SelectCases dialog box. Then,click OK to returnto the data window. After you have selected cases, data window will changeslightly. the the The casesthat were not selected will be markedwith a diagonalline through the casenumber. For example,for our sampledata, the first and third casesare not selected. only the secondand fourth cases selected this subset. are for

= ilqex4q lffiIl,?,l*;*"'

, I I
I

i{

,1

'l

EffEN'EEEgl''EEE'o

,.,:r.

rt

lnl vl

1
I

!k_l**

-#gdd.i.&l Flib'/,-<
2-'4 4

1 :
TIME

*-

MORNING ERADE WORK TRAINING 4533.m Tueffhui affsrnoon No ffi.m Na Yes Not Selected 1901.m MpnMed/i mornino Yss 83,U1Fad-Jime Yes Splacled 6h4lto TuElThu morning No m.m . No No Not Selected ieifrfft MonA/Ved/1 morning Yes ru.mPart-Time No
. -..- ^,-.-.*.*..,-J.- . - .-..,..".*-....- ':

ID

'l

'l

v,itayss 7 !LJ\ii. vbryJ

fsPssProcaesaFrcady

. *-J I

*] ,1,

An additional variable will also be createdin your data file. The new variable is called FILTER_$ and indicateswhethera casewas selected not. or If we calculatea mean Descripthre Stailstics GRADE using the subset we just selected,we will receive std. N Minimum Maximum M e a n Deviation the output at right. Notice that UKAUE 2 73.00 83.00 78.0000 7 . 0 7 1 1 we now have a mean of 78.00 Va lid N 2 IliclwisP'l with a samplesize (M) of 2 insteadof 4.

l3

2 Chapter Enteringand Modifying Data

Be careful when you selectsubsets.The subsetremains in ffict until you run the the because commandagain and selectall cases.You can tell if you have a subsetselected you examine bottom of the data window will indicatethat a filter is on. In addition, when your output, N will be less than the total number of recordsin your data set if a subsetis The diagonal lines through some caseswill also be evident when a subsetis seselected. as lected.Be careful not to saveyour data file with a subsetselected, this can causeconsiderableconfusionlater. Computing a New Variable SPSScan also be used to compute a new variable or nh E* vir$, D.tr T|{dorm manipulateyour existing vari*lslel EJ-rlrj -lgltj{l -|tlf,l a*intt m eltj I ables. To illustrate this, we will create a new data file. This file will contain data for four participants and three variables (Ql, Q2, and Q3). The variables represent the points each number of l* ,---- LHJ participant received on three {#i#ffirtr!;errtt*; different questions.Now enter the data shown on the screen to the right. When done, save this data file as "QUESTIONS.sav." will be usingit againin laterchapters. We
I TrnnsformAnalyze Graphs Utilities Whds

into Rersde 5ame Variable*,,, into Varlables. Racodo Dffferant ,, Ar*omSic Rarode,,. Vlsual 8inrfrg,..

Now you will calculatethe total score for eachsubject.We could do this manually,but if the data file were large, or if there were a lot of questions,this would take a long time. It is more efficient (and more accurate) to have SPSS compute the totals for you. To do this, click Transform and then click Compute Variable.

After clicking the Compute Variable command, we get the dialog box at right. The blank field marked Target Variable is where we enter the name of the new variable we want to create. In this example, we are creating a variable called TOTAL, so type the word "total." Notice that there is an equals sign between the Target Variable blank and the Numeric Expression blank. These two blank areas are the

rrw I i+t*... gl w ca

*l

U $J-:iidijl lij -!CJ:l Jslcl rtg-sJ ll;s


rt rt rl ,_g-.|J :3 lll--g'L'"J

lllmr*dCof

0rr/ti*

til

&fntndi) Oldio.
E${t iil

:J

, rr | {q*orfmsrccucrsdqf

n* ri c* rl

"*l

l4

Chapter2 Enteringand Modifying Data

iii:Hffiliji:.:

. i . i> t

ii"alCt

i-Jr:J::i l:j -:15 tJ -tJ-il is:Jlll

i-3J:J JJJI --q-|J --q*J

J- l

|f- - | ldindm.!&dioncqdinl

tsil

nact I

c:nt I

x*

two sides of an equation that SPSS will calculate. For example, total : ql + q2 + q3 is the equation that is entered in the sample presentedhere (screenshot left). Note that it is posat sible to create any equation here simply by using the number and operational keypad at the bottom of the dialog box. When we click OK, a SPSSwill create new variablecalled TOTAL and make it equalto the sum of the threequestions. Save your data file again so that the new variablewill be available for future sessions.

t::,,

Eile gdit SEw Qata lransform $nalyza 9aphs [tilities Add'gns Sindow Help

- ltrl-Xl

3.n0
4.00

3.0n

4,n0

10.00

31 2.oo ..........;. l
41
I

2.oo
3 0 01

1.0 0 1

.:1

il I i , l, l\qg,t_y!"*_i Variabte ViewJ

l-'r --i-----i

lit W*;

r ljl

Recodinga Variable-Dffirent

Variable
F{| [dt
---.:1.l{ rr 'r trl

SPSS can create a new variable based upon data from another variable. Say we want to split our participants the basisof on their total score.We want to create a variablecalled GROUP, which is coded I if the total score is low (lessthan or equal to 8) or 2 if the total score is high (9 or larger). To do this, we click Transform, then Recodeinto Dffirent Variables.

!la{

Data j Trrx&tm

Analrra

,-.lu l,rll r-al +.


I I

vdiouc',' conp$o
Cd.nVail'r*dnCasas.,,
-o ..* * ^ c- - u - r - c

4.00

Art(tn*Rrcodr... U*dFhn|ro,,.

2.00

i.m
Racodr 0ffrror* Yal lrto
S*a *rd llm tllhsd,,, Oc!t6 I}F sairs.., Rid&c l4sitE V*s.,. Rrdon iMbar G.rs*trr,,.

l5

Ch a p te 2 En te ri n g n d Mo d i fy i n gD ata r a

This will bring up the Recode into Different Variables dialog box shown here. Transfer the variable TOTAL to the middle blank. Type "group" in the Name field under Output Variable.Click Change,and the middle blank will show that TOTAL is becoming GROUP.as shownbelow.
NtnHbvli|bL-lo|rnrV*#r

til

ladtnl c rlccdm confbil

-'tt"

rygJ **l-H+ |

To help keep track of variablesthat have been recoded, it's a good idea to open the t *.!*lr Variable View and enter "Recoded" in the Label rr&*ri*i*t column in the TOTAL row. This is especially ;rlnr-":-'-'1** I useful with large datasets which may include i T I r nryrOr:frr**"L many recodedvariables. ,fClick Old andNew Values.This will bring i c nq.,saa*ld6lefl; Fup the Recodedialog box. In this example,we have entered a 9 in the Range, value through HIGHEST field and a 2 in the Value field under New Value.When we click Add, the blank on the ,.F--*-_-_-_____ right displaysthe recodingformula. Now enter an :I "a *r***o lrt*cn*r I I nni. 8 on the left in the Range, LOWEST through rT..".''..."...value blank and a I in the Value field under New I ir:L-_t' Value. Click Add, then Continue.Click OK. You l6F i4i'|(tthah* ; T &lrYdd.r*t will be redirectedto the data window. A new I " n *'L,*l'||.r.$, : r----**-: variable (GROUP) will have been added and ; r {:ei.* gf-ll codedas I or 2, based TOTAL. on
lirli
i " r, . ! * r h ^ . , " , r '-

li-l a ri r, r it : . ' I

y..,t

$q I

'*J

*u"'." Flc Ed Yl.ly Drt! Tr{lform {*!c ce|6.,||tf^,!!!ry I+

-ltrlIl

l6

3 Chapter

Statistics Descriptive
in with ln Chapter we discussed manyof the options available SPSS dealing for 2, our waysto summarize data.The procedures usedto describe data.Now we will discuss andsummarize arecalleddescriptive statistics. data Section3.1 FrequencyDistributions and PercentileRanks for a SingleVariable Description produces The Frequencies frequency varicommand distributions the specified for percentages, percentages, ables. The outputincludes number occurrences, of valid the and percentages percentages. valid percentages the cumulative and cumulative The comprise as only thedatathatarenot designated missing. is TheFrequencies command usefulfor describing samples wherethe meanis not (e.g., useful It nominalor ordinal scales). is alsouseful a method getting feelof as of the just a meanandstandarddeviationandcan your data.It provides moreinformation than outliers.A special be usefulin determining feature the command of skewandidentifying percentile is its abilityto determine ranks. Assumptions percentages percentiles valid only for datathat are measured are Cumulative and on at leastan ordinal scale.Because outputcontains line for eachvalueof a varithe one with a relatively able,thiscommand worksbeston variables smallnumber values. of Drawing Conclusions produces TheFrequencies outputthatindicates of command boththenumber cases in the sample a particular with that value.Thus,convalueandthe percentage cases of of the in clusions drawnshouldrelateonly to describing numbers percentages cases the or of perconclusions If regarding cumulative the sample. the dataareat leastordinalin nature, percentiles be drawn. centage and/or can .SPSS Data Format frequency distributions The SPSS requires only onevariable, datafile for obtaining andthatvariable beof anytype. can

tt

C h a p te 3 D e s c ri p ti v e ti s ti c s r Sta

Creating a Frequency Distribution To run the Frequer?cies command, click Analyze, then Descriptive Statistics, I slsl}sl &99rv i:rl.&{l&l&l @ then Frequencies.(This example uses the i 1 r mpg } Disbtlvlr... cdrFrb'l{tirE i 18 N Erpbr,.. CARS.savdata file that comeswith SPSS. croac*a,.. Rrno,., It is typically located at <C:\Program F.Pt'lok,., aaPUs,., Fi les\SPS S\Cars. sav>. ) This will bring up the main dialog r5117gl box. Transfer the variable for which you would like a frequency distributioninto the Variable(s)blank to the right. Be surethat xl the Display frequency tables option is per Miles Gallon r q ! l checked.Click OK to receiveyour output. lm /Erqlr,onispUcamr / Hurepowor [horc Note that the dialog boxes in dv*,id"w"bir 1|ut jq? | newer versions of SPSS show both the d t!rc toAceileistc left .f"tq I type of variable(the icon immediately dr',Ccxr*yol Orbin [c He_l of the variable name) and the variable . labels if they are entered. Thus, the l7 Oisgay hequercy tder variable YEAR shows up in the dialog box as Model Year(modulo I0). sr**i,1..1 I rry*,:. I f*:.,. Output for a Frequency Distribution The output consists two sections. The first sectionindicatesthe numberof reof Recordswith a blank scoreare listedas cords with valid data for eachvariableselected. Notice that the variablelabel 406 records. missing.In this example, datafile contained the (modulo100). is ModelYear The second section of the output contains a statistics cumulative frequency distribution for each variable Wse lected.A tth e topof t h e s e c t io n , t h e v a ria b le la b e lis * oo? y.1"1 | of | given. The output iiself consists five columns.The first | Missing t I I I Jolumnliststhi valuesof the variablein sortedorder.There is a row for eachvalue of your variable, and additional rows are added at the bottom for the Total and Missing data. The secondcolumn gives the frequency of eachvalue,includingmissingvalues. The third columngivesthe percentage of all records (including records with missingdata) for eachvalue.The fourth column, labeled Valid Percenl, gives the percentage records(without including of records with missing data) for each value. If there were any missingvalues, these values would be larger than the valuesin column threebecause total the
Modol Yo.r (modulo 100) Cumulativs
Pcr ce n l

Valid P6rcnl I 4

vatE

34
72 73 74 75 76 77

28 40 27 30 34 28 29 29 30 31 405 1 406

I 4 7.1 6.9 9.9 6.7 8.4 6.9 8.9 7.1 7.1 7.4 7.6 99.8 100.0

7.2
6.9 9.9 6.7 7.4 8.4 6.9 8.9 f.2

E4 15.6 22.5 32.3 39.0 46.4 54.8 61.7 70.6 77.8 84.9 92.3 | 00.0

79 80 81 82 Total Missing 0 (Missing) Total

7.2
7.4 7.7

100 .0

r8

Chapter3 DescriptiveStatistics

numberof recordswould have beenreducedby the numberof recordswith missing values. The final column gives cumulative percentages. Cumulative percentages indicate the percentageof records with a score equal to or smaller than the current value. Thus, the last value is always 100%. These values are equivalentto percentile ranks for the values listed.

D eterm ining P ercenti Ie Ranl<s


The Frequencies command can be used to provide a number of descriptive Sfndr*Pi*rcsnr statistics,as well as a variety of percentile SHslsp{rierltuso values (including quartiles, cut points, and /v***v*$t*(ttu YI /lino toaccrbrar to !rydI scorescorresponding a specificpercentile $1C**{ry o{Origr[c rank). |*"1 To obtain either the descriptiveor lT Oirpbar frcqlcreyttblce percentile functions of the Frequencies command,click the Statistics button at the frfix*... I bottomof the main dialog box. Note that the Central Tendencyand Dispersior sectionsof this box are useful for calculatingvalues, suchas the Median or Mode. which cannotbe calculatedwith the Descriptiyescommand (seeSection 3.3).
:,,.
Mla pa Galmlm3

tril

This brings up the Frequencies:


Statisticsdialog box. Check any additional desiredstatisticby clicking on the blank next to it. For percentiles, enter the desired percentile rank in the blank to the right of the Percentile(s)label. Then, click Add to add it to the list of percentiles requested. Once you have selected your requiredstatistics, all click Continue to return to the main dialog box. Click OK.

xl

PscdibV.lrr

tr Ourilr3 F nrs**rtd!i*

I ,crnqo,p, i

c{q I *g"d I
Hdo I

f- Vdrixtgor0mi&ohlr Oi$.r$pn " l* SUaa** n v$*$i I* nmgc f Mi*n n |- Hrrdilrtl mcur l- S"E. 0idthfim' t- ghsrurt T Kutd*b

Statistics
ModelYear (modulo100 N Vatid Missing Percentiles 25 50 75 80

Outputfor Percentile Ranl<s


405 1 73.00 76.00 79.00 80.00

The Statistics dialog box adds on to the previous output from the Frequenciescommand.The new sectionof the output is shown at left. The output containsa row for each piece of information you requested. the example above, we In checkedQuartiles and asked for the 80th percentile. Thus, the output contains rows for the 25th, 50th. 75th,and 80th percentiles.

l9

Ch a p re ,1 D e s c ri p tie S ta ti s ti cs r r

Practice Exercise UsingPractice DataSet I in Appendix create frequency B, a distribution tablefor the mathematics skills scores. Determine mathematics the skills scoreat which the 60th percentile lies. section 3.2 FrequencyDistributions and percentileRanks for Multiple Variables Description The Crosslabs command produces frequency distributions multiple variables. for The outputincludes number occurrences eachcombination levelJof eachvarithe of of of able.It is possible havethecommand to givepercentages anyor all variables. for The Crosslabs command usefulfor describing is samples wherethe mean is not (e'g.,nominalor ordinal scales). is alsouseful a method getting feelfor useful It as for a your data. Assumptions Because outputcontains row or columnfor eachvalue of a variable.this the a command worksbeston variables a relatively with smallnumber values. of ,SPSS Data Format The SPSS data file for the Crosstabs I lnalyzc Orphn Ut||Uot command requires two or more variables. Those RcF*r ) variables be of anytype. can Runningthe CrosstabsCommand
(orprycrllcEnr ) G*ncral llrgar Flodcl ) , ) ;ilffi; ) chrfy DttaRcd.Etbn ) ) scah

This example uses SAMpLE.sav the data


file, which you createdin Chapter l. To run the procedure, ctick Analyze, then Descriptive Statistics,then Crosstabs.This will bring up ttt. main Crosstabs dialog box, below.
Ror{.} ftr;;ho..;lm&!

i,

r---r

l rJ I

TK I '-l ryq I

The dialog box initially lists all variables on the left and contains two blanks labeled Row(s) and Column(s). Enter one variable (TRAINING) in the Row(s) box. Enter the second (WORK) in the Column(s) box. To analyze more than two variables, you would enter the third, fourth, etc., in the unlabeled area(ust under theLayer indicator).

20

Chapter3 DescriptiveStatistics

percentages and other information to be generatedfor eachcombinationof values.Click Cells, and you will get the box at right. For the example presented here, check Row, Column, and Total percentages.Then click Continue. This will return you to the Crosstabs dialog box. Click OK to run the analvsis.
TRAINING' WURK oss|nl)tilntlo|l Cr
NO

The Cells button allows you to specify W:


t C",ti* |

t*"1

,"1

''Pdrl.!p. ;F Bu F corm

r-Bait*" : ,l- U]dadr&ad if- sragatrd

"1'"1 -_rry-ys___ . -

TRAINING Yes

No

Total

Count % within TRAININO % within woRK % ofTolal Count % within TRAINING % within WORK % ofTolal Count % within TRA|NtNo % wilhin WORK % ofTolal

WORK Parl-Time I 1 50.0% 50.0% 50. 0% 50.0% 25. 0% 25.0% 1 1 50.0% 50.0% 50. 0% 50.0% 25.0% 25.0%
a

Tolal

50.0% 100. 0% 50. 0%

5 00 % 100.0% 50.0%

100.0% 50.0% 50.0% ? 1000% 50.0% 50.0% 4 r 00.0% 100.0% 100.0%

Interpreting Cros stabs Output The output consistsof a contingencytable. Each level of WORK is given a column. Each level of TRAINING is given a row. In addition, a row is added for total, and a column is added for total.

Each cell contains numberof participants the (e.g.,one participant received no trainingand doesnot work; two participants received training,regardless employno of mentstatus). The percentages eachcell are also shown.Row percentages up to 100% for add horizontally. percentages up to 100%vertically. Column add Forexample, all the indiof vidualswho had no training 50oh not work and 50o% did workedpart-time (using the"o/o , within TRAINING" row). Of the individuals who did not work, 50o/ohad trainingand no 50%hadtraining(usingthe"o/o within work" row). Practice Exercise Using Practice Data Set I in AppendixB, createa contingency table using the Crosstabs command. Determine numberof participants eachcombination the the in of variables SEX andMARITAL. Whatpercentage participants married? of is Whatpercentageof participants maleandmarried? is Section3.3 Measuresof Central Tendencyand Measuresof Dispersion for a SingleGroup Description Measures centraltendency valuesthat represent typical memberof the of are a sample population. threeprimarytypesarethemean,median,andmode.Measures or The of dispersion you the variabilityof your scores. primarytypesare the range and tell The the standarddeviation.Together, measure central a of tendency a measure disperand of sionprovidea greatdealof information about entiredataset. the
2l

Chapter Descriptive ,l Statistics

We will discuss thesemeasures central of tendency and measures dispersion the conof in text of the Descriplives command. Note that many of these statisticscan also be calculated with several other commands (e.g., the Frequenciesor Compare Means commandsare required to compute the mode or median-the Statisticsoption for the Frequenciescommandis shownhere).

iffi{ltl*::l'.,xl
Fac*Vd*c-----:":'-'-"-" "-

|7 Arruer |* O*pai*furjF tqLteiotpr F rac$*['*

lcer**r**nc*r1 !*{* f- rlm Cr* -:.-i , f u"g.t


: T Modt

| |

r.-I
k'I +l

16I

:-^ t5m

':'I

l- Vdsm$apn&bcirr i0hx*ioo*".'*-' lf Sld.dr',iitbnl* lli*nn f.H**ntrn ]fV"iro f.5.t.ncr l fnxrgo oidrlatin -- -r5tcffi:

; f Kutu{b

Assumptions Each measureof central tendencyand measureof dispersionhas different assumptions associated with it. The mean is the most powerful measure centraltendency,and it of has the most assumptions. example,to calculatea mean, the data must be measured on For an interval or ratio scale.In addition,the distribution shouldbe normally distributedor, at least,not highly skewed.The median requiresat leastordinal data.Because median the indicatesonly the middle score (when scoresare arrangedin order), there are no assumptions about the shapeof the distribution.The mode is the weakestmeasureof central tendency.There are no assumptions the mode. for The standard deviation is the most powerful measure dispersion,but it, too, has of severalrequirements.It is a mathematicaltransformationof the variance (the standard deviation is the square the root of the variance).Thus, if one is appropriate, other is also. The standard deviation requiresdata measured an interval or ratio scale.In addition, on the distribution should be normal. The range is the weakestmeasureof dispersion.To calculatea range, the variablemust be at leastordinal. For nominal scale data,the entire frequencydistribution shouldbe presented a measure dispersion. as of Drawing Conclusions A measureof central tendencyshouldbe accompanied a measureof dispersion, by Thus, when reporting a mean, you should also report a standard deviation. When presentinga median, you shouldalso statethe range or interquartilerange. Data Format .SPSS Only one variable is required.

22

Chapter3 DescriptiveStatistics

Running the Command The Descriptives command will be the lA-dy* ct.dn Ltffibc command you will most likely use for obtaining measures centraltendencyand measures disperof of sion. This example uses the SAMPLE.sav data file ! D GonardtFra*!@ we have used in the previouschapters. ' cond*s )
. : Rolrar*n classfy 0tdRedrctitrt

,t X
dlt

) ) )

da.v d** ?n-"* ?,r,qx


/t**ts

S&r dr.d!r&!d

Y*rcr ri vdi.bb

To run the command, click Analyze, then Descriptive Statistics,then Descriptives. n".d I This will bring up the main dialog box for the cr*l I Descriptives command. Any variables you f,"PI would like information about can be placed in opdqr".. the right blank by double-clickingthem or by I selectingthem, then clicking on the anow.

qil

ltl
{l

'!t
,l

By default, you will receive the N (number of cases/participants), the minimum value, the maximum value, the mean, and the standard deviation. Note that some of thesemay not be appropriatefor the type of data you have selected. If you would like to changethe default statistics that are given, click Options in the main dialog box. You will be given the Optionsdialog box presented here.

F Morr F su aa**n f u"orl- nrrcr

l- Slm

r@t

,l t
il

F, Mi*ilm F7 Maiilrn I- S.r.npur

qq..' I ,|' ?bl

'i
I

:
I

"i

* I otlnyotdq:
I {f V;i*hlC

",

r I *car*re mar i r Dccemdnnmre

I r lpr,*an

;i I ;

Reading the Output The output for the Descriptivescommand is quite straightforward.Each type of output requested presented a column, and eachvariable is given in a row. The output is in presented here is for the sampledata file. It showsthat we have one variable (GRADE) and that we obtainedthe N, minimum, maximum, mean, and standard deviation for this variable.
DescriptiveStatistics
N

graoe ValidN (listwise)

4 4

Minimum Maximum 73.00 85.00

Mean Std.Deviation 80.2500 5.25198

23

Chapter Descriptive 3 Statistics

Practice Exercise Using PracticeData Set I in AppendixB, obtainthe descriptive statisticsfor the ageof the participants. What is the mean?The median?The mode?What is the standard deviation?Minimum? Maximum?The range? Section 3.4 Measures of Central Tendency and Measures of Dispersion for Multiple Groups Description The measures centraltendencydiscussed of earlierare often needednot only for the entiredataset,but also for several subsets. One way to obtainthesevaluesfor subsets would be to use the data-selection in techniques discussed Chapter2 and apply the Descriptivescommandto each subset. easierway to perform this task is to use the Means An command.The Means commandis designed provide descriptive statisticsfor subsets to ofyour data. Assumptions The assumptions discussed the sectionon Measures Central Tendencyand in of Measures Dispersion a SingleGroup(Section of for 3.3) alsoapply to multiplegroups. Drawing Conclusions A measure centraltendency of shouldbe accompanied a measure dispersion. by of Thus,when giving a mean, you shouldalsoreporta standard deviation. When presenting a median,you shouldalsostatethe range or interquartile range. SPSSData Format Two variablesin the SPSSdata file are required.One represents dependent the variable and will be the variablefor which you receivethe descriptive statistics.The other is the independentvariable and will be usedin creating subsets. the Note that while SPSScalls this variablean independentvariable, it may not meet the strict criteriathat definea true independentvariable (e.g.,treatment manipulation). Thus, someSPSSprocedures referto it as the grouping variable.

Runningthe Command This example uses the SAMPLE.sav data file you created in Chapterl. The Means commandis run by clicking Analyze, then Compare Means, thenMeans. This will bring up the main dialog box for the Means command. Place the selected variablein the blank field labeled Dependent List.
! RnalyzeGraphs Utilities WindowHetp F r.l nsportt ' Descriptive Statistirs ) General Linear ftladel F ) Csrrelata I Regression (fassify F

gt5il | Firulb
Ona-Sarnplefeft. f Independent-Samdes T Te T Test,,, Falred-SarnplEs Ons-Way*|iJOVA,,,

-l

' . '

1A LA

Chapter3 DescriptiveStatistics

Placethe grouping variable in the box labeledIndependent List.In this example, through use of the SAMPLE.sav data file, measures central tendencyand measures of of dispersion for the variable GRADE will be given for each level of the variable MORNING.

:I tu I

List Dependant

,du**
/wqrk tr"ining

arv

r T ril

ryl
Ii ..!'l?It.
Heset I Cancel I

ll".i I
lLayarl al1*-

I :'r:rrt|
I i

lr-,
r

Independent Li$: r:ffi

l"rpI

l*i.rl

tffi,
I

L-:By default,the mean, numberof cases, and standard deviation are given. If you would like additional measures,click Options and you will be presented with the dialog box at right. You can opt to include any numberof measures. Reading the Output The output for the Means command is split into two sections.The first section,called a case processingsummary, gives informationabout the data used. In our sample data file, there are four students(cases),all of whom were included in the analysis.

fd Stdirtlx:
Medan

mil'*-*Ca*o* of
lltlur$u

Doviaion

5tt Minirn"rm Manimlrn Rarqo Fist La{ VsianNc Std.Enor Kutosis d Skemrcro Sld.Eno ol $karm HanorricMcan :J

ml
I
c""d I

lStardad

Lqlry-l

x,r I

Summary GaseProcessing Cases Excluded N Percent

lncluded N

Total N

grade - morning

Percent 100.0%

.OYo

4 |

Percent 100.0%

25

Chapter3 Descriptive Statistics

The secondsectionof the outRepott put is the report from the Means comGRADE mand. MOR N IN G Mean N Std. Deviation This report lists the name of NO 82.5000 2 3.53553 the dependent variable at the top Yes 78.0000 7.07107 (GRADE). Every level of the indeTotal 80.2500 4 5.25198 pendent variable (MORNING) is shown in a row in the table. In this example,the levels are 0 and l, labeledNo and Yes. Note that if a variable is labeled,the labelswill be usedinsteadof the raw values. The summary statisticsgiven in the report correspondto the data, where the level of the independentvariable is equalto the row heading(e.g.,No, Yes). Thus,two participantswere includedin eachrow. An additional row is added, named Total. That row contains the combined data. and the valuesare the sameas they would be if we had run theDescriptiyescommandfor the variableGRADE. Extension to More Than One Independent Variable If you have more than one independent variable, SPSS can break down the output even further. Rather than adding more variables to the Independent List section of the dialog box, you need to add them in a different layer. Note that SPSS indicates with which layer you are working.
id

If you click Next, you will be presentedwith Layer 2 of 2, and you can select a secondindependent variable (e.g., TRAINING). Now, when you run the command(by clicking On, you will be given summary statistics for the variable GRADE by each level of MORNING and TRAINING. Your output will look like the output at right. You now have two main sections(No and yes), along with the Total. Now, however, each main section is broken down into subsections(No, yes, and Total). The variable you used in Level I (MORNING) is the first one listed, and it defines the main sections.The variable you had in Level 2 (TRAINING) is listed secReport ORADE
MORNING TRAINING No Yes NO Total Yes Yes NO Total

Total

Yes
NO

Total

Mean 85.0000 80.0000 82.5000 83.0000 73.0000 78.0000 84.0000 76.5000 80.2500

Std.Deviation

1
1
I

3. 53553

1 1
1

a z

7.07107 1. 41421 4.54575 5. ?5198

26

Chapter3 DescriptiveStatistics

who were not morningpeopleand ond. Thus,the first row represents thoseparticipants participants werenot morningpeowho who received row training. The second represents ple anddid not receive who The third row represents total for all participants the training. werenot morningpeople. are Noticethat standarddeviations not givenfor all of the rows.This is because per One with usingmanysubsets thereis only oneparticipant cell in thisexample. problem is that it increases numberof participants required obtainmeaningful to results. a See the for research design textor your instructor moredetails. Practice Exercise B, the UsingPractice DataSet I in Appendix compute meanandstandarddeviaWhat is the average of the marriedpartion of agesfor eachvalueof maritalstatus. age participants? ticipants? singleparticipants? divorced The The Section3.5 Standard Scores Description of scales transforming scores the Standard scores allow thecomparison different by into a commonscale. standard scoreis the z-score. z-score based A is The mostcommon (e.g., meanof 0 anda standarddeviation l). A a on a standardnormal distribution of z-score, of or therefore, represents number standarddeviations the above belowthe mean (e.9., z-score -1.5 represents a I deviations of a score % standard belowthemean). Assumptions Z-scores based the standardnormal distribution. Therefore, distribuare the on tionsthatareconverted z-scores be be to should normallydistributed, thescales and should eitherinterval or ratio. Drawing Conclusions of of above Conclusions based z-scores on consist thenumber standarddeviations 85 or belowthe mean.For example, student scores on a mathematics that a examin a class hasa meanof 70 andstandarddeviationof 5. The student's scoreis l5 pointsabove test z-score 3 because scored standard is the class mean(85 - 70: l5). The student's she 3 :3). If the same deviations student scores on a reading 90 exam, above mean(15 + 5 the of will with a class meanof 80 anda standarddeviation 10,thez-score be I .0 because she is one standard deviation abovethe mean. Thus,even thoughher raw scorewas higheron thereading to test,sheactually betterin relation otherstudents the mathedid on matics because z-score higheron thattest. was test her .SPSS Data Format That variable in mustbe Calculating z-scores requires only a singlevariable SPSS. numerical.

27

Chapter3 Descriptive Statistics

Running the Command

Computing z-scores is a component of the Descriptives command. To access it, click Analyze, then Descriptive Statistics, then Descriptives. This example uses the sample data file (SAMPLE.sav) created in Chapters 1 and 2.

This will bring up the standard dialog box for the Descriptives command. Notice the checkbox in the bottom-left corner labeled Save standardized values as variables. Check this box and move the variable GRADE into the right-hand blank. Then click OK to complete the analysis. You will be presented with the standard output from the Descriptives command. Notice that the z-scores are not listed. They were inserted into the data window as a new variable. Switch to the Data View window and examine your data file. Notice that a new variable, called ZGRADE, has been added. When you asked SPSS to save standardized values, it created a new variable with the same name as your old variable preceded by a Z. The z-score is computed for each case and placed in the new variable.
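For reference, the equivalent syntax is sketched below (not shown in the text; it assumes the variable name GRADE from SAMPLE.sav):

* The SAVE subcommand writes the z-scores back to the data file as a new variable (ZGRADE).
DESCRIPTIVES VARIABLES=grade
  /SAVE.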


Reading the Output

After you conducted your analysis, the new variable was created. You can perform any number of subsequent analyses on the new variable.

Practice Exercise

Using Practice Data Set 2 in Appendix B, determine the z-score that corresponds to each employee's salary. Determine the mean z-scores for salaries of male employees and female employees. Determine the mean z-score for salaries of the total sample.


Chapter 4

Graphing Data
Section 4.1 Graphing Basics

In addition to the frequency distributions, the measures of central tendency, and the measures of dispersion discussed in Chapter 3, graphing is a useful way to summarize, organize, and reduce your data. It has been said that a picture is worth a thousand words. In the case of complicated data sets, this is certainly true.

With Version 15.0 of SPSS, it is now possible to make publication-quality graphs using only SPSS. One important advantage of using SPSS to create your graphs instead of other software (e.g., Excel or SigmaPlot) is that the data have already been entered. Thus, duplication is eliminated, and the chance of making a transcription error is reduced.

Section 4.2 The New SPSS Chart Builder

Data Set


For the graphing examples, we will use a new set of data. Enter the data below by defining the three subject variables in the Variable View window: HEIGHT (in inches), WEIGHT (in pounds), and SEX (1 = male, 2 = female). When you create the variables, designate HEIGHT and WEIGHT as Scale measures and SEX as a Nominal measure (in the far-right column of the Variable View). Switch to the Data View to enter the data values for the 16 participants. Now use the Save As command to save the file, naming it HEIGHT.sav.

HEIGHT   WEIGHT   SEX
66       150      1
69       155      1
73       160      1
72       160      1
68       150      1
63       140      1
74       165      1
70       150      1
66       110      2
64       100      2
60       95       2
67       110      2
64       105      2
63       100      2
67       110      2
65       105      2



Make sure you have entered the data correctly by calculating a mean for each of the three variables (click Analyze, then Descriptive Statistics, then Descriptives). Compare your results with those in the table below.
Descriptive Statistics

                     N    Minimum   Maximum   Mean       Std. Deviation
HEIGHT               16   60.00     74.00     66.9375    3.90677
WEIGHT               16   95.00     165.00    129.0625   26.3451
SEX                  16   1.00      2.00      1.5000     .5164
Valid N (listwise)   16

Chart Builder Basics

Make sure that the HEIGHT.sav data file you created above is open. In order to use the Chart Builder, you must have a data file open.

New with Version 15.0 of SPSS is the Chart Builder command. This command is accessed using Graphs, then Chart Builder in the submenu. This is a very versatile new command that can make graphs of excellent quality.

When you first run the Chart Builder command, you will probably be presented with the following dialog box:


This dialog box is asking you to ensure that your variables are properly defined. Refer to Sections 1.3 and 2.1 if you had difficulty defining the variables used in creating the data set for this example, or to refresh your knowledge of this topic. Click OK.

The Chart Builder allows you to make any kind of graph that is normally used in publication or presentation, and much of it is beyond the scope of this text. This text, however, will go over the basics of the Chart Builder so that you can understand its mechanics.

On the left side of the Chart Builder window are the four main tabs that let you control the graphs you are making. The first one is the Gallery tab. The Gallery tab allows you to choose the basic format of your graph.


For example, the screenshot here shows the different kinds of bar charts that the Chart Builder can create. After you have selected the basic form of graph that you want using the Gallery tab, you simply drag the image from the bottom right of the window up to the main window at the top (where it reads, "Drag a Gallery chart here to use it as your starting point").

Alternatively, you can use the Basic Elements tab to drag a coordinate system (labeled Choose Axes) to the top window, then drag variables and elements into the window.

The other tabs (Groups/Point ID and Titles/Footnotes) can be used for adding other standard elements to your graphs.

The examples in this text will cover some of the basic types of graphs you can make with the Chart Builder. After a little experimentation on your own, once you have mastered the examples in the chapter, you will soon gain a full understanding of the Chart Builder.

Section 4.3 Bar Charts, Pie Charts, and Histograms

Description

Bar charts, pie charts, and histograms represent the number of times each score occurs through the varying heights of bars or sizes of pie pieces. They are graphical representations of the frequency distributions discussed in Chapter 3.

Drawing Conclusions

The Frequencies command produces output that indicates both the number of cases in the sample with a particular value and the percentage of cases with that value. Thus, conclusions drawn should relate only to describing the numbers or percentages for the sample. If the data are at least ordinal in nature, conclusions regarding the cumulative percentages and/or percentiles can also be drawn.

SPSS Data Format

You need only one variable to use this command.


Running the Command

The Frequencies command will produce graphical frequency distributions. Click Analyze, then Descriptive Statistics, then Frequencies. You will be presented with the main dialog box for the Frequencies command, where you can enter the variables for which you would like to create graphs or charts. (See Chapter 3 for other options with this command.)

Click the Charts button at the bottom to produce frequency distributions. This will give you the Charts dialog box. There are three types of charts available with this command: Bar charts, Pie charts, and Histograms. For each type, the Y axis can be either a frequency count or a percentage (selected with the Chart Values option).

You will receive the charts for any variables selected in the main Frequencies command dialog box.
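A rough syntax equivalent of these chart requests is sketched below as an assumption rather than a paste from the dialog; the variable name HEIGHT from HEIGHT.sav is used, and one chart type is requested per command:

* Histogram with a superimposed normal curve.
FREQUENCIES VARIABLES=height
  /HISTOGRAM=NORMAL.
* Bar chart of frequency counts.
FREQUENCIES VARIABLES=height
  /BARCHART=FREQ.
* Pie chart of percentages.
FREQUENCIES VARIABLES=height
  /PIECHART=PERCENT.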

Output

The bar chart consists of a Y axis, representing the frequency, and an X axis, representing each score. Note that the only values represented on the X axis are those values with nonzero frequencies (61, 62, and 71 are not represented).


The pie chart shows the percentage of the whole that is represented by each value.

The Histogram command creates a grouped frequency distribution. The range of scores is split into evenly spaced groups. The midpoint of each group is plotted on the X axis, and the Y axis represents the number of scores for each group. If you select With Normal Curve, a normal curve will be superimposed over the distribution. This is very useful in determining if the distribution you have is approximately normal. The distribution represented here is clearly not normal due to the asymmetry of the values.

Practice Exercise


Use Practice Data Set 1 in Appendix B. After you have entered the data, construct a histogram that represents the mathematics skills scores and displays a normal curve, and a bar chart that represents the frequencies for the variable AGE.

Section 4.4 Scatterplots

Description

Scatterplots (also called scattergrams or scatter diagrams) display two values for each case with a mark on the graph. The X axis represents the value for one variable. The Y axis represents the value for the second variable.


Assumptions

Both variables should be interval or ratio scales. If nominal or ordinal data are used, be cautious about your interpretation of the scattergram.

SPSS Data Format

You need two variables to perform this command.
Running the Command

You can produce scatterplots by clicking Graphs, then Chart Builder. (Note: You can also use the Legacy Dialogs. For this method, please see Appendix F.)

In the Gallery, choose Scatter/Dot. Then drag the Simple Scatter icon (top left) up to the main chart area as shown in the screenshot at left. Disregard the Element Properties window that pops up by choosing Close.

Next, drag the HEIGHT variable to the X-Axis area, and the WEIGHT variable to the Y-Axis area (remember that standard graphing conventions indicate that dependent variables should be Y and independent variables should be X; this would mean that we are trying to predict weights from heights). At this point, your screen should look like the example below. Note that your actual data are not shown, just a set of dummy values.
Click OK. You should get your new graph (next page) as Output.



Output

The output will consist of a mark for each participant at the appropriate X and Y levels.

Adding a Third Variable

Even though the scatterplot is a two-dimensional graph, it can plot a third variable. To make it do so, select the Groups/Point ID tab in the Chart Builder. Click the Grouping/stacking variable option. Again, disregard the Element Properties window that pops up. Next, drag the variable SEX into the upper-right corner where it indicates Set Color. When this is done, your screen should look like the image at right. If you are not able to drag the variable SEX, it may be because it is not identified as nominal or ordinal in the Variable View window.

Click OK to have SPSS produce the graph.



Now our output will have two different sets of marks. One set represents the male participants, and the second set represents the female participants. These two sets will appear in two different colors on your screen. You can use the SPSS chart editor (see Section 4.6) to make them different shapes, as shown in the example below.
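For reference, the legacy Graphs syntax below produces comparable scatterplots (the Chart Builder generates different, more verbose syntax of its own); this is only a sketch using the HEIGHT.sav variable names:

* Simple scatterplot of weight against height.
GRAPH
  /SCATTERPLOT(BIVAR)=height WITH weight.
* The same plot with the marks grouped by sex.
GRAPH
  /SCATTERPLOT(BIVAR)=height WITH weight BY sex.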

Practice Exercise

Use Practice Data Set 2 in Appendix B. Construct a scatterplot to examine the relationship between SALARY and EDUCATION.

Section 4.5 Advanced Bar Charts

Description

Bar charts can be produced with the Frequencies command (see Section 4.3). Sometimes, however, we are interested in a bar chart where the Y axis is not a frequency. To produce such a chart, we need to use the Bar charts command.
SPSS Data Format

You need at least two variables to perform this command. There are two basic kinds of bar charts: those for between-subjects designs and those for repeated-measures designs. Use the between-subjects method if one variable is the independent variable and the other is the dependent variable. Use the repeated-measures method if you have a dependent variable for each value of the independent variable (e.g., you would have three


variables for a design with three values of the independent variable). This normally occurs when you make multiple observations over time.

This example uses the GRADES.sav data file, which will be created in Chapter 6. Please see Section 6.4 for the data if you would like to follow along.

Running the Command

Open the Chart Builder by clicking Graphs, then Chart Builder. In the Gallery tab, select Bar. If you had only one independent variable, you would select the Simple Bar chart example (top left corner). If you have more than one independent variable (as in this example), select the Clustered Bar Chart example from the middle of the top row.

Drag the example to the top working area. Once you do, the working area should look like the screenshot below. (Note that you will need to open the data file you would like to graph in order to run this command.)


If you are using a repeated-measures design like our example here using GRADES.sav from Chapter 6 (three different variables representing the Y values that we want), you need to select all three variables (you can <Ctrl>-click them to select multiple variables) and then drag all three variable names to the Y-Axis area. When you do, you will be given the warning message above. Click OK.


Next, you will need to drag the INSTRUCT variable to the top right in the Cluster: set color area (see screenshot at left).

Note: The Chart Builder pays attention to the types of variables that you ask it to graph. If you are getting error messages or unusual results, be sure that your categorical variables are properly designated as Nominal in the Variable View tab (see Chapter 2, Section 2.1).

n"i*

l.

crot J

rr!

Output
Practice Exercise

Use Practice Data Set 1 in Appendix B. Construct a clustered bar graph examining the relationship between MATHEMATICS SKILLS scores (as the dependent variable) and MARITAL STATUS and SEX (as independent variables). Make sure you classify both SEX and MARITAL STATUS as nominal variables.


Section 4.6 Editing SPSS Graphs


Whatever command you use to create your graph, you will probably want to do some editing to make it appear exactly as you want it to look. In SPSS, you do this in much the same way that you edit graphs in other software programs (e.g., Excel). After your graph is made, in the output window, select your graph (this will create handles around the outside of the entire object) and right-click. Then, click SPSS Chart Object, and click Open. Alternatively, you can double-click on the graph to open it for editing.

When you open the graph, the Chart Editor window and the corresponding Properties window will appear.
Once the Chart Editor is open, you can easily edit each element of the graph. To select an element, just click on the relevant spot on the graph. For example, if you have added a title to your graph ("Histogram" in the example that follows), you may select the element representing the title of the graph by clicking anywhere on the title.


Once you have selected an element, you can tell whether the correct element is selected because it will have handles around it. If the item you have selected is a text element (e.g., the title of the graph), a cursor will be present and you can edit the text as you would in a word processing program. If you would like to change another attribute of the element (e.g., the color or font size), use the Properties box. (Text properties are shown below.) With a little practice, you can make excellent graphs using SPSS. Once your graph is formatted the way you want it, simply select File, Save, then Close.


Chapter 5

Prediction and Association


Section 5.1 Pearson Correlation Coefficient

Description

The Pearson correlation coefficient (sometimes called the Pearson product-moment correlation coefficient or simply the Pearson r) determines the strength of the linear relationship between two variables.

Assumptions

Both variables should be measured on interval or ratio scales (or one may be a dichotomous nominal variable). If a relationship exists between them, that relationship should be linear. Because the Pearson correlation coefficient is computed with z-scores, both variables should also be normally distributed. If your data do not meet these assumptions, consider using the Spearman rho correlation coefficient instead.

SPSS Data Format

Two variables are required in your SPSS data file. Each subject must have data for both variables.
Running the Command

To select the Pearson correlation coefficient, click Analyze, then Correlate, then Bivariate (bivariate refers to two variables). This will bring up the Bivariate Correlations dialog box. This example uses the HEIGHT.sav data file entered at the start of Chapter 4.
Move at least two variables from the box at left into the box at right by using the transfer arrow (or by double-clicking each variable). Make sure that a check is in the Pearson box under Correlation Coefficients. It is acceptable to move more than two variables.
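The dialog settings just described correspond roughly to the following syntax sketch (not shown in the text; the HEIGHT.sav variable names are assumed):

* TWOTAIL requests two-tailed significance values for each pair of variables.
CORRELATIONS
  /VARIABLES=height weight sex
  /PRINT=TWOTAIL NOSIG
  /MISSING=PAIRWISE.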


For our example, we will move all three variables over and click OK.

Reading the Output


The output consists of a correlation matrix. Every variable you entered in the command is represented as both a row and a column. We entered three variables in our command. Therefore, we have a 3 x 3 table. There are also three rows in each cell: the correlation, the significance level, and the N. If a correlation is significant at less than the .05 level, a single * will appear next to the correlation. If it is significant at the .01 level or lower, ** will appear next to the correlation. For example, the correlation in the output at right has a significance level of < .001, so it is flagged with ** to indicate that it is less than .01.

Correlations
                               height     weight     sex
height  Pearson Correlation    1          .806**     -.644**
        Sig. (2-tailed)                   .000       .007
        N                      16         16         16
weight  Pearson Correlation    .806**     1          -.968**
        Sig. (2-tailed)        .000                  .000
        N                      16         16         16
sex     Pearson Correlation    -.644**    -.968**    1
        Sig. (2-tailed)        .007       .000
        N                      16         16         16
**. Correlation is significant at the 0.01 level (2-tailed).

To read the correlations, select a row and a column. For example, the correlation between height and weight is determined through selection of the WEIGHT row and the HEIGHT column (.806). We get the same answer by selecting the HEIGHT row and the WEIGHT column. The correlation between a variable and itself is always 1, so there is a diagonal set of 1s.

Drawing Conclusions

The correlation coefficient will be between -1.0 and +1.0. Coefficients close to 0.0 represent a weak relationship. Coefficients close to 1.0 or -1.0 represent a strong relationship. Generally, correlations greater than 0.7 are considered strong. Correlations less than 0.3 are considered weak. Correlations between 0.3 and 0.7 are considered moderate.

Significant correlations are flagged with asterisks. A significant correlation indicates a reliable relationship, but not necessarily a strong correlation. With enough participants, a very small correlation can be significant. Please see Appendix A for a discussion of effect sizes for correlations.

Phrasing Results That Are Significant

In the example above, we obtained a correlation of .806 between HEIGHT and WEIGHT. A correlation of .806 is a strong positive correlation, and it is significant at the .001 level. Thus, we could state the following in a results section:


A Pearson correlation coefficient was calculated for the relationship between participants' height and weight. A strong positive correlation was found (r(14) = .806, p < .001), indicating a significant linear relationship between the two variables. Taller participants tend to weigh more.

The conclusion states the direction (positive), strength (strong), value (.806), degrees of freedom (14), and significance level (< .001) of the correlation. In addition, a statement of direction is included (taller is heavier).

Note that the degrees of freedom given in parentheses is 14. The output indicates an N of 16. While most SPSS procedures give degrees of freedom, the correlation command gives only the N (the number of pairs). For a correlation, the degrees of freedom is N - 2.

Phrasing Results That Are Not Significant

Using our SAMPLE.sav data set from the previous chapters, we could calculate a correlation between ID and GRADE. If so, we get the output at right. The correlation has a significance level of .783. Thus, we could write the following in a results section (note that the degrees of freedom is N - 2):

Correlations
                               ID        GRADE
ID      Pearson Correlation    1.000     .217
        Sig. (2-tailed)                  .783
        N                      4         4
GRADE   Pearson Correlation    .217      1.000
        Sig. (2-tailed)        .783
        N                      4         4
A Pearson correlation was calculated examining the relationship between participants' ID numbers and grades. A weak correlation that was not significant was found (r(2) = .217, p > .05). ID number is not related to grade in the course.

Practice Exercise

Use Practice Data Set 2 in Appendix B. Determine the value of the Pearson correlation coefficient for the relationship between SALARY and YEARS OF EDUCATION.

Section 5.2 Spearman Correlation Coefficient

Description

The Spearman correlation coefficient determines the strength of the relationship between two variables. It is a nonparametric procedure. Therefore, it is weaker than the Pearson correlation coefficient, but it can be used in more situations.

Assumptions

Because the Spearman correlation coefficient functions on the basis of the ranks of data, it requires ordinal (or interval or ratio) data for both variables. They do not need to be normally distributed.


SPSS Data Format

Two variables are required in your SPSS data file. Each subject must provide data for both variables.
Running the Command

Click Analyze, then Correlate, then Bivariate. This will bring up the main dialog box for Bivariate Correlations (just like the Pearson correlation). About halfway down the dialog box, there is a section for indicating the type of correlation you will compute. You can select as many correlations as you want. For our example, remove the check in the Pearson box (by clicking on it) and click on the Spearman box.
Use the variables HEIGHT and WEIGHT from our HEIGHT.sav data file (Chapter 4). This is also one of the few commands that allows you to choose a one-tailed test, if desired.
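A syntax sketch of the same request, included here as an assumption rather than a quotation from the text, using the HEIGHT.sav variable names:

* SPEARMAN requests Spearman's rho; TWOTAIL requests two-tailed significance.
NONPAR CORR
  /VARIABLES=height weight
  /PRINT=SPEARMAN TWOTAIL NOSIG
  /MISSING=PAIRWISE.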

Reading the Output

The output is essentially the same as for the Pearson correlation. Each pair of variables has its correlation coefficient indicated twice. The Spearman rho can range from -1.0 to +1.0, just like the Pearson r.

Correlations
                                                 HEIGHT    WEIGHT
Spearman's rho   HEIGHT   Correlation Coefficient   1.000     .883**
                          Sig. (2-tailed)           .         .000
                          N                         16        16
                 WEIGHT   Correlation Coefficient   .883**    1.000
                          Sig. (2-tailed)           .000      .
                          N                         16        16
**. Correlation is significant at the .01 level (2-tailed).

The output listed above indicates a correlation of .883 between HEIGHT and WEIGHT. Note the significance level of .000, shown in the "Sig. (2-tailed)" row. This is, in fact, a significance level of < .001. The actual alpha level rounds out to .000, but it is not zero.

Drawing Conclusions

The correlation will be between -1.0 and +1.0. Scores close to 0.0 represent a weak relationship. Scores close to 1.0 or -1.0 represent a strong relationship. Significant correlations are flagged with asterisks. A significant correlation indicates a reliable relationship, but not necessarily a strong correlation. With enough participants, a very small correlation can be significant. Generally, correlations greater than 0.7 are considered strong. Correlations less than 0.3 are considered weak. Correlations between 0.3 and 0.7 are considered moderate.


Phrasing Results That Are Significant

In the example above, we obtained a correlation of .883 between HEIGHT and WEIGHT. A correlation of .883 is a strong positive correlation, and it is significant at the .001 level. Thus, we could state the following in a results section:

A Spearman rho correlation coefficient was calculated for the relationship between participants' height and weight. A strong positive correlation was found (rho(14) = .883, p < .001), indicating a significant relationship between the two variables. Taller participants tend to weigh more.

The conclusion states the direction (positive), strength (strong), value (.883), degrees of freedom (14), and significance level (< .001) of the correlation. In addition, a statement of direction is included (taller is heavier). Note that the degrees of freedom given in parentheses is 14. The output indicates an N of 16. For a correlation, the degrees of freedom is N - 2.

Phrasing Results That Are Not Significant
Using our SAMPLE.sav data set from the previous chapters, we could calculate a Spearman rho correlation between ID and GRADE. If so, we would get the output at right. The correlation coefficient equals .000 and has a significance level of 1.000. Note that though this value is rounded up and is not, in fact, exactly 1.000, we could state the following in a results section:

Correlations
                                                ID        GRADE
Spearman's rho   ID      Correlation Coefficient   1.000     .000
                         Sig. (2-tailed)           .         1.000
                         N                         4         4
                 GRADE   Correlation Coefficient   .000      1.000
                         Sig. (2-tailed)           1.000     .
                         N                         4         4

A Spearman rho correlation coefficient was calculated for the relationship between a subject's ID number and grade. An extremely weak correlation that was not significant was found (rho(2) = .000, p > .05). ID number is not related to grade in the course.

Practice Exercise

Use Practice Data Set 2 in Appendix B. Determine the strength of the relationship between salary and job classification by calculating the Spearman rho correlation.

Section 5.3 Simple Linear Regression

Description

Simple linear regression allows the prediction of one variable from another.

Assumptions

Simple linear regression assumes that both variables are interval- or ratio-scaled. In addition, the dependent variable should be normally distributed around the prediction line. This, of course, assumes that the variables are related to each other linearly.


Typically, both variables should be normally distributed. Dichotomous variables (variables with only two levels) are also acceptable as independent variables.

SPSS Data Format

Two variables are required in the SPSS data file. Each subject must contribute to both values.

Running the Command

Click Analyze, then Regression, then Linear. This will bring up the main dialog box for Linear Regression. On the left side of the dialog box is a list of the variables in your data file (we are using the HEIGHT.sav data file from the start of this section). On the right are blocks for the dependent variable (the variable you are trying to predict) and the independent variable (the variable from which we are predicting).

We are interested in predicting someone's weight on the basis of his or her height. Thus, we should place the variable WEIGHT in the dependent variable block and the variable HEIGHT in the independent variable block. Then we can click OK to run the analysis.
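Sketched as syntax (the dialog adds several more subcommands when it pastes its own version), the analysis looks roughly like this:

* Predict WEIGHT from HEIGHT; ENTER places the predictor in the equation regardless of significance.
REGRESSION
  /STATISTICS COEFF OUTS R ANOVA
  /DEPENDENT weight
  /METHOD=ENTER height.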

Reading the Output

For simple linear regressions, we are interested in three components of the output. The first is called the Model Summary, and it occurs after the Variables Entered/Removed section. For our example, you should see this output. R Square (called the coefficient of determination) gives you the proportion of the variance of your dependent variable (WEIGHT) that can be explained by variation in your independent variable (HEIGHT). Thus, 64.9% of the variation in weight can be explained by differences in height (taller individuals weigh more).
Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .806   .649       .624                16.14801
a. Predictors: (Constant), height

The standard error of estimate gives you a measure of dispersion for your prediction equation. When the prediction equation is used, 68% of the data will fall within


one standard error of the estimated (predicted) value. Just over 95% will fall within two standard errors. Thus, in the previous example, 95% of the time our estimated weight will be within 32.296 pounds (i.e., 2 x 16.148 = 32.296) of being correct.
ANOVA(b)
Model        Sum of Squares   df   Mean Square   F        Sig.
Regression   6760.323         1    6760.323      25.926   .000(a)
Residual     3650.614         14   260.758
Total        10410.938        15
a. Predictors: (Constant), HEIGHT
b. Dependent Variable: WEIGHT
The second part of the output that we are interested in is the ANOVA summary table, as shown above. The important number here is the significance level in the rightmost column. If that value is less than .05, then we have a significant linear regression. If it is larger than .05, we do not.

The final section of the output is the table of coefficients. This is where the actual prediction equation can be found.
Coefficients(a)
                  Unstandardized Coefficients     Standardized Coefficients
Model             B           Std. Error          Beta           t         Sig.
1   (Constant)    -234.681    71.552                             -3.280    .005
    height        5.434       1.067               .806           5.092     .000
a. Dependent Variable: weight
equation.f' (pronounced In most texts, you learn that Y' : a + bX is the regression (primes are normally predictedvalues or depend"Y prime") is your dependent variable ent variables), and X is your independentvariable. In SPSSoutput,the valuesof both a andb are found in the B column.The first value,-234.681,is the value of a (labeledConstant).The secondvalue,5.434,is the value of b (labeledwith the name of the independent variable). Thus, our prediction equation for the example above is WEIGHT' : -234.681+ 5.434(HEIGHT). In otherwords,the average subjectwho is an inch taller than anothersubjectweighs 5.434 poundsmore. A personwho is 60 inchestall shouldweigh -234.681+ 5.434(60):91.359pounds. discussion standarderror of of Givenour earlier estimate,95ohof individualswho are 60 inchestall will weigh between59.063(91.359= + pounds. (91.359 32.296 123.655) 32.296: 59.063) and 123.655


Drawing Conclusions

Conclusions from regression analyses indicate (a) whether or not a significant prediction equation was obtained, (b) the direction of the relationship, and (c) the equation itself.

Phrasing Results That Are Significant

In the examples above, we obtained an R Square of .649 and a regression equation of WEIGHT' = -234.681 + 5.434(HEIGHT). The ANOVA resulted in F = 25.926 with 1 and 14 degrees of freedom. The F is significant at the less than .001 level. Thus, we could state the following in a results section:

A simple linear regression was calculated predicting participants' weight based on their height. A significant regression equation was found (F(1,14) = 25.926, p < .001), with an R² of .649. Participants' predicted weight is equal to -234.68 + 5.43(HEIGHT) pounds when height is measured in inches. Participants' average weight increased 5.43 pounds for each inch of height.

The conclusion states the direction (increase), strength (.649), value (25.926), degrees of freedom (1,14), and significance level (< .001) of the regression. In addition, a statement of the equation itself is included.

Phrasing Results That Are Not Significant

If the ANOVA is not significant (e.g., see the output at right), the Sig. section of the output for the ANOVA will be greater than .05, and the regression equation is not significant. A results section might include the following statement:

A simple linear regression was calculated predicting participants' ACT scores based on their height. The regression equation was not significant (F(1,14) = 4.12, p > .05) with an R² of .227. Height is not a significant predictor of ACT scores.
Note that for results that are not significant, the ANOVA results and R² results are given, but the regression equation is not.

Practice Exercise

Use Practice Data Set 2 in Appendix B. If we want to predict salary from years of education, what salary would you predict for someone with 12 years of education? What salary would you predict for someone with a college education (16 years)?


Section 5.4 Multiple Linear Regression

Description

The multiple linear regression analysis allows the prediction of one variable from several other variables.

Assumptions

Multiple linear regression assumes that all variables are interval- or ratio-scaled. In addition, the dependent variable should be normally distributed around the prediction line. This, of course, assumes that the variables are related to each other linearly. All variables should be normally distributed. Dichotomous variables are also acceptable as independent variables.

SPSS Data Format

At least three variables are required in the SPSS data file. Each subject must contribute to all values.

Running the Command

Click Analyze, then Regression, then Linear. This will bring up the main dialog box for Linear Regression. On the left side of the dialog box is a list of the variables in your data file (we are using the HEIGHT.sav data file from the start of this chapter). On the right side of the dialog box are blanks for the dependent variable (the variable you are trying to predict) and the independent variables (the variables from which you are predicting).
We are interested in predicting someone's weight based on his or her height and sex. We believe that both sex and height influence weight. Thus, we should place the dependent variable WEIGHT in the Dependent block and the independent variables HEIGHT and SEX in the Independent(s) block. Enter both in Block 1. This will perform an analysis to determine if WEIGHT can be predicted from SEX and/or HEIGHT. There are several methods SPSS can use to conduct this analysis. These can be selected with the Method box. Method Enter, the most widely


used, puts all variables in the equation, whether they are significant or not. The other methods use various means to enter only those variables that are significant predictors. Click OK to run the analysis.
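A rough syntax equivalent of this analysis is sketched below (the pasted version from the dialog includes additional subcommands):

* Predict WEIGHT from HEIGHT and SEX using the Enter method.
REGRESSION
  /STATISTICS COEFF OUTS R ANOVA
  /DEPENDENT weight
  /METHOD=ENTER height sex.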

Reading the Output

For multiple linear regression, there are three components of the output in which we are interested. The first is called the Model Summary, which is found after the Variables Entered/Removed section. For our example, you should get the output below.

Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .997   .993       .992                2.29571
a. Predictors: (Constant), sex, height
R Square (called the coefficient of determination) tells you the proportion of the variance in the dependent variable (WEIGHT) that can be explained by variation in the independent variables (HEIGHT and SEX, in this case). Thus, 99.3% of the variation in weight can be explained by differences in height and sex (taller individuals weigh more, and men weigh more). Note that when a second variable is added, our R Square goes up from .649 to .993. The .649 was obtained using the Simple Linear Regression example in Section 5.3.

The Standard Error of the Estimate gives you a margin of error for the prediction equation. Using the prediction equation, 68% of the data will fall within one standard error of the estimated (predicted) value. Just over 95% will fall within two standard errors. Thus, in the example above, 95% of the time our estimated weight will be within 4.591 (2.296 x 2) pounds of being correct. In our Simple Linear Regression example in Section 5.3, this number was 32.296. Note the higher degree of accuracy.

The second part of the output that we are interested in is the ANOVA summary table. For more information on reading ANOVA tables, refer to the sections on ANOVA in Chapter 6. For now, the important number is the significance in the rightmost column. If that value is less than .05, we have a significant linear regression. If it is larger than .05, we do not.
ANOVA(b)
Model        Sum of Squares   df   Mean Square   F         Sig.
Regression   10342.424        2    5171.212      981.202   .000(a)
Residual     68.514           13   5.270
Total        10410.938        15
a. Predictors: (Constant), sex, height
b. Dependent Variable: weight

The final section of output we are interested in is the table of coefficients. This is where the actual prediction equation can be found.


Coefficients(a)
                  Unstandardized Coefficients     Standardized Coefficients
Model             B           Std. Error          Beta           t          Sig.
1   (Constant)    47.138      14.843                             3.176      .007
    height        2.101       .198                .312           10.588     .000
    sex           -39.133     1.501               -.767          -26.071    .000
a. Dependent Variable: weight
In most texts, you learn that Y' = a + bX is the regression equation. For multiple regression, our equation changes to Y' = B0 + B1X1 + B2X2 + ... + BzXz (where z is the number of independent variables). Y' is your dependent variable, and the Xs are your independent variables. The Bs are listed in a column. Thus, our prediction equation for the example above is WEIGHT' = 47.138 - 39.133(SEX) + 2.101(HEIGHT) (where SEX is coded as 1 = Male, 2 = Female, and HEIGHT is in inches). In other words, the average difference in weight for participants who differ by one inch in height is 2.101 pounds. Males tend to weigh 39.133 pounds more than females. A female who is 60 inches tall should weigh 47.138 - 39.133(2) + 2.101(60) = 94.932 pounds. Given our earlier discussion of the standard error of estimate, 95% of females who are 60 inches tall will weigh between 90.341 (94.932 - 4.591 = 90.341) and 99.523 (94.932 + 4.591 = 99.523) pounds.

Drawing Conclusions

Conclusions from regression analyses indicate (a) whether or not a significant prediction equation was obtained, (b) the direction of the relationship, and (c) the equation itself. Multiple regression is generally much more powerful than simple linear regression. Compare our two examples.

With multiple regression, you must also consider the significance level of each independent variable. In the example above, the significance level of both independent variables is less than .001.

PhrasingResults ThatAre Significant


In our example, obtained we an equaR Square of.993 anda regression tion of WEIGHT' = 47.138 + ). 39 . 1 3 3 ( S E X ) 2.1 0 1 (H E IGH TT he with 2 ANOVA resulted F: 981.202 in F and 13 degrees freedom. is signifiof we cantat the lessthan.001level.Thus. seccouldstate followinein a results the tion:


A multiple linear regression was calculated to predict participants' weight based on their height and sex. A significant regression equation was found (F(2,13) = 981.202, p < .001), with an R² of .993. Participants' predicted weight is equal to 47.138 - 39.133(SEX) + 2.101(HEIGHT), where SEX is coded as 1 = Male, 2 = Female, and HEIGHT is measured in inches. Participants increased 2.101 pounds for each inch of height, and males weighed 39.133 pounds more than females. Both sex and height were significant predictors.

The conclusion states the direction (increase), strength (.993), value (981.20), degrees of freedom (2,13), and significance level (< .001) of the regression. In addition, a statement of the equation itself is included. Because there are multiple independent variables, we have noted whether or not each is significant.

Phrasing Results That Are Not Significant

If the ANOVA does not find a significant relationship, the Sig. section of the output will be greater than .05, and the regression equation is not significant. A results section for the output at right might include the following statement:

A multiple linear regression was calculated predicting participants' ACT scores based on their height and sex. The regression equation was not significant (F(2,13) = 2.511, p > .05) with an R² of .279. Neither height nor sex is a significant predictor of ACT scores.

Note that for results that are not significant, the ANOVA results and R² results are given, but the regression equation is not.

Practice Exercise

Use Practice Data Set 2 in Appendix B. Determine the prediction equation for predicting salary based on years of education, years of service, and sex. Which variables are significant predictors? If you believe that men were paid more than women were, what would you conclude after conducting this analysis?


Chapter 6

Parametric Inferential Statistics
Parametric statistical procedures allow you to draw inferences about populations based on samples of those populations. To make these inferences, you must be able to make certain assumptions about the shape of the distributions of the population samples.

Section 6.1 Review of Basic Hypothesis Testing

The Null Hypothesis


In hypothesis testing, we create two hypotheses that are mutually exclusive (i.e., both cannot be true at the same time) and all inclusive (i.e., one of them must be true). We refer to those two hypotheses as the null hypothesis and the alternative hypothesis. The null hypothesis generally states that any difference we observe is caused by random error. The alternative hypothesis generally states that any difference we observe is caused by a systematic difference between groups.

Type I and Type II Errors

All hypothesis testing attempts to draw conclusions about the real world based on the results of a test (a statistical test, in this case). There are four possible combinations of results (see the table below).

                                    REAL WORLD
Our Decision                 Null Hypothesis True   Null Hypothesis False
Reject null hypothesis       Type I Error           No Error
Fail to reject null          No Error               Type II Error

Two of the possible results are correct test results. The other two results are errors. A Type I error occurs when we reject a null hypothesis that is, in fact, true, while a Type II error occurs when we fail to reject a null hypothesis that is, in fact, false.

Significance tests determine the probability of making a Type I error. In other words, after performing a series of calculations, we obtain a probability that the null hypothesis is true. If there is a low probability, such as 5 or less in 100 (.05), by convention, we reject the null hypothesis. In other words, we typically use the .05 level (or less) as the maximum Type I error rate we are willing to accept.

When there is a low probability of a Type I error, such as .05, we can state that the significance test has led us to "reject the null hypothesis." This is synonymous with saying that a difference is "statistically significant." For example, on a reading test, suppose you found that a random sample of girls from a school district scored higher than a random


sample of boys. This result may have been obtained merely because the chance errors associated with random sampling created the observed difference (this is what the null hypothesis asserts). If there is a sufficiently low probability that random errors were the cause (as determined by a significance test), we can state that the difference between boys and girls is statistically significant.

Significance Levels vs. Critical Values

Most statistics textbooks present hypothesis testing by using the concept of a critical value. With such an approach, we obtain a value for a test statistic and compare it to a critical value we look up in a table. If the obtained value is larger than the critical value, we reject the null hypothesis and conclude that we have found a significant difference (or relationship). If the obtained value is less than the critical value, we fail to reject the null hypothesis and conclude that there is not a significant difference.

The critical-value approach is well suited to hand calculations. Tables that give critical values for alpha levels of .001, .01, .05, etc., can be created. It is not practical to create a table for every possible alpha level. On the other hand, SPSS can determine the exact alpha level associated with any value of a test statistic. Thus, looking up a critical value in a table is not necessary. This, however, does change the basic procedure for determining whether or not to reject the null hypothesis.

The section of SPSS output labeled Sig. (sometimes p or alpha) indicates the likelihood of making a Type I error if we reject the null hypothesis. A value of .05 or less indicates that we should reject the null hypothesis (assuming an alpha level of .05). A value greater than .05 indicates that we should fail to reject the null hypothesis. In other words, when using SPSS, we normally reject the null hypothesis if the output value under Sig. is equal to or smaller than .05, and we fail to reject the null hypothesis if the output value is larger than .05.

One-Tailed vs. Two-Tailed Tests

SPSS output generally includes a two-tailed alpha level (normally labeled Sig. in the output). A two-tailed hypothesis attempts to determine whether any difference (either positive or negative) exists. Thus, you have an opportunity to make a Type I error on either of the two tails of the normal distribution. A one-tailed test examines a difference in a specific direction. Thus, we can make a Type I error on only one side (tail) of the distribution. If we have a one-tailed hypothesis, but our SPSS output gives a two-tailed significance result, we can take the significance level in the output and divide it by two. Thus, if our difference is in the right direction, and if our output indicates a significance level of .084 (two-tailed), but we have a one-tailed hypothesis, we can report a significance level of .042 (one-tailed).

Phrasing Results

Results of hypothesis testing can be stated in different ways, depending on the conventions specified by your institution. The following examples illustrate some of these differences.


Degrees of Freedom

Sometimes the degrees of freedom are given in parentheses immediately after the symbol representing the test, as in this example:

t(3) = 7.00, p < .01

Other times, the degrees of freedom are given within the statement of results, as in this example:

t = 7.00, df = 3, p < .01

Significance Level

When you obtain results that are significant, they can be described in different ways. For example, if you obtained a significance level of .006 on a t test, you could describe it in any of the following three ways:

t(3) = 7.00, p < .05
t(3) = 7.00, p < .01
t(3) = 7.00, p = .006

Notice that because the exact probability is .006, both .05 and .01 are also correct. There are also various ways of describing results that are not significant. For example, if you obtained a significance level of .505, any of the following three statements could be used:

t(2) = .805, ns
t(2) = .805, p > .05
t(2) = .805, p = .505

Statement of Results

Sometimes the results will be stated in terms of the null hypothesis, as in the following example:

The null hypothesis was rejected (t = 7.00, df = 3, p = .006).

Other times, the results are stated in terms of their level of significance, as in the following example:

A statistically significant difference was found: t(3) = 7.00, p < .01.

Statistical Symbols

Generally, statistical symbols are presented in italics. Prior to the widespread use of computers and desktop publishing, statistical symbols were underlined. Underlining is a signal to a printer that the underlined text should be set in italics. Institutions vary on their requirements for student work, so you are advised to consult your instructor about this.

Section 6.2 Single-Sample t Test

Description

The single-sample t test compares the mean of a single sample to a known population mean. It is useful for determining if the current set of data has changed from a long-


term value (e.g., comparing the current year's temperatures to a historical average to determine if global warming is occurring).

Assumptions
The distributions from which the scores are taken should be normally distributed. However, the t test is robust and can handle violations of the assumption of a normal distribution. The dependent variable must be measured on an interval or ratio scale.

SPSS Data Format

The SPSS data file for the single-sample t test requires a single variable in SPSS. That variable represents the set of scores in the sample that we will compare to the population mean.

Running the Command

The single-sample t test is located in the Compare Means submenu, under the Analyze menu. The dialog box for the single-sample t test requires that we transfer the variable representing the current set of scores to the Test Variable(s) section. We must also enter the population average in the Test Value blank. The example presented here is testing the variable LENGTH against a population mean of 35 (this example uses a hypothetical data set).
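The equivalent syntax is sketched here as an assumption (it presumes a variable named LENGTH in the open data file):

* Compare the mean of LENGTH to the hypothesized population value of 35.
T-TEST
  /TESTVAL=35
  /VARIABLES=length.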

Reading the Output
The output for the single-sample t test consists of two sections. The first section lists the sample variable and some basic descriptive statistics (N, mean, standard deviation, and standard error).


T-Test

One-Sample Statistics
          N    Mean      Std. Deviation   Std. Error Mean
LENGTH    10   35.9000   1.1972           .3786

One-Sample Test
Test Value = 35
                                                               95% Confidence Interval of the Difference
          t       df   Sig. (2-tailed)   Mean Difference   Lower       Upper
LENGTH    2.377   9    .041              .9000             4.356E-02   1.7564

The second section of output contains the results of the t test. The example presented here indicates a t value of 2.377, with 9 degrees of freedom and a significance level of .041. The mean difference of .9000 is the difference between the sample average (35.90) and the population average we entered in the dialog box to conduct the test (35.00).

Drawing Conclusions

The t test assumes an equality of means. Therefore, a significant result indicates that the sample mean is not equivalent to the population mean (hence the term "significantly different"). A result that is not significant means that there is not a significant difference between the means. It does not mean that they are equal. Refer to your statistics text for the section on failure to reject the null hypothesis.

Phrasing Results That Are Significant


The above example found a significant difference between the population mean and the sample mean. Thus, we could state the following:

A single-sample t test compared the mean length of the sample to a population value of 35.00. A significant difference was found (t(9) = 2.377, p < .05). The sample mean of 35.90 (sd = 1.197) was significantly greater than the population mean.


Phrasing Results That Are Not Significant


If the significance level had been greater than .05, the difference would not be significant. For example, if we received the output presented here, we could state the following:
A single-sample t test compared the mean temperature over the past year to the long-term average. The difference was not significant (t(8) = .417, p > .05). The mean temperature over the past year was 68.67 (sd = 9.11) compared to the long-term average of 67.4.

Practice Exercise

The average salary in the U.S. is $25,000. Determine if the average salary of the participants in Practice Data Set 2 (Appendix B) is significantly greater than this value. Note that this is a one-tailed hypothesis.


Section 6.3 Independent-Samples t Test

Description


The independent-samples t test compares the means of two samples. The two samples are normally from randomly assigned groups.

Assumptions

The two groups being compared should be independent of each other. Observations are independent when information about one is unrelated to the other. Normally, this means that one group of participants provides data for one sample and a different group of participants provides data for the other sample (and individuals in one group are not matched with individuals in the other group). One way to accomplish this is through random assignment to form two groups.

The scores should be normally distributed, but the t test is robust and can handle violations of the assumption of a normal distribution.

The dependent variable must be measured on an interval or ratio scale. The independent variable should have only two discrete levels.

SPSS Data Format

The SPSS data file for the independent t test requires two variables. One variable, the grouping variable, represents the value of the independent variable. The grouping variable should have two distinct values (e.g., 0 for a control group and 1 for an experimental group).

Conducting an Independent-Samples t Test
For our example, we will use the SAMPLE.sav data file. Click Analyze, then Compare Means, then Independent-Samples T Test. This will bring up the main dialog box. Transfer the dependent variable(s) into the Test Variable(s) blank. For our example, we will use the variable GRADE. Transfer the independent variable into the Grouping Variable section. For our example, we will use the variable MORNING.
Next, click Define Groups and enter the values of the two levels of the independent variable. Independent t tests are capable of comparing only two levels at a time. Click Continue, then click OK to run the analysis.


Output from the Independent-Samples t Test
The output will have a section labeled "Group Statistics." This section provides the basic descriptive statistics for the dependent variable(s) for each value of the independent variable. It should look like the output below.

Group Statistics

           morning    N      Mean     Std. Deviation   Std. Error Mean
  grade    No         2    82.5000       3.53553           2.50000
           Yes        2    78.0000       7.07107           5.00000


Next, there will be a section with the results of the t test. It should look like the output below. (The output also includes columns for Levene's Test for Equality of Variances.)

Independent Samples Test (t-test for Equality of Means)

                                                                                 95% Confidence Interval
                                         Sig.        Mean        Std. Error        of the Difference
                            t      df  (2-tailed)  Difference    Difference      Lower         Upper
  grade  Equal variances
         assumed          .805      2     .505       4.50000       5.59017     -19.55256     28.55256
         Equal variances
         not assumed      .805   1.471    .530       4.50000       5.59017     -30.09261     39.09261

The columns labeled t, df, and Sig. (2-tailed) provide the standard "answer" for the t test. They provide the value of t, the degrees of freedom (number of participants minus 2, in this case), and the significance level (often called p). Normally, we use the "Equal variances assumed" row.

Drawing Conclusions
Recall from the previous section that the t test assumes an equality of means. Therefore, a significant result indicates that the means are not equivalent. When drawing conclusions about a t test, you must state the direction of the difference (i.e., which mean was larger than the other). You should also include information about the value of t, the degrees of freedom, the significance level, and the means and standard deviations for the two groups.

Phrasing Results That Are Significant
For a significant t test (for example, the output below), you might state the following:
Group Statistics

           group          N      Mean     Std. Deviation   Std. Error Mean
  score    control        4    41.0000       4.24264           2.12132
           experimental   3    33.3333       2.08167           1.20185

Independent Samples Test (t-test for Equality of Means, equal variances assumed)

                                                                  95% Confidence Interval
                            Sig.       Mean        Std. Error        of the Difference
              t      df  (2-tailed)  Difference    Difference      Lower        Upper
  score    2.835      5     .036       7.66667       2.70391       .71605      14.61729

An independent-samples t test comparing the mean scores of the experimental and control groups found a significant difference between the means of the two groups (t(5) = 2.835, p < .05). The mean of the experimental group was significantly lower (m = 33.33, sd = 2.08) than the mean of the control group (m = 41.00, sd = 4.24).


Phrasing Results That Are Not Significant
In our example at the start of the section, we compared the scores of the morning people to the scores of the nonmorning people. We did not find a significant difference, so we could state the following:

An independent-samples t test was calculated comparing the mean score of participants who identified themselves as morning people to the mean score of participants who did not identify themselves as morning people. No significant difference was found (t(2) = .805, p > .05). The mean of the morning people (m = 78.00, sd = 7.07) was not significantly different from the mean of nonmorning people (m = 82.50, sd = 3.54).

Practice Exercise
Use Practice Data Set 1 (Appendix B) to solve this problem. We believe that young individuals have lower mathematics skills than older individuals. We would test this hypothesis by comparing participants 25 or younger (the "young" group) with participants 26 or older (the "old" group). Hint: You may need to create a new variable that represents which age group they are in. See Chapter 2 for help.
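The same comparison can be run from a syntax window. A sketch for the morning-people example above is shown below; it assumes MORNING is coded 0 = No and 1 = Yes in SAMPLE.sav (adjust the group values to match your own coding):

  T-TEST GROUPS=morning(0 1)
    /VARIABLES=grade.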

Section 6.4 Paired-Samples t Test
Description

The paired-samples t test (also called a dependent t test) compares the means of two scores from related samples. For example, comparing a pretest and a posttest score for a group of participants would require a paired-samples t test.

Assumptions
The paired-samples t test assumes that both variables are at the interval or ratio levels and are normally distributed. The two variables should also be measured with the same scale. If the scales are different, the scores should be converted to z-scores before the t test is conducted.

SPSS Data Format
Two variables in the SPSS data file are required. These variables should represent two measurements from each participant.

Running the Command
We will create a new data file containing five variables: PRETEST, MIDTERM, FINAL, INSTRUCT, and REQUIRED. INSTRUCT represents three different instructors for a course. REQUIRED represents whether the course was required or was an elective (0 = elective, 1 = required). The other three variables represent exam scores (100 being the highest score possible).


  PRETEST   MIDTERM   FINAL   INSTRUCT   REQUIRED
     56        64       69       1          0
     79        91       89       1          0
     68        77       81       1          0
     59        69       71       1          1
     64        77       75       1          1
     74        88       86       1          1
     73        85       86       1          1
     47        64       69       2          0
     78        98      100       2          0
     61        77       85       2          0
     68        86       93       2          1
     64        77       87       2          1
     53        67       76       2          1
     71        85       95       2          1
     61        79       97       3          0
     57        77       89       3          0
     49        65       83       3          0
     71        93      100       3          1
     61        83       94       3          1
     58        75       92       3          1
     58        74       92       3          1
Enter the data and save it as GRADES.sav. You can check your data entry by computing a mean for each instructor using the Means command (see Chapter 3 for more information). Use INSTRUCT as the independent variable, and enter PRETEST, MIDTERM, and FINAL as your dependent variables.
Once you have entered the data, conduct a paired-samples t test comparing pretest scores and final scores. Click Analyze, then Compare Means, then Paired-Samples T Test. This will bring up the main dialog box.

Report

  INSTRUCT                      PRETEST    MIDTERM     FINAL
  1.00      Mean                67.5714    78.7143    79.5714
            N                         7          7          7
            Std. Deviation       8.3837     9.9451     7.9552
  2.00      Mean                63.1429    79.1429    86.4286
            N                         7          7          7
            Std. Deviation      10.6055    11.7108    10.9218
  3.00      Mean                59.2857    78.0000    92.4286
            N                         7          7          7
            Std. Deviation       6.5502     8.6217     5.5032
  Total     Mean                63.3333    78.6190    86.1429
            N                        21         21         21
            Std. Deviation       8.9294     9.6617     9.6348


You must select pairs of variables to compare. As you select them, they are placed in the Current Selections area. Click once on PRETEST, then once on FINAL. Both variables will be moved into the Current Selections area. Click on the right arrow to transfer the pair to the Paired Variables section. Click OK to conduct the test.
Reading the Output
The output for the paired-samples t test consists of three components. The first part gives you basic descriptive statistics for the pair of variables. The PRETEST average was 63.33, with a standard deviation of 8.93. The FINAL average was 86.14, with a standard deviation of 9.63.

Paired Samples Statistics

                        Mean      N    Std. Deviation   Std. Error Mean
  Pair 1   PRETEST    63.3333    21        8.9294            1.9485
           FINAL      86.1429    21        9.6348            2.1025

Paired Samples Correlations

                               N    Correlation    Sig.
  Pair 1   PRETEST & FINAL    21       .535        .013

The second part of the output is a Pearson correlation coefficient for the pair of variables.
Within the third part of the output (on the next page), the section called Paired Differences contains information about the differences between the two variables. You may have learned in your statistics class that the paired-samples t test is essentially a single-sample t test calculated on the differences between the scores. The final three columns contain the value of t, the degrees of freedom, and the probability level. In the example presented here, we obtained a t of -11.646, with 20 degrees of freedom, and a significance level of less than .001. Note that this is a two-tailed significance level. See the start of this chapter for more details on computing a one-tailed test.


Paired Samples Test

                                     Paired Differences
                                                         95% Confidence Interval
                                          Std. Error        of the Difference
                        Mean   Std. Dev.     Mean          Lower        Upper          t       df   Sig. (2-tailed)
  Pair 1  PRETEST -
          FINAL      -22.8095    8.9756     1.9586       -26.8952     -18.7239     -11.646     20        .000

Drawing Conclusions
Paired-samples t tests determine whether or not two scores are significantly different from each other. Significant values indicate that the two scores are different. Values that are not significant indicate that the scores are not significantly different.

Phrasing Results That Are Significant
When stating the results of a paired-samples t test, you should give the value of t, the degrees of freedom, and the significance level. You should also give the mean and standard deviation for each variable, as well as a statement of results that indicates whether you conducted a one- or two-tailed test. Our example above was significant, so we could state the following:

A paired-samples t test was calculated to compare the mean pretest score to the mean final exam score. The mean on the pretest was 63.33 (sd = 8.93), and the mean on the posttest was 86.14 (sd = 9.63). A significant increase from pretest to final was found (t(20) = -11.646, p < .001).

Phrasing Results That Are Not Significant
If the significance level had been greater than .05 (or greater than .10 if you were conducting a one-tailed test), the result would not have been significant. For example, the hypothetical output below represents a nonsignificant difference. For this output, we could state:
Paired Samples Statistics

                        Mean      N    Std. Deviation   Std. Error Mean
  Pair 1   midterm    78.7143     7        9.9451           3.75889
           final      79.5714     7        7.95523          3.00680

Paired Samples Correlations

                               N    Correlation    Sig.
  Pair 1   midterm & final     7       .965        .000

Paired Samples Test

                                     Paired Differences
                                                         95% Confidence Interval
                                          Std. Error        of the Difference
                        Mean   Std. Dev.     Mean          Lower        Upper          t      df   Sig. (2-tailed)
  Pair 1  midterm -
          final       -.85714   2.96808     1.12183       -3.60216      1.88788      -.764     6        .474

A paired-samples t test was calculated to compare the mean midterm score to the mean final exam score. The mean on the midterm was 78.71 (sd = 9.95), and the mean on the final was 79.57 (sd = 7.96). No significant difference from midterm to final was found (t(6) = -.764, p > .05).

Practice Exercise
Use the same GRADES.sav data file, and compute a paired-samples t test to determine if scores increased from midterm to final.

Section 6.5 One-Way ANOVA
Description
Analysis of variance (ANOVA) is a procedure that determines the proportion of variability attributed to each of several components. It is one of the most useful and adaptable statistical techniques available.
The one-way ANOVA compares the means of two or more groups of participants that vary on a single independent variable (thus, the one-way designation). When we have three groups, we could use a t test to determine differences between groups, but we would have to conduct three t tests (Group 1 compared to Group 2, Group 1 compared to Group 3, and Group 2 compared to Group 3). When we conduct multiple t tests, we inflate the Type I error rate and increase our chance of drawing an inappropriate conclusion. ANOVA compensates for these multiple comparisons and gives us a single answer that tells us if any of the groups is different from any of the other groups.

Assumptions
The one-way ANOVA requires a single dependent variable and a single independent variable. Which group participants belong to is determined by the value of the independent variable. Groups should be independent of each other. If our participants belong to more than one group each, we will have to conduct a repeated-measures ANOVA. If we have more than one independent variable, we would conduct a factorial ANOVA. ANOVA also assumes that the dependent variable is at the interval or ratio levels and is normally distributed.

SPSS Data Format
Two variables are required in the SPSS data file. One variable serves as the dependent variable and the other as the independent variable. Each participant should provide only one score for the dependent variable.

Running the Command
For this example, we will use the GRADES.sav data file we created in the previous section.


To conduct a one-way ANOVA, click Analyze, then Compare Means, then One-Way ANOVA. This will bring up the main dialog box for the One-Way ANOVA command.
You should place the independent variable in the Factor box. For our example, INSTRUCT represents three different instructors, and it will be used as our independent variable. Our dependent variable will be FINAL. This test will allow us to determine if the instructor has any effect on final grades in the course.

Click on the Options box to get the Options dialog box. Click Descriptive. This will give you means for the dependent variable at each level of the independent variable. Checking this box prevents us from having to run a separate Means command. Click Continue to return to the main dialog box. Next, click Post Hoc to bring up the Post Hoc Multiple Comparisons dialog box. Click Tukey, then Continue.

Post-hoc tests are necessary in the event of a significant ANOVA. The ANOVA only indicates if any group is different from any other group. If it is significant, we need to determine which groups are different from which other groups. We could do t tests to determine that, but we would have the same problem as before with inflating the Type I error rate. There are a variety of post-hoc comparisons that correct for the multiple comparisons. The most widely used is Tukey's HSD. SPSS will calculate a variety of post-hoc tests for you. Consult an advanced statistics text for a discussion of the differences between these various tests. Now click OK to run the analysis.
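A syntax sketch of the same analysis (FINAL by INSTRUCT, with descriptives and Tukey's HSD requested) is shown below. It assumes the GRADES.sav variable names used in this example; the subcommands SPSS pastes from the dialog box may differ slightly:

  ONEWAY final BY instruct
    /STATISTICS=DESCRIPTIVES
    /POSTHOC=TUKEY ALPHA(0.05).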


Reading the Output
Descriptive statistics will be given for each instructor (i.e., level of the independent variable) and the total. For example, Instructor 1 had an average final exam score of 79.57 in his/her class.
Descriptives (final)

                                                   95% Confidence Interval for Mean
           N      Mean    Std. Deviation  Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
  1.00     7    79.5714      7.95523       3.00680       72.2141       86.9288       69.00     89.00
  2.00     7    86.4286     10.92180       4.12805       76.3276       96.5296       69.00    100.00
  3.00     7    92.4286      5.50325       2.08003       87.3389       97.5182       83.00    100.00
  Total   21    86.1429      9.63476       2.10248       81.7572       90.5285       69.00    100.00
The next section of the output is the ANOVA source table. This is where the various components of the variance have been listed, along with their relative sizes. For a one-way ANOVA, there are two components to the variance: Between Groups (which represents the differences due to our independent variable) and Within Groups (which represents differences within each level of our independent variable). For our example, the Between Groups variance represents differences due to different instructors. The Within Groups variance represents individual differences in students.

ANOVA (final)

                    Sum of Squares    df    Mean Square      F
  Between Groups         579.429       2       289.714     4.083
  Within Groups         1277.143      18        70.952
  Total                 1856.571      20

The primary answer is F. F is a ratio of explained variance to unexplained variance. Consult a statistics text for more information on how it is determined. The F has two different degrees of freedom, one for Between Groups (in this case, 2 is the number of levels of our independent variable [3 - 1]), and another for Within Groups (18 is the number of participants minus the number of levels of our independent variable [21 - 3]).
The next part of the output consists of the results of our Tukey's HSD post-hoc comparison. This table presents us with every possible combination of levels of our independent variable. The first row represents Instructor 1 compared to Instructor 2. Next is Instructor 1 compared to Instructor 3. Next is Instructor 2 compared to Instructor 1. (Note that this is redundant with the first row.) Next is Instructor 2 compared to Instructor 3, and so on. The column labeled Sig. represents the Type I error (p) rate for the simple (2-level) comparison in that row.

Multiple Comparisons (Dependent Variable: FINAL, Tukey HSD)

                                    Mean                                95% Confidence Interval
  (I) INSTRUCT   (J) INSTRUCT   Difference (I-J)   Std. Error   Sig.    Lower Bound   Upper Bound
  1.00           2.00               -6.8571           4.502     .304      -18.3482        4.6339
                 3.00              -12.8571*          4.502     .027      -24.3482       -1.3661
  2.00           1.00                6.8571           4.502     .304       -4.6339       18.3482
                 3.00               -6.0000           4.502     .396      -17.4911        5.4911
  3.00           1.00               12.8571*          4.502     .027        1.3661       24.3482
                 2.00                6.0000           4.502     .396       -5.4911       17.4911

  *. The mean difference is significant at the .05 level.

In our example above, Instructor 1 is significantly different from Instructor 3, but Instructor 1 is not significantly different from Instructor 2, and Instructor 2 is not significantly different from Instructor 3.

Drawing Conclusions
Drawing conclusions for ANOVA requires that we indicate the value of F, the degrees of freedom, and the significance level. A significant ANOVA should be followed by the results of a post-hoc analysis and a verbal statement of the results.

Phrasing Results That Are Significant
In our example above, we could state the following:

We computed a one-way ANOVA comparing the final exam scores of participants who took a course from one of three different instructors. A significant difference was found among the instructors (F(2,18) = 4.08, p < .05). Tukey's HSD was used to determine the nature of the differences between the instructors. This analysis revealed that students who had Instructor 1 scored lower (m = 79.57, sd = 7.96) than students who had Instructor 3 (m = 92.43, sd = 5.50). Students who had Instructor 2 (m = 86.43, sd = 10.92) were not significantly different from either of the other two groups.

Phrasing Results That Are Not Significant
If we had conducted the analysis using PRETEST as our dependent variable instead of FINAL, we would have received the following output: The ANOVA was not significant, so there is no need to refer to the Multiple Comparisons table. Given this result, we may state the following:
Descriptives (pretest)

            N      Mean    Std. Deviation   Std. Error   Minimum   Maximum
  1.00      7    67.5714       8.3837         3.1687       56.00     79.00
  2.00      7    63.1429      10.6052         4.0082       47.00     78.00
  3.00      7    59.2857       6.5502         2.4757       49.00     71.00
  Total    21    63.3333       8.9294         1.9485       47.00     79.00

ANOVA (pretest)

                    Sum of Squares    df    Mean Square      F       Sig.
  Between Groups         240.667       2       120.333     1.600     .229
  Within Groups         1354.000      18        75.222
  Total                 1594.667      20

The pretest means of students who took a course from three different instructors were compared using a one-way ANOVA. No significant difference was found (F(2,18) = 1.60, p > .05). The students from the three different classes did not differ significantly at the start of the term. Students who had Instructor 1 had a mean score of 67.57 (sd = 8.38). Students who had Instructor 2 had a mean score of 63.14 (sd = 10.61). Students who had Instructor 3 had a mean score of 59.29 (sd = 6.55).


Practice Exercise
Using Practice Data Set 1 in Appendix B, determine if the average math scores of single, married, and divorced participants are significantly different. Write a statement of results.

Section 6.6 Factorial ANOVA
Description
The factorial ANOVA is one in which there is more than one independent variable. A 2 x 2 ANOVA, for example, has two independent variables, each with two levels. A 3 x 2 x 2 ANOVA has three independent variables. One has three levels, and the other two have two levels. Factorial ANOVA is very powerful because it allows us to assess the effects of each independent variable, plus the effects of the interaction.

Assumptions
Factorial ANOVA requires all of the assumptions of one-way ANOVA (i.e., the dependent variable must be at the interval or ratio levels and normally distributed). In addition, the independent variables should be independent of each other.

SPSS Data Format
SPSS requires one variable for the dependent variable, and one variable for each independent variable. If we have any independent variable that is represented as multiple variables (e.g., PRETEST and POSTTEST), we must use the repeated-measures ANOVA.

Running the Command
This example uses the GRADES.sav data file from earlier in this chapter. Click Analyze, then General Linear Model, then Univariate.

This will bring up the main dialog box for Univariate ANOVA. Select the dependent variable and place it in the Dependent Variable blank (use FINAL for this example). Select one of your independent variables (INSTRUCT, in this case) and place it in the Fixed Factor(s) box. Place the second independent variable (REQUIRED) in the Fixed Factor(s) box.


Having defined the analysis, now click Options. When the Options dialog box comes up, move INSTRUCT, REQUIRED, and INSTRUCT x REQUIRED into the Display Means for blank. This will provide you with means for each main effect and interaction term. Click Continue. If you were to select Post-Hoc, SPSS would run post-hoc analyses for the main effects but not for the interaction term. Click OK to run the analysis.
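In syntax form, the same two-way analysis (with the cell means requested through the Options step) looks roughly like this, assuming the GRADES.sav variable names; the exact subcommands SPSS pastes may differ:

  UNIANOVA final BY instruct required
    /EMMEANS=TABLES(instruct)
    /EMMEANS=TABLES(required)
    /EMMEANS=TABLES(instruct*required).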

Reading the Output
At the bottom of the output, you will find the means for each main effect and interaction you selected with the Options command. There were three instructors, so there is a mean FINAL for each instructor.

1. INSTRUCT (Dependent Variable: FINAL)

                                  95% Confidence Interval
  INSTRUCT    Mean   Std. Error   Lower Bound   Upper Bound
  1.00       79.583     3.445        72.240        86.926
  2.00       86.208     3.445        78.865        93.551
  3.00       92.083     3.445        84.740        99.426

We also have means for the two values of REQUIRED.

2. REQUIRED (Dependent Variable: FINAL)

                                  95% Confidence Interval
  REQUIRED    Mean   Std. Error   Lower Bound   Upper Bound
  .00        84.667     3.007        78.257        91.076
  1.00       87.250     2.604        81.699        92.801

Finally, we have six means representing the interaction of the two variables (this was a 3 x 2 design). Participants who had Instructor 1 (for whom the class was not required) had a mean final exam score of 79.67. Students who had Instructor 1 (for whom it was required) had a mean final exam score of 79.50, and so on.

3. INSTRUCT * REQUIRED (Dependent Variable: FINAL)

                                              95% Confidence Interval
  INSTRUCT   REQUIRED    Mean   Std. Error   Lower Bound   Upper Bound
  1.00       .00        79.667     5.208        68.565        90.768
             1.00       79.500     4.511        69.886        89.114
  2.00       .00        84.667     5.208        73.565        95.768
             1.00       87.750     4.511        78.136        97.364
  3.00       .00        89.667     5.208        78.565       100.768
             1.00       94.500     4.511        84.886       104.114

The example we just ran is called a two-way ANOVA. This is because we had two independent variables. With a two-way ANOVA, we get three answers: a main effect for INSTRUCT, a main effect for REQUIRED, and an interaction result for INSTRUCT * REQUIRED (see top of next page).

Tests of Between-Subjects Effects (Dependent Variable: FINAL)

                          Type III Sum
  Source                   of Squares     df    Mean Square        F       Sig.
  Corrected Model            635.821(a)    5        127.164      1.563     .230
  Intercept               151998.893       1     151998.893   1867.691     .000
  INSTRUCT                   536.357       2        268.179      3.295     .065
  REQUIRED                    34.321       1         34.321       .422     .526
  INSTRUCT * REQUIRED         22.071       2         11.036       .136     .874
  Error                     1220.750      15         81.383
  Total                   157689.000      21
  Corrected Total           1856.571      20

  a. R Squared = .342 (Adjusted R Squared = .123)

The source table above gives us these three answers (in the INSTRUCT, REQUIRED, and INSTRUCT * REQUIRED rows). In the example, none of the main effects or interactions was significant. In the statements of results, you must indicate F, two degrees of freedom (effect and residual/error), the significance level, and a verbal statement for each of the answers (three, in this case). Note that most statistics books give a much simpler version of an ANOVA source table where the Corrected Model, Intercept, and Corrected Total rows are not included.

Phrasing Results That Are Significant
If we had obtained significant results in this example, we could state the following (these are fictitious results; for the results that correspond to the example above, please see the section on phrasing results that are not significant):

A 3 (instructor) x 2 (required course) between-subjects factorial ANOVA was calculated comparing the final exam scores for participants who had one of three instructors and who took the course either as a required course or as an elective. A significant main effect for instructor was found (F(2,15) = 10.112, p < .05). Students who had Instructor 1 had lower final exam scores (m = 79.57, sd = 7.96) than students who had Instructor 3 (m = 92.43, sd = 5.50). Students who had Instructor 2 (m = 86.43, sd = 10.92) were not significantly different from either of the other two groups. A significant main effect for whether or not the course was required was found (F(1,15) = 38.44, p < .01). Students who took the course because it was required did better (m = 91.69, sd = 7.68) than students who took the course as an elective (m = 77.13, sd = 5.72). The interaction was not significant (F(2,15) = 1.15, p > .05). The effect of the instructor was not influenced by whether or not the students took the course because it was required.

Note that in the above example, we would have had to conduct Tukey's HSD to determine the differences for INSTRUCT (using the Post-Hoc command). This is not necessary


for REQUIRED because it has only two levels (and one must be different from the other).

Phrasing Results That Are Not Significant
Our actual results were not significant, so we can state the following:

A 3 (instructor) x 2 (required course) between-subjects factorial ANOVA was calculated comparing the final exam scores for participants who had one of three instructors and who took the course either as a required course or as an elective. The main effect for instructor was not significant (F(2,15) = 3.30, p > .05). The main effect for whether or not it was a required course was also not significant (F(1,15) = .42, p > .05). Finally, the interaction was also not significant (F(2,15) = .136, p > .05). Thus, it appears that neither the instructor nor whether or not the course is required has any significant effect on final exam scores.

Practice Exercise
Using Practice Data Set 2 in Appendix B, determine if salaries are influenced by sex, job classification, or an interaction between sex and job classification. Write a statement of results.

Section 6.7 Repeated-Measures ANOVA
Description
Repeated-measures ANOVA extends the basic ANOVA procedure to a within-subjects independent variable (when participants provide data for more than one level of an independent variable). It functions like a paired-samples t test when more than two levels are being compared.

Assumptions
The dependent variable should be normally distributed and measured on an interval or ratio scale. Multiple measurements of the dependent variable should be from the same (or related) participants.

SPSS Data Format
At least three variables are required. Each variable in the SPSS data file should represent a single dependent variable at a single level of the independent variable. Thus, an analysis of a design with four levels of an independent variable would require four variables in the SPSS data file. If any variable represents a between-subjects effect, use the Mixed-Design ANOVA command instead.


Running the Command
This example uses the GRADES.sav sample data set. Recall that GRADES.sav includes three sets of grades (PRETEST, MIDTERM, and FINAL) that represent three different times during the semester. This allows us to analyze the effects of time on the test performance of our sample population (hence the within-groups comparison). Click Analyze, then General Linear Model, then Repeated Measures.
Note that this procedure requires an optional module. If you do not have this command, you do not have the proper module installed. This procedure is NOT included in the student version of SPSS.
After selecting the command, you will be presented with the Repeated Measures Define Factor(s) dialog box. This is where you identify the within-subject factor (we will call it TIME). Enter 3 for the number of levels (three exams) and click Add. Now click Define. If we had more than one independent variable that had repeated measures, we could enter its name and click Add.
You will be presented with the Repeated Measures dialog box. Transfer PRETEST, MIDTERM, and FINAL to the Within-Subjects Variables section. The variable names should be ordered according to when they occurred in time (i.e., the values of the independent variable that they represent).

--J

h{lruc{

l'or I -Ff? l I -9v1"

-ssl

U"a* | cro,r"*.I

nq... I pagoq.. S""a.. I geb*.. I I

Click Options, and SPSS will compute the means for the TIME effect (see one-way ANOVA for more details about how to do this). Click OK to run the command.
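A minimal syntax sketch of this repeated-measures analysis, assuming the three GRADES.sav exam variables and the within-subject factor name TIME chosen above (the subcommands SPSS pastes may differ):

  GLM pretest midterm final
    /WSFACTOR=time 3
    /WSDESIGN=time.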


Reading the Output
This procedure uses the GLM command. GLM stands for "General Linear Model." It is a very powerful command, and many sections of output are beyond the scope of this text (the output outline includes sections such as Multivariate Tests, Mauchly's Test of Sphericity, and Tests of Within-Subjects Contrasts). But for the basic repeated-measures ANOVA, we are interested only in the Tests of Within-Subjects Effects. Note that the SPSS output will include many other sections of output, which you can ignore at this point.
Tests of Within-Subjects Effects (Measure: MEASURE_1)

                                      Type III Sum                 Mean
  Source                               of Squares       df        Square          F       Sig.
  time         Sphericity Assumed       5673.746        2        2836.873     121.895     .000
               Greenhouse-Geisser       5673.746        1.211    4685.594     121.895     .000
               Huynh-Feldt              5673.746        1.247    4550.168     121.895     .000
               Lower-bound              5673.746        1.000    5673.746     121.895     .000
  Error(time)  Sphericity Assumed        930.921       40          23.273
               Greenhouse-Geisser        930.921       24.218      38.439
               Huynh-Feldt               930.921       24.939      37.328
               Lower-bound               930.921       20.000      46.546
The Tests of Within-Subjects Effects output should look very similar to the output from the other ANOVA commands. In the above example, the effect of TIME has an F value of 121.90 with 2 and 40 degrees of freedom (we use the line for Sphericity Assumed). It is significant at less than the .001 level. When describing these results, we should indicate the type of test, the F value, degrees of freedom, and significance level.

Phrasing Results That Are Significant
Because the ANOVA results were significant, we need to do some sort of post-hoc analysis. One of the main limitations of SPSS is the difficulty in performing post-hoc analyses for within-subjects factors. With SPSS, the easiest solution to this problem is to conduct protected dependent t tests with repeated-measures ANOVA. There are more powerful (and more appropriate) post-hoc analyses, but SPSS will not compute them for us. For more information, consult your instructor or a more advanced statistics text.
To conduct the protected t tests, we will compare PRETEST to MIDTERM, MIDTERM to FINAL, and PRETEST to FINAL, using paired-samples t tests. Because we are conducting three tests and, therefore, inflating our Type I error rate, we will use a significance level of .017 (.05/3) instead of .05.
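The three protected t tests can be requested in one syntax command; listing the three exam variables together pairs each variable with each of the others (a sketch assuming the GRADES.sav names):

  T-TEST PAIRS=pretest midterm final.

Remember to evaluate each of the three comparisons against the adjusted .017 significance level described above.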


Paired Samples Test

                                      Paired Differences
                                                           95% Confidence Interval
                                           Std. Error         of the Difference
                         Mean   Std. Dev.     Mean           Lower        Upper          t        df   Sig. (2-tailed)
  Pair 1  PRETEST -
          MIDTERM     -15.2857    3.9641      .8650        -17.0902     -13.4813      -17.670     20        .000
  Pair 2  PRETEST -
          FINAL       -22.8095    8.9756     1.9586        -26.8952     -18.7239      -11.646     20        .000
  Pair 3  MIDTERM -
          FINAL        -7.5238    6.5850     1.4370        -10.5213      -4.5264       -5.236     20        .000
The three comparisons each had a significance level of less than .017, so we can conclude that the scores improved from pretest to midterm and again from midterm to final. To generate the descriptive statistics, we have to run the Descriptives command for each variable.
Because the results of our example above were significant, we could state the following:

A one-way repeated-measures ANOVA was calculated comparing the exam scores of participants at three different times: pretest, midterm, and final. A significant effect was found (F(2,40) = 121.90, p < .001). Follow-up protected t tests revealed that scores increased significantly from pretest (m = 63.33, sd = 8.93) to midterm (m = 78.62, sd = 9.66), and again from midterm to final (m = 86.14, sd = 9.63).

Phrasing Results That Are Not Significant
With results that are not significant, we could state the following (the F values here have been made up for purposes of illustration):

A one-way repeated-measures ANOVA was calculated comparing the exam scores of participants at three different times: pretest, midterm, and final. No significant effect was found (F(2,40) = 1.90, p > .05). No significant difference exists among pretest (m = 63.33, sd = 8.93), midterm (m = 78.62, sd = 9.66), and final (m = 86.14, sd = 9.63) means.

Practice Exercise
Use Practice Data Set 3 in Appendix B. Determine if the anxiety level of participants changed over time (regardless of which treatment they received) using a one-way repeated-measures ANOVA and protected dependent t tests. Write a statement of results.

Section 6.8 Mixed-Design ANOVA
Description
The mixed-design ANOVA (sometimes called a split-plot design) tests the effects of more than one independent variable. At least one of the independent variables must be within-subjects (repeated measures). At least one of the independent variables must be between-subjects.


Assumptions
The dependent variable should be normally distributed and measured on an interval or ratio scale.

SPSS Data Format
The dependent variable should be represented as one variable for each level of the within-subjects independent variables. Another variable should be present in the data file for each between-subjects variable. Thus, a 2 x 2 mixed-design ANOVA would require three variables, two representing the dependent variable (one at each level), and one representing the between-subjects independent variable.

Running the Command
The General Linear Model command runs the Mixed-Design ANOVA command. Click Analyze, then General Linear Model, then Repeated Measures. Note that this procedure requires an optional module. If you do not have this command, you do not have the proper module installed. This procedure is NOT included in the student version of SPSS.
The Repeated Measures command should be used if any of the independent variables are repeated measures (within-subjects).
This example also uses the GRADES.sav data file. Enter PRETEST, MIDTERM, and FINAL in the Within-Subjects Variables block. (See the Repeated-Measures ANOVA command in Section 6.7 for an explanation.) This example is a 3 x 3 mixed-design. There are two independent variables (TIME and INSTRUCT), each with three levels. We previously entered the information for TIME in the Repeated Measures Define Factors dialog box. We need to transfer INSTRUCT into the Between-Subjects Factor(s) block. Click Options and select means for all of the main effects and the interaction (see one-way ANOVA in Section 6.5 for more details about how to do this). Click OK to run the command.
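A syntax sketch of the 3 x 3 mixed design described above (within-subjects factor TIME with three levels, between-subjects factor INSTRUCT), again assuming the GRADES.sav variable names:

  GLM pretest midterm final BY instruct
    /WSFACTOR=time 3
    /WSDESIGN=time
    /DESIGN=instruct.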


Reading the Output
As with the standard repeated-measures command, the GLM procedure provides a lot of output we will not use. For a mixed-design ANOVA, we are interested in two sections. The first is Tests of Within-Subjects Effects.
Tests of Within-Subjects Effects (Measure: MEASURE_1)

                                          Type III Sum                  Mean
  Source                                   of Squares        df        Square         F       Sig.
  time             Sphericity Assumed       5673.746         2        2836.873    817.954     .000
                   Greenhouse-Geisser       5673.746         1.181    4802.586    817.954     .000
                   Huynh-Feldt              5673.746         1.356    4183.583    817.954     .000
                   Lower-bound              5673.746         1.000    5673.746    817.954     .000
  time * instruct  Sphericity Assumed        806.063         4         201.516     58.103     .000
                   Greenhouse-Geisser        806.063         2.363     341.149     58.103     .000
                   Huynh-Feldt               806.063         2.712     297.179     58.103     .000
                   Lower-bound               806.063         2.000     403.032     58.103     .000
  Error(time)      Sphericity Assumed        124.857        36           3.468
                   Greenhouse-Geisser        124.857        21.265       5.871
                   Huynh-Feldt               124.857        24.411       5.115
                   Lower-bound               124.857        18.000       6.937
This section gives two of the three answers we need (the main effect for TIME and the interaction result for TIME x INSTRUCTOR). The second section of output is Tests of Between-Subjects Effects (sample output is below). Here, we get the answers that do not contain any within-subjects effects. For our example, we get the main effect for INSTRUCT. Both of these sections must be combined to produce the full answer for our analysis.
Tests of Between-Subjects Effects (Measure: MEASURE_1, Transformed Variable: Average)

               Type III Sum                 Mean
  Source        of Squares      df         Square          F        Sig.
  Intercept     364192.063       1       364192.063    1500.595     .000
  instruct          18.698       2            9.349        .039     .962
  Error           4368.571      18          242.698
If we obtain significant effects, we must perform some sort of post-hoc analysis. Again, this is one of the limitations of SPSS. No easy way to perform the appropriate post-hoc test for repeated-measures (within-subjects) factors is available. Ask your instructor for assistance with this.
When describing the results, you should include F, the degrees of freedom, and the significance level for each main effect and interaction. In addition, some descriptive statistics must be included (either give means or include a figure).

Phrasing Results That Are Significant
There are three answers (at least) for all mixed-design ANOVAs. Please see Section 6.6 on factorial ANOVA for more details about how to interpret and phrase the results.


For the above example, we could state the following in the results section (note that this assumes that appropriate post-hoc tests have been conducted):

A 3 x 3 mixed-design ANOVA was calculated to examine the effects of the instructor (Instructors 1, 2, and 3) and time (pretest, midterm, and final) on scores. A significant time x instructor interaction was present (F(4,36) = 58.10, p < .001). In addition, the main effect for time was significant (F(2,36) = 817.95, p < .001). The main effect for instructor was not significant (F(2,18) = .039, p > .05). Upon examination of the data, it appears that Instructor 3 showed the most improvement in scores over time.

With significant interactions, it is often helpful to provide a graph with the descriptive statistics. By selecting the Plots option in the main dialog box, you can make graphs of the interaction like the one below. Interactions add considerable complexity to the interpretation of statistical results. Consult a research methods text or ask your instructor for more help with interactions.
(Interaction plot of the cell means for the time x instructor interaction, produced through the Plots option.)
Phrasing Results That Are Not Significant
If our results had not been significant, we could state the following (note that the F values are fictitious):

A 3 x 3 mixed-design ANOVA was calculated to examine the effects of the instructor (Instructors 1, 2, and 3) and time (pretest, midterm, and final) on scores. No significant main effects or interactions were found. The time x instructor interaction (F(4,36) = 1.10, p > .05), the main effect for time (F(2,36) = 1.95, p > .05), and the main effect for instructor (F(2,18) = .039, p > .05) were all not significant. Exam scores were not influenced by either time or instructor.

Practice Exercise
Use Practice Data Set 3 in Appendix B. Determine if anxiety levels changed over time for each of the treatment (CONDITION) types. How did time change anxiety levels for each treatment? Write a statement of results.


Section 6.9 Analysis of Covariance
Description

Analysis of covariance (ANCOVA) allows you to remove the effect of a known covariate. In this way, it becomes a statistical method of control. With methodological controls (e.g., random assignment), internal validity is gained. When such methodological controls are not possible, statistical controls can be used.
ANCOVA can be performed by using the GLM command if you have repeated-measures factors. Because the GLM command is not included in the Base Statistics module, it is not included here.

Assumptions
ANCOVA requires that the covariate be significantly correlated with the dependent variable. The dependent variable and the covariate should be at the interval or ratio levels. In addition, both should be normally distributed.

SPSS Data Format
The SPSS data file must contain one variable for each independent variable, one variable representing the dependent variable, and at least one covariate.

Running the Command
The Factorial ANOVA command is used to run ANCOVA. To run it, click Analyze, then General Linear Model, then Univariate. Follow the directions discussed for factorial ANOVA, using the HEIGHT.sav sample data file. Place the variable HEIGHT as your Dependent Variable. Enter SEX as your Fixed Factor, then WEIGHT as the Covariate. This last step determines the difference between regular factorial ANOVA and ANCOVA. Click OK to run the ANCOVA.

Reading the Output
The output consists of one main source table (shown below). This table gives you the main effects and interactions you would have received with a normal factorial ANOVA. In addition, there is a row for each covariate. In our example, we have one main effect (SEX) and one covariate (WEIGHT). Normally, we examine the covariate line only to confirm that the covariate is significantly related to the dependent variable.

Drawing Conclusions
This sample analysis was performed to determine if males and females differ in height after weight is accounted for. We know that weight is related to height. Rather than match participants or use methodological controls, we can statistically remove the effect of weight.
When giving the results of ANCOVA, we must give F, degrees of freedom, and significance levels for all main effects, interactions, and covariates. If main effects or interactions are significant, post-hoc tests must be conducted. Descriptive statistics (mean and standard deviation) for each level of the independent variable should also be given.
Tests of Between-Subjects Effects (Dependent Variable: HEIGHT)

                    Type III Sum                Mean
  Source             of Squares      df        Square          F        Sig.
  Corrected Model       215.027(a)    2        107.513      100.476     .000
  Intercept               5.580       1          5.580        5.215     .040
  WEIGHT                119.964       1        119.964      112.112     .000
  SEX                    66.367       1         66.367       62.023     .000
  Error                  13.911      13          1.070
  Total               71919.000      16
  Corrected Total       228.938      15

  a. R Squared = .939 (Adjusted R Squared = .930)
Phrasing Results That Are Significant
The above example obtained a significant result, so we could state the following:

A one-way between-subjects ANCOVA was calculated to examine the effect of sex on height, covarying out the effect of weight. Weight was significantly related to height (F(1,13) = 112.11, p < .001). The main effect for sex was significant (F(1,13) = 62.02, p < .001), with males significantly taller (m = 69.38, sd = 3.70) than females (m = 64.50, sd = 2.33).

Phrasing Results That Are Not Significant
If the covariate is not significant, we need to repeat the analysis without including the covariate (i.e., run a normal ANOVA). For ANCOVA results that are not significant, you could state the following (note that the F values are made up for this example):


A one-way between-subjects ANCOVA was calculated to examine the effect of sex on height, covarying out the effect of weight. Weight was significantly related to height (F(1,13) = 112.11, p < .001). The main effect for sex was not significant (F(1,13) = 2.02, p > .05), with males not being significantly taller (m = 69.38, sd = 3.70) than females (m = 64.50, sd = 2.33), even after covarying out the effect of weight.

Practice Exercise
Using Practice Data Set 2 in Appendix B, determine if salaries are different for males and females. Repeat the analysis, statistically controlling for years of service. Write a statement of results for each. Compare and contrast your two answers.

Section 6.10 Multivariate Analysis of Variance (MANOVA)
Description

Multivariate tests are those that involve more than one dependent variable. While it is possible to conduct several univariate tests (one for each dependent variable), this causes Type I error inflation. Multivariate tests look at all dependent variables at once, in much the same way that ANOVA looks at all levels of an independent variable at once.

Assumptions
MANOVA assumes that you have multiple dependent variables that are related to each other. Each dependent variable should be normally distributed and measured on an interval or ratio scale.

SPSS Data Format
The SPSS data file should have a variable for each dependent variable. One additional variable is required for each between-subjects independent variable. It is also possible to do a MANCOVA, a repeated-measures MANOVA, and a repeated-measures MANCOVA as well. These extensions require additional variables in the data file.

Running the Command
Note that this procedure requires an optional module. If you do not have this command, you do not have the proper module installed. This procedure is NOT included in the student version of SPSS.
The following data represent SAT and GRE scores for 18 participants. Six participants received no special training, six received short-term training before taking the tests, and six received long-term training. GROUP is coded 0 = no training, 1 = short-term, 2 = long-term. Enter the data and save them as SAT.sav.


  SAT    GRE    GROUP
  580    600      0
  520    520      0
  500    510      0
  410    400      0
  650    630      0
  480    480      0
  500    490      1
  640    650      1
  500    480      1
  500    510      1
  580    570      1
  490    500      1
  520    520      2
  620    630      2
  550    560      2
  500    510      2
  540    560      2
  600    600      2

Locate the Multivariate command by clicking Analyze, then General Linear Model, then Multivariate.

This will bring up the main dialog box. Enter the dependent variables (GRE and SAT, in this case) in the Dependent Variables blank. Enter the independent variables (GROUP, in this case) in the Fixed Factor(s) blank. Click OK to run the command.
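A syntax sketch of this MANOVA (both dependent variables analyzed together, assuming the SAT.sav variable names):

  GLM sat gre BY group.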

Reading the Output
We are interested in two primary sections of output. The first one gives the results of the multivariate tests. The section labeled GROUP is the one we want. This tells us whether GROUP had an effect on any of our dependent variables. Four different types of multivariate test results are given. The most widely used is Wilks' Lambda. Thus, the answer for the MANOVA is a Lambda of .828, with 4 and 28 degrees of freedom. That value is not significant.


Multivariate Tests(c)

  Effect                            Value        F         Hypothesis df   Error df    Sig.
  Intercept  Pillai's Trace          .988    569.187(a)        2.000        14.000     .000
             Wilks' Lambda           .012    569.187(a)        2.000        14.000     .000
             Hotelling's Trace     81.312    569.187(a)        2.000        14.000     .000
             Roy's Largest Root    81.312    569.187(a)        2.000        14.000     .000
  group      Pillai's Trace          .174       .713           4.000        30.000     .590
             Wilks' Lambda           .828       .693(a)        4.000        28.000     .603
             Hotelling's Trace       .206       .669           4.000        26.000     .619
             Roy's Largest Root      .196      1.469(b)        2.000        15.000     .261

  a. Exact statistic
  b. The statistic is an upper bound on F that yields a lower bound on the significance level.
  c. Design: Intercept+group
The second section of output we want gives the results of the univariate tests (ANOVAs) for each dependent variable.

Tests of Between-Subjects Effects

                       Dependent    Type III Sum                  Mean
  Source               Variable      of Squares      df          Square           F        Sig.
  Corrected Model      sat             3077.778(a)    2          1538.889         .360     .703
                       gre             5200.000(b)    2          2600.000         .587     .568
  Intercept            sat          5205688.889       1       5205688.889     1219.448     .000
                       gre          5248800.000       1       5248800.000     1185.723     .000
  group                sat             3077.778       2          1538.889         .360     .703
                       gre             5200.000       2          2600.000         .587     .568
  Error                sat            64033.333      15          4268.889
                       gre            66400.000      15          4426.667
  Total                sat          5272800.000      18
                       gre          5320400.000      18
  Corrected Total      sat            67111.111      17
                       gre            71600.000      17

  a. R Squared = .046 (Adjusted R Squared = -.081)
  b. R Squared = .073 (Adjusted R Squared = -.051)
Drawing Conclusions
We interpret the results of the univariate tests only if the group Wilks' Lambda is significant. Our results are not significant, but we will first consider how to interpret results that are significant.


Phrasing Results That Are Significant

If we had received the following output instead, we would have had a significant MANOVA, and we could state the following:
Multivariate Tests (group effect): Pillai's Trace = .579 (Sig. = .032); Wilks' Lambda = .423 with 4 and 28 degrees of freedom (Sig. = .014); Hotelling's Trace (Sig. = .008); Roy's Largest Root = 1.350 (Sig. = .002).

Tests of Between-Subjects Effects

                       Dependent    Type III Sum                   Mean
  Source               Variable      of Squares      df           Square           F        Sig.
  Corrected Model      sat            62077.778       2          31038.889       7.250     .006
                       gre            86344.444       2          43172.222       9.465     .002
  Intercept            sat          5859605.556       1        5859605.556    1368.711     .000
                       gre          5997338.889       1        5997338.889    1314.885     .000
  group                sat            62077.778       2          31038.889       7.250     .006
                       gre            86344.444       2          43172.222       9.465     .002
  Error                sat            64216.667      15           4281.111
                       gre            68416.667      15           4561.111
  Total                sat          5985900.000      18
                       gre          6152100.000      18
  Corrected Total      sat           126294.444      17
                       gre           154761.111      17
A one-way MANOVA was calculated examining the effect of training (none, short-term, or long-term) on SAT and GRE scores. A significant effect was found (Lambda(4,28) = .423, p = .014). Follow-up univariate ANOVAs indicated that SAT scores were significantly improved by training (F(2,15) = 7.250, p = .006). GRE scores were also significantly improved by training (F(2,15) = 9.465, p = .002).

Phrasing Results That Are Not Significant
The actual example presented was not significant. Therefore, we could state the following in the results section:

A one-way MANOVA was calculated examining the effect of training (none, short-term, or long-term) on SAT and GRE scores. No significant effect was found (Lambda(4,28) = .828, p > .05). Neither SAT nor GRE scores were significantly influenced by training.

Chapter 7
Nonparametric Inferential Statistics
Nonparametric tests are used when the corresponding parametric procedure is inappropriate. Normally, this is because the dependent variable is not interval- or ratio-scaled. It can also be because the dependent variable is not normally distributed. If the data of interest are frequency counts, nonparametric statistics may also be appropriate.

Section 7.1 Chi-Square Goodness of Fit
Description
The chi-square goodness of fit test determines whether or not sample proportions match the theoretical values. For example, it could be used to determine if a die is "loaded" or fair. It could also be used to compare the proportion of children born with birth defects to the population value (e.g., to determine if a certain neighborhood has a statistically higher-than-normal rate of birth defects).

Assumptions
We need to make very few assumptions. There are no assumptions about the shape of the distribution. The expected frequencies for each category should be at least 1, and no more than 20% of the categories should have expected frequencies of less than 5.

SPSS Data Format
SPSS requires only a single variable.

Running the Command
We will create the following data set and call it COINS.sav. The following data represent the flipping of each of two coins 20 times (H is coded as heads, T as tails).

COIN1: H T H H T H H T H H H T T T H T H T T H
COIN2: T H H T H T H T T H H T H H T H T H H T

Name the two variables COIN1 and COIN2, and code H as 1 and T as 2. The data file that you create will have 20 rows of data and two columns, called COIN1 and COIN2.


To run the Chi-Square command, click Analyze, then Nonparametric Tests, then Chi-Square. This will bring up the main dialog box for the Chi-Square Test. Transfer the variable COIN1 into the Test Variable List. A "fair" coin has an equal chance of coming up heads or tails. Therefore, we will leave the Expected Values set to All categories equal. We could test a specific set of proportions by entering the relative frequencies in the Expected Values area. Click OK to run the analysis.
Reading the Output The output consistsof two sections. The first sectiongives the frequencies (observed1f) of each value of the variable. The expectedvalue is given, along with the differenceof the observed from the expectedvalue (called the residual).In our example, with 20 flips of a coin, we shouldget l0 of eachvalue.
COIN1

        Observed N   Expected N   Residual
Head    11           10.0          1.0
Tail     9           10.0         -1.0
Total   20

The second section of the output gives the results of the chi-square test.

Test Statistics (COIN1)

Chi-Square     .200(a)
df             1
Asymp. Sig.    .655

a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 10.0.
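If you want to double-check this result outside of SPSS, the same goodness-of-fit test can be computed with Python's scipy.stats. This is only a cross-check sketch using the observed counts above; it is not part of the SPSS procedure described in this book.

```python
from scipy import stats

# Observed counts from COINS.sav: 11 heads and 9 tails in 20 flips.
observed = [11, 9]
# Expected counts for a fair coin: 10 of each.
expected = [10, 10]

chi2, p = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {chi2:.3f}, p = {p:.3f}")  # chi-square = 0.200, p = 0.655
```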


Drawing Conclusions
A significant chi-square test indicates that the data vary from the expected values. A test that is not significant indicates that the data are consistent with the expected values.

Phrasing Results That Are Significant
In describing the results, you should state the value of chi-square (whose symbol is χ²), the degrees of freedom, the significance level, and a description of the results. For example, with a significant chi-square (for a sample different from the example above, such as if we had used a "loaded" die), we could state the following:

A chi-square goodness of fit test was calculated comparing the frequency of occurrence of each value of a die. It was hypothesized that each value would occur an equal number of times. Significant deviation from the hypothesized values was found (χ²(5) = 25.48, p < .05). The die appears to be "loaded."

Note that this example uses hypothetical values.

Phrasing Results That Are Not Significant
If the analysis produces no significant difference, as in the previous example, we could state the following:

A chi-square goodness of fit test was calculated comparing the frequency of occurrence of heads and tails on a coin. It was hypothesized that each value would occur an equal number of times. No significant deviation from the hypothesized values was found (χ²(1) = .20, p > .05). The coin appears to be fair.

Practice Exercise
Use Practice Data Set 2 in Appendix B. In the entire population from which the sample was drawn, 20% of employees are clerical, 50% are technical, and 30% are professional. Determine whether or not the sample drawn conforms to these values. HINT: Enter the relative proportions of the three samples in order (20, 50, 30) in the "Expected Values" area.

Section 7.2 Chi-Square Test of Independence

Description
The chi-square test of independence tests whether or not two variables are independent of each other. For example, flips of a coin should be independent events, so knowing the outcome of one coin toss should not tell us anything about the second coin toss. The chi-square test of independence is essentially a nonparametric version of the interaction term in ANOVA.


Assumptions
Very few assumptions are needed. For example, we make no assumptions about the shape of the distribution. The expected frequencies for each category should be at least 1, and no more than 20% of the categories should have expected frequencies of less than 5.

SPSS Data Format
At least two variables are required.

Running the Command
The chi-square test of independence is a component of the Crosstabs command. For more details, see the section in Chapter 3 on frequency distributions for more than one variable. This example uses the COINS.sav example. COIN1 is placed in the Row(s) blank, and COIN2 is placed in the Column(s) blank.
Click Statistics, then check the Chi-square box. Click Continue. You may also want to click Cells to select expected frequencies in addition to observed frequencies, as we will do below. Click OK to run the analysis.

Reading the Output
The output consists of two parts. The first part gives you the counts. In this example, the actual and expected frequencies are shown because they were selected with the Cells option.
COIN1 * COIN2 Crosstabulation

                                COIN2
                                Head    Tail    Total
COIN1   Head   Count            7       4       11
               Expected Count   6.1     5.0     11.0
        Tail   Count            4       5        9
               Expected Count   5.0     4.1      9.0
Total          Count            11      9       20
               Expected Count   11.0    9.0     20.0
Note that you can also use the Cells option to display the percentage of each variable that is each value. This is especially useful when your groups are different sizes.

The second part of the output gives the results of the chi-square test. The most commonly used value is the Pearson chi-square, shown in the first row (value of .737).

Chi-Square Tests

                               Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                              (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             .737(b)   1    .391
Continuity Correction(a)       .165      1    .684
Likelihood Ratio               .740      1    .390
Fisher's Exact Test                                         .653         .342
Linear-by-Linear Association   .700      1    .403
N of Valid Cases               20

a. Computed only for a 2x2 table.
b. 3 cells (75.0%) have expected count less than 5. The minimum expected count is 4.05.
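As a cross-check outside SPSS, scipy.stats.chi2_contingency computes the same Pearson chi-square from the observed counts in the crosstabulation above. This sketch is not part of the SPSS procedure; correction=False turns off the Yates continuity correction so the value matches the Pearson row.

```python
from scipy.stats import chi2_contingency

# Observed counts (rows: COIN1 head/tail, columns: COIN2 head/tail).
observed = [[7, 4],
            [4, 5]]

chi2, p, dof, expected = chi2_contingency(observed, correction=False)
print(f"chi-square = {chi2:.3f}, df = {dof}, p = {p:.3f}")  # about .737, 1, .391
print(expected)  # expected counts under independence
```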

Drawing Conclusions
A significant chi-square test result indicates that the two variables are not independent. A value that is not significant indicates that the variables do not vary significantly from independence.


Phrasing Results That Are Significant
In describing the results, you should give the value of chi-square, the degrees of freedom, the significance level, and a description of the results. For example, with a significant chi-square (for a data set different from the one discussed above), we could state the following:

A chi-square test of independence was calculated comparing the frequency of heart disease in men and women. A significant interaction was found (χ²(1) = 23.80, p < .05). Men were more likely to get heart disease (68%) than were women (40%).

Note that this summary statement assumes that a test was run in which participants' sex, as well as whether or not they had heart disease, was coded.

Phrasing Results That Are Not Significant
A chi-square test that is not significant indicates that there is no significant dependence of one variable on the other. The coin example above was not significant. Therefore, we could state the following:
A chi-square test of independence was calculated comparing the result of flipping two coins. No significant relationship was found (χ²(1) = .737, p > .05). Flips of a coin appear to be independent events.

Practice Exercise
A researcher wants to know whether or not individuals are more likely to help in an emergency when they are indoors than when they are outdoors. Of 28 participants who were indoors, 19 helped and 9 did not. Of 23 participants who were outdoors, 8 helped and 15 did not. Enter these data, and find out if helping behavior is affected by the environment. The key to this problem is in the data entry. (Hint: How many participants were there, and what do you know about each participant?)

Section 7.3 Mann-Whitney U Test

Description
The Mann-Whitney U test is the nonparametric equivalent of the independent t test. It tests whether or not two independent samples are from the same distribution. The Mann-Whitney U test is weaker than the independent t test, and the t test should be used if you can meet its assumptions.

Assumptions
The Mann-Whitney U test uses the rankings of the data. Therefore, the data for the two samples must be at least ordinal. There are no assumptions about the shape of the distribution.


SPSS Data Format
This command requires a single variable representing the dependent variable and a second variable indicating group membership.
Running the Command
This example will use a new data file. It represents 12 participants in a series of races. There were long races, medium races, and short races. Participants either had a lot of experience (2), some experience (1), or no experience (0). Enter the data from the figure in a new file, and save the data file as RACE.sav. The values for LONG, MEDIUM, and SHORT represent the results of the race, with 1 being first place and 12 being last.

To run the command, click Analyze, then Nonparametric Tests, then 2 Independent Samples. This will bring up the main dialog box. Enter the dependent variable (LONG, for this example) in the Test Variable List blank. Enter the independent variable (EXPERIENCE) as the Grouping Variable. Make sure that Mann-Whitney U is checked. Click Define Groups to select which two groups you will compare. For this example, we will compare those runners with no experience (0) to those runners with a lot of experience (2). Click OK to run the analysis.


Reading the Output
The output consists of two sections. The first section gives descriptive statistics for the two samples. Because the data are only required to be ordinal, summaries relating to their ranks are used. Those participants who had no experience averaged 6.5 as their place in the race. Those participants with a lot of experience averaged 2.5 as their place in the race.

The second section of the output is the result of the Mann-Whitney U test itself. The value obtained was 0.0, with a significance level of .021.
Ranks (long)

experience   N   Mean Rank   Sum of Ranks
.00          4   6.50        26.00
2.00         4   2.50        10.00
Total        8

Test Statistics (long)

Mann-Whitney U                   .000
Wilcoxon W                       10.000
Z                                -2.309
Asymp. Sig. (2-tailed)           .021
Exact Sig. [2*(1-tailed Sig.)]   .029(a)

a. Not corrected for ties.
b. Grouping Variable: experience
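A Mann-Whitney U test can also be run with scipy.stats.mannwhitneyu as a cross-check. The placements below are made up for illustration (the actual RACE.sav values appear only in the data-entry figure), so treat this strictly as a sketch.

```python
from scipy.stats import mannwhitneyu

# Hypothetical long-race placements (1 = first place); not the actual RACE.sav data.
no_experience  = [9, 10, 11, 12]
lot_experience = [1, 2, 3, 4]

u, p = mannwhitneyu(no_experience, lot_experience, alternative="two-sided")
# scipy reports U for the first sample; SPSS reports the smaller of U and n1*n2 - U.
print(f"U = {u}, p = {p:.3f}")
```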

Drawing Conclusions
A significant Mann-Whitney U result indicates that the two samples are different in terms of their average ranks.

Phrasing Results That Are Significant
Our example above is significant, so we could state the following:

A Mann-Whitney U test was calculated examining the place that runners with varying levels of experience took in a long-distance race. Runners with no experience did significantly worse (m place = 6.50) than runners with a lot of experience (m place = 2.50; U = 0.00, p < .05).

Phrasing Results That Are Not Significant
If we conduct the analysis on the short-distance race instead of the long-distance race, we will get the following results, which are not significant.
Ranks (short)

experience   N   Mean Rank   Sum of Ranks
.00          4   4.63        18.50
2.00         4   4.38        17.50
Total        8

Test Statistics (short)

Mann-Whitney U                   7.500
Wilcoxon W                       17.500
Z                                -.145
Asymp. Sig. (2-tailed)           .885
Exact Sig. [2*(1-tailed Sig.)]   .886(a)

a. Not corrected for ties.
b. Grouping Variable: experience


Therefore, we could state the following:

A Mann-Whitney U test was used to examine the difference in the race performance of runners with no experience and runners with a lot of experience in a short-distance race. No significant difference in the results of the race was found (U = 7.50, p > .05). Runners with no experience averaged a place of 4.63. Runners with a lot of experience averaged 4.38.

Practice Exercise
Assume that the mathematics scores in Practice Data Set 1 (Appendix B) are measured on an ordinal scale. Determine if younger participants (< 26) have significantly lower mathematics scores than older participants.

Section 7.4 Wilcoxon Test

Description
The Wilcoxon test is the nonparametric equivalent of the paired-samples (dependent) t test. It tests whether or not two related samples are from the same distribution. The Wilcoxon test is weaker than the paired-samples t test, so the t test should be used if you can meet its assumptions.

Assumptions
The Wilcoxon test is based on the difference in rankings. The data for the two samples must be at least ordinal. There are no assumptions about the shape of the distribution.

SPSS Data Format
The test requires two variables. One variable represents the dependent variable at one level of the independent variable. The other variable represents the dependent variable at the second level of the independent variable.

Running the Command
Locate the command by clicking Analyze, then Nonparametric Tests, then 2 Related Samples. This example uses the RACE.sav data set. This will bring up the dialog box for the Wilcoxon test. Note the similarity between it and the dialog box for the dependent t test. If you have trouble, refer to Section 6.4 on the dependent (paired-samples) t test in Chapter 6.


Transfer the variables LONG and MEDIUM as a pair and click OK to run the test. This will determine if the runners perform equivalently on long- and medium-distance races.

Reading the Output
The output consists of two parts. The first part gives summary statistics for the two variables. The second section contains the result of the Wilcoxon test (given as Z).
Ranks

MEDIUM - LONG      N      Mean Rank   Sum of Ranks
Negative Ranks     4(a)   5.38        21.50
Positive Ranks     5(b)   4.70        23.50
Ties               3(c)
Total              12

a. MEDIUM < LONG
b. MEDIUM > LONG
c. LONG = MEDIUM

Test Statistics (MEDIUM - LONG)

Z                        -.121(a)
Asymp. Sig. (2-tailed)   .904

a. Based on negative ranks.
b. Wilcoxon Signed Ranks Test
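The same signed-ranks test is available as scipy.stats.wilcoxon. The paired values below are invented finishing times, used only so the sketch is self-contained; they are not the RACE.sav placements and will not reproduce the output above.

```python
from scipy.stats import wilcoxon

# Hypothetical paired finishing times (minutes) for the same eight runners.
long_race   = [52.1, 48.3, 60.2, 55.0, 47.9, 59.4, 50.7, 53.8]
medium_race = [36.5, 35.1, 41.0, 39.2, 33.8, 40.6, 36.0, 37.7]

stat, p = wilcoxon(long_race, medium_race)
print(f"Wilcoxon statistic = {stat}, p = {p:.3f}")
```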

The example above shows that no significant difference was found between the results of the long-distance and medium-distance races.

Phrasing Results That Are Significant
A significant result means that a change has occurred between the two measurements. If that happened, we could state the following:

A Wilcoxon test examined the results of the medium-distance and long-distance races. A significant difference was found in the results (Z = 3.40, p < .05). Medium-distance results were better than long-distance results.

Note that these results are fictitious.

Phrasing Results That Are Not Significant
In fact, the results in the example above were not significant, so we could state the following:

A Wilcoxon test examined the results of the medium-distance and long-distance races. No significant difference was found in the results (Z = -.121, p > .05). Medium-distance results were not significantly different from long-distance results.

Practice Exercise
Use the RACE.sav data file to determine whether or not the outcome of short-distance races is different from that of medium-distance races. Phrase your results.

Section 7.5 Kruskal-Wallis H Test

Description


The Kruskal-Wallis H test is the nonparametric equivalent of the one-way ANOVA. It tests whether or not several independent samples come from the same population.

Assumptions
Because the test is nonparametric, there are very few assumptions. However, the test does assume an ordinal level of measurement for the dependent variable. The independent variable should be nominal or ordinal.

SPSS Data Format
SPSS requires one variable to represent the dependent variable and another to represent the levels of the independent variable.

Running the Command
This example uses the RACE.sav data file. To run the command, click Analyze, then Nonparametric Tests, then K Independent Samples. This will bring up the main dialog box. Enter the independent variable (EXPERIENCE) as the Grouping Variable, and click Define Range to define the lowest (0) and highest (2) values. Enter the dependent variable (LONG) in the Test Variable List, and click OK.

) r I '

I I
('$5*|..,. l

ffudd,,. RrJf.,. l-Samda(-5.,.

2 (nC*dsdd.r.,,

Rangp Em{*tEVrri*lt for Mi{run Mrrdlrrrn l0 la

t c"{h.I +ry{ |
Hdp I

Reading the Output
The output consists of two parts. The first part gives summary statistics for each of the groups defined by the grouping (independent) variable.

Ranks (long)

experience   N   Mean Rank
.00          4   10.50
1.00         4   6.50
2.00         4   2.50
Total        12


The second part of the output gives the results of the Kruskal-Wallis test (given as a chi-square value, but we will describe it as an H). The example here is a significant value of 9.846.

Test Statistics (long)

Chi-Square    9.846
df            2
Asymp. Sig.   .007

a. Kruskal Wallis Test
b. Grouping Variable: experience
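scipy.stats.kruskal runs the same test outside SPSS. The placements below are hypothetical but consistent with the mean ranks shown above (the no-experience group finishing last and the experienced group first); they are not copied from RACE.sav.

```python
from scipy.stats import kruskal

# Hypothetical long-race placements for each experience group.
no_experience   = [9, 10, 11, 12]
some_experience = [5, 6, 7, 8]
lot_experience  = [1, 2, 3, 4]

h, p = kruskal(no_experience, some_experience, lot_experience)
print(f"H = {h:.3f}, p = {p:.3f}")  # H = 9.846, p = 0.007 for these values
```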

Drawing Conclusions
Like the one-way ANOVA, the Kruskal-Wallis test assumes that the groups are equal. Thus, a significant result indicates that at least one of the groups is different from at least one other group. Unlike the One-Way ANOVA command, however, there are no options available for post-hoc analysis.

Phrasing Results That Are Significant
The example above is significant, so we could state the following:

A Kruskal-Wallis test was conducted comparing the outcome of a long-distance race for runners with varying levels of experience. A significant result was found (H(2) = 9.85, p < .01), indicating that the groups differed from each other. Runners with no experience averaged a placement of 10.50, while runners with some experience averaged 6.50 and runners with a lot of experience averaged 2.50. The more experience the runners had, the better they performed.

Phrasing Results That Are Not Significant
If we conducted the analysis using the results of the short-distance race, we would get the following output, which is not significant.
Ranks (short)

experience   N   Mean Rank
.00          4   6.38
1.00         4   7.25
2.00         4   5.88
Total        12

Test Statistics (short)

Chi-Square    .299
df            2
Asymp. Sig.   .861

a. Kruskal Wallis Test
b. Grouping Variable: experience

This result is not significant, so we could state the following:

A Kruskal-Wallis test was conducted comparing the outcome of a short-distance race for runners with varying levels of experience. No significant difference was found (H(2) = 0.299, p > .05), indicating that the groups did not differ significantly from each other. Runners with no experience averaged a placement of 6.38, while runners with some experience averaged 7.25 and runners with a lot of experience averaged 5.88. Experience did not seem to influence the results of the short-distance race.


Practice Exercise
Use Practice Data Set 2 in Appendix B. Job classification is ordinal (clerical < technical < professional). Determine if males and females have differing levels of job classifications. Phrase your results.

Section 7.6 Friedman Test

Description
The Friedman test is the nonparametric equivalent of a one-way repeated-measures ANOVA. It is used when you have more than two measurements from related participants.

Assumptions
The test uses the rankings of the variables, so the data must be at least ordinal. No other assumptions are required.

SPSS Data Format
SPSS requires at least three variables in the SPSS data file. Each variable represents the dependent variable at one of the levels of the independent variable.

Running the Command
Locate the command by clicking Analyze, then Nonparametric Tests, then K Related Samples. This will bring up the main dialog box.

Place all the variables representing the levels of the independent variable in the Test Variables area. For this example, use the RACE.sav data file and the variables LONG, MEDIUM, and SHORT. Click OK.


Reading the Output
The output consists of two sections. The first section gives you summary statistics for each of the variables. The second section of the output gives you the results of the test as a chi-square value. The example here has a value of 0.049 and is not significant (Asymp. Sig., otherwise known as p, is .976, which is greater than .05).
Ranks

          Mean Rank
LONG      2.00
MEDIUM    2.04
SHORT     1.96

Test Statistics (a)

N             12
Chi-Square    .049
df            2
Asymp. Sig.   .976

a. Friedman Test
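scipy.stats.friedmanchisquare performs the equivalent test on three or more related measurements. The placements below are made up so the sketch runs on its own; they are not the RACE.sav data.

```python
from scipy.stats import friedmanchisquare

# Hypothetical placements for the same 12 runners in each race type.
long_race   = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
medium_race = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11]
short_race  = [1, 3, 2, 5, 4, 7, 6, 9, 8, 11, 10, 12]

chi2, p = friedmanchisquare(long_race, medium_race, short_race)
print(f"chi-square = {chi2:.3f}, p = {p:.3f}")
```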

Drawing Conclusions
The Friedman test assumes that the three variables are from the same population. A significant value indicates that the variables are not equivalent.

Phrasing Results That Are Significant
If we obtained a significant result, we could state the following (these are hypothetical results):

A Friedman test was conducted comparing the average place in a race of runners for short-distance, medium-distance, and long-distance races. A significant difference was found (χ²(2) = 0.057, p < .05). The length of the race significantly affects the results of the race.

PhrasingResults Are Not Significant That


In fact,theexample above wasnot significant, we couldstate following: so the A Friedman was conducted placein a raceof test the comparing average runnersfor short-distance, races.No medium-distance, long-distance and = 0.049, > .05).The lengthof the significant p difference was founddg racedid not significantly affect results therace. of the Practice Exercise Usethedatain Practice DataSet3 in Appendix If anxietyis measured an oron B. dinal scale, your results. determine anxietylevelschanged if overtime.Phrase


Chapter 8

Test Construction
Section 8.1 Item-Total Analysis

Description
Item-total analysis is a way to assess the internal consistency of a data set. As such, it is one of many tests of reliability. Item-total analysis comprises a number of items that make up a scale or test designed to measure a single construct (e.g., intelligence), and determines the degree to which all of the items measure the same construct. It does not tell you if it is measuring the correct construct (that is a question of validity). Before a test can be valid, however, it must first be reliable.

Assumptions
All the items in the scale should be measured on an interval or ratio scale. In addition, each item should be normally distributed. If your items are ordinal in nature, you can conduct the analysis using the Spearman rho correlation instead of the Pearson r correlation.

SPSS Data Format
SPSS requires one variable for each item (or question) in the scale. In addition, you must have a variable representing the total score for the scale.


Conducting the Test
Item-total analysis uses the Pearson Correlation command. To conduct it, open the QUESTIONS.sav data file you created in Chapter 2. Click Analyze, then Correlate, then Bivariate. Place all questions and the total in the right-hand window, and click OK. (For more help on conducting correlations, see Chapter 5.) The total can be calculated with the techniques discussed in Chapter 2.
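The item-total correlations themselves are just Pearson correlations between each item and the total, so they can also be computed outside SPSS. In the sketch below the responses are invented for illustration; they are not the QUESTIONS.sav data.

```python
import numpy as np

# Hypothetical responses to three items (rows = respondents, columns = items).
items = np.array([[3, 1, 4],
                  [4, 2, 5],
                  [2, 1, 2],
                  [5, 1, 5]], dtype=float)

total = items.sum(axis=1)
for i in range(items.shape[1]):
    r = np.corrcoef(items[:, i], total)[0, 1]
    print(f"Item {i + 1} item-total correlation: {r:.3f}")
```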


Reading the Output
The output consists of a correlation matrix containing all questions and the total. Use the column labeled TOTAL, and locate the correlation between the total score and each question. In the example below, Question 1 has a correlation of 0.873 with the total score. Question 2 has a correlation of -0.130 with the total. Question 3 has a correlation of 0.926 with the total.

Correlations (Pearson correlations; two-tailed significance in parentheses; N = 4 for each pair)

        Q1             Q2             Q3             TOTAL
Q1      1.000          -.447 (.553)   .718 (.282)    .873 (.127)
Q2      -.447 (.553)   1.000          -.229 (.771)   -.130 (.870)
Q3      .718 (.282)    -.229 (.771)   1.000          .926 (.074)
TOTAL   .873 (.127)    -.130 (.870)   .926 (.074)    1.000

Interpreting the Output
Item-total correlations should always be positive. If you obtain a negative correlation, that question should be removed from the scale (or you may consider whether it should be reverse-keyed). Generally, item-total correlations greater than 0.7 are considered desirable. Those of less than 0.3 are considered weak. Any questions with correlations of less than 0.3 should be removed from the scale.

Normally, the worst question is removed, and then the total is recalculated. After the total is recalculated, the item-total analysis is repeated without the question that was removed. Then, if any questions have correlations of less than 0.3, the worst one is removed, and the process is repeated. When all remaining correlations are greater than 0.3, the remaining items in the scale are considered to be those that are internally consistent.

Section 8.2 Cronbach's Alpha

Description
Cronbach's alpha is a measure of internal consistency. As such, it is one of many tests of reliability. Cronbach's alpha comprises a number of items that make up a scale designed to measure a single construct (e.g., intelligence), and determines the degree to which all the items are measuring the same construct. It does not tell you if it is measuring the correct construct (that is a question of validity). Before a test can be valid, however, it must first be reliable.

Assumptions
All the items in the scale should be measured on an interval or ratio scale. In addition, each item should be normally distributed.

SPSS Data Format
SPSS requires one variable for each item (or question) in the scale.

Running the Command
This example uses the QUESTIONS.sav data file we first created in Chapter 2. Click Analyze, then Scale, then Reliability Analysis. This will bring up the main dialog box for Reliability Analysis. Transfer the questions from your scale to the Items blank, and click OK. Do not transfer any variables representing total scores. Note that when you change the options under Model, additional measures of internal consistency (e.g., split-half) can be calculated.
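The arithmetic behind Cronbach's alpha is simple enough to check in a few lines of Python. This is only a sketch of the standard formula, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of the total score); the data are made up, not the QUESTIONS.sav responses.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a 2-D array (rows = respondents, columns = items)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical responses to three questions.
responses = [[3, 1, 4],
             [4, 2, 5],
             [2, 1, 2],
             [5, 1, 5]]
print(round(cronbach_alpha(responses), 3))
```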

Reading the Output
In this example, the reliability coefficient is 0.407. Numbers close to 1.00 are very good, but numbers close to 0.00 represent poor internal consistency.

Reliability Statistics

Cronbach's Alpha   N of Items
.407               3

Section 8.3 Test-Retest Reliability

Description
Test-retest reliability is a measure of temporal stability. As such, it is a measure of reliability. Unlike measures of internal consistency that tell you the extent to which all of the questions that make up a scale measure the same construct, measures of temporal stability tell you whether or not the instrument is consistent over time and/or over multiple administrations.

Assumptions
The total score for the scale should be an interval or ratio scale. The scale scores should be normally distributed.


SPSS Data Format
SPSS requires a variable representing the total score for the scale at the time of the first administration. A second variable representing the total score for the same participants at a different time (normally two weeks later) is also required.

Running the Command
The test-retest reliability coefficient is simply a Pearson correlation coefficient for the relationship between the total scores for the two administrations. To compute the coefficient, follow the directions for computing a Pearson correlation coefficient (Chapter 5, Section 5.1). Use the two variables representing the two administrations of the test.

Reading the Output
The correlation between the two scores is the test-retest reliability coefficient. It should be positive. Strong reliability is indicated by values close to 1.00. Weak reliability is indicated by values close to 0.00.

Section 8.4 Criterion-Related Validity

Description
Criterion-related validity determines the extent to which the scale you are testing correlates with a criterion. For example, ACT scores should correlate highly with GPA. If they do, that is a measure of validity for ACT scores. If they do not, that indicates that ACT scores may not be valid for the intended purpose.

Assumptions
All of the same assumptions for the Pearson correlation coefficient apply to measures of criterion-related validity (interval or ratio scales, normal distribution, etc.).

SPSS Data Format
Two variables are required. One variable represents the total score for the scale you are testing. The other represents the criterion you are testing it against.

Running the Command
Calculating criterion-related validity involves determining the Pearson correlation value between the scale and the criterion. See Chapter 5, Section 5.1 for complete information.

Reading the Output
The correlation between the two scores is the criterion-related validity coefficient. It should be positive. Strong validity is indicated by values close to 1.00. Weak validity is indicated by values close to 0.00.
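Because both the test-retest reliability coefficient and the criterion-related validity coefficient are ordinary Pearson correlations, either one can be cross-checked outside SPSS with scipy.stats.pearsonr. The scores below are invented for illustration.

```python
from scipy.stats import pearsonr

# Hypothetical total scale scores from two administrations two weeks apart.
time1 = [22, 31, 27, 35, 18, 29, 25, 33]
time2 = [24, 30, 26, 36, 20, 27, 25, 31]

r, p = pearsonr(time1, time2)
print(f"test-retest reliability r = {r:.3f}")
```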


Appendix A

Effect Size
Many disciplines are placing increased emphasis on reporting effect size. While statistical hypothesis testing provides a way to tell the odds that differences are real, effect sizes provide a way to judge the relative importance of those differences. That is, they tell us the size of the difference or relationship. They are also critical if you would like to estimate necessary sample sizes, conduct a power analysis, or conduct a meta-analysis. Many professional organizations (e.g., the American Psychological Association) are now requiring or strongly suggesting that effect sizes be reported in addition to the results of hypothesis tests.

Because there are at least 41 different types of effect sizes, each with somewhat different properties, the purpose of this Appendix is not to be a comprehensive resource on effect size, but rather to show you how to calculate some of the most common measures of effect size using SPSS 15.0.

Cohen's d
One of the simplest and most popular measures of effect size is Cohen's d. Cohen's d is a member of a class of measurements called "standardized mean differences." In essence, d is the difference between two means divided by the overall standard deviation. It is not only a popular measure of effect size, but Cohen has also suggested a simple basis to interpret the value obtained. Cohen suggested that effect sizes of .2 are small, .5 are medium, and .8 are large.

We will discuss Cohen's d as the preferred measure of effect size for t tests. Unfortunately, SPSS does not calculate Cohen's d. However, this appendix will cover how to calculate it from the output that SPSS does produce.

Effect Size for Single-Sample t Tests
Although SPSS does not calculate the effect size for the single-sample t test, calculating Cohen's d is a simple matter.

Kirk, R. E. (1996). Practical significance: A concept whose time has come. Educational & Psychological Measurement, 56, 746-759.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.


Cohen's d for a single-sample t test is equal to the mean difference over the standard deviation. For example, the One-Sample Statistics output for LENGTH shows N = 16 and a standard deviation of 1.1972, and the One-Sample Test output (Test Value = 35) shows a mean difference of .90. We calculate d as indicated here:

d = mean difference / SD
d = .90 / 1.1972
d = .752

In this example, using Cohen's guidelines to judge effect size, we would have an effect size between medium and large.

Effect Size for Independent-Samples t Tests
Calculating the effect size from the independent t test output is a little more complex because SPSS does not provide us with the pooled standard deviation. The upper section of the output, however, does provide us with the information we need to calculate it. The output presented here is the same output we worked with in Chapter 6.

Group Statistics

grade   morning   N   Mean      Std. Deviation   Std. Error Mean
        No        2   82.5000   3.53553          2.50000
        Yes       2   78.0000   7.07107          5.00000

s_pooled = sqrt( ((n1 - 1)s1^2 + (n2 - 1)s2^2) / (n1 + n2 - 2) )
s_pooled = sqrt( ((2 - 1)(3.53553)^2 + (2 - 1)(7.07107)^2) / (2 + 2 - 2) )
s_pooled = 5.59

Once we have calculated the pooled standard deviation (s_pooled), we can calculate Cohen's d.

d = (mean1 - mean2) / s_pooled
d = (82.50 - 78.00) / 5.59
d = .80
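The pooled standard deviation and Cohen's d can be recomputed directly from the group statistics above; a minimal sketch of that arithmetic in Python:

```python
import math

# Group summary statistics from the independent-samples t test output above.
n1, mean1, sd1 = 2, 82.50, 3.53553
n2, mean2, sd2 = 2, 78.00, 7.07107

# Pooled standard deviation, then Cohen's d.
sd_pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
d = (mean1 - mean2) / sd_pooled
print(f"pooled SD = {sd_pooled:.2f}, Cohen's d = {d:.2f}")  # 5.59 and 0.80
```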


So, in this example, using Cohen's guidelines for the interpretation of d, we would have obtained a large effect size.

Effect Size for Paired-Samples t Tests
As you have probably learned in your statistics class, a paired-samples t test is really just a special case of the single-sample t test. Therefore, the procedure for calculating Cohen's d is also the same. The SPSS output, however, looks a little different, so you will be taking your values from different areas.
Paired Samples Test

Pair 1: PRETEST - FINAL
Paired Differences: Mean -22.8095, Std. Deviation 8.9756, Std. Error Mean 1.9586
95% Confidence Interval of the Difference: Lower -26.8952, Upper -18.7239
t = -11.646, df = 20, Sig. (2-tailed) = .000

d = mean difference / SD of the differences
d = 22.8095 / 8.9756
d = 2.54

Notice that in this example, we represent the effect size (d) as a positive number even though the mean difference is negative. Effect sizes are always positive numbers. In this example, using Cohen's guidelines for the interpretation of d, we have found a very large effect size.

r² (Coefficient of Determination)
While Cohen's d is the appropriate measure of effect size for t tests, correlation and regression effect sizes should be determined by squaring the correlation coefficient. This squared correlation is called the coefficient of determination. Cohen suggested that correlations of .5, .3, and .1 corresponded to large, moderate, and small relationships. Those values squared yield coefficients of determination of .25, .09, and .01, respectively. It would appear, therefore, that Cohen is suggesting that accounting for 25% of the variability represents a large effect, 9% a moderate effect, and 1% a small effect.

Effect Size for Correlation
Nowhere is the effect of sample size on statistical power (and therefore significance) more apparent than with correlations. Given a large enough sample, any correlation can become significant. Thus, effect size becomes critically important in the interpretation of correlations.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). New Jersey: Lawrence Erlbaum.


The standard measure of effect size for correlations is the coefficient of determination (r²) discussed above. The coefficient should be interpreted as the proportion of variance in the dependent variable that can be accounted for by the relationship between the independent and dependent variables. While Cohen provided some useful guidelines for interpretation, each problem should be interpreted in terms of its true practical significance. For example, if a treatment is very expensive to implement, or has significant side effects, then a larger correlation should be required before the relationship becomes "important." For treatments that are very inexpensive, a much smaller correlation can be considered "important."

To calculate the coefficient of determination, simply take the r value that SPSS provides and square it.

Effect Size for Regression
The Model Summary section of the output reports R² for you. The example output here shows a coefficient of determination of .649, meaning that almost 65% (.649) of the variability in the dependent variable is accounted for by the relationship between the dependent and independent variables.

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .806(a)   .649       .624                16.1480

a. Predictors: (Constant), HEIGHT

Eta Squared (η²)

A third measure of effect size is Eta Squared (η²). Eta Squared is used for Analysis of Variance models. The GLM (General Linear Model) function in SPSS (the function that runs the procedures under Analyze, then General Linear Model) will provide Eta Squared (η²). Eta Squared has an interpretation similar to a squared correlation coefficient (r²). It represents the proportion of the variance accounted for by the effect:

η² = SS_effect / SS_total

Unlike r², however, which represents only linear relationships, η² can represent any type of relationship.

Effect Size for Analysis of Variance
For most Analysis of Variance problems, you should elect to report Eta Squared as your effect size measure. SPSS provides this calculation for you as part of the General Linear Model (GLM) command. To obtain Eta Squared, you simply click on the Options box in the main dialog box for the GLM command you are running (this works for the Univariate, Multivariate, and Repeated Measures versions of the command, even though only the Univariate option is presented here).


Once you have selected Options, a new dialog box will appear. One of the options in that box will be Estimates of effect size. When you select that box, SPSS will provide Eta Squared values as part of your output.

Tests of Between-Subjects Effects
Dependent Variable: score

Source            Type III Sum of Squares   df   Mean Square   F         Sig.   Partial Eta Squared
Corrected Model   10.450(a)                 2    5.225         19.096    .000   .761
Intercept         91.622                    1    91.622        331.862   .000   .965
group             10.450                    2    5.225         19.096    .000   .761
Error             3.283                     12   .274
Total              105.000                  15
Corrected Total   13.733                    14

a. R Squared = .761 (Adjusted R Squared = .721)

In the example here, we obtained an Eta Squared of .761 for our main effect for group membership. Because we interpret Eta Squared using the same guidelines as r², we would conclude that this represents a large effect size for group membership.
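Eta Squared can also be recomputed by hand from the sums of squares in the ANOVA table above; a one-line check in Python:

```python
# Eta Squared = SS_effect / SS_corrected_total, using values from the table above.
ss_group = 10.450
ss_corrected_total = 13.733
print(round(ss_group / ss_corrected_total, 3))  # 0.761
```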


Appendix B

Practice Exercise Data Sets

Practice Data Set 2
A survey of employees is conducted. Each employee provides the following information: Salary (SALARY), Years of Service (YOS), Sex (SEX), Job Classification (CLASSIFY), and Education Level (EDUC). Note that you will have to code SEX (Male = 1, Female = 2) and CLASSIFY (Clerical = 1, Technical = 2, Professional = 3).


SALARY   YOS   SEX      CLASSIFY       EDUC
35,000   8     Male     Technical      14
18,000   4     Female   Clerical       10
20,000   1     Male     Professional   16
50,000   20    Female   Professional   16
38,000   6     Male     Professional   20
20,000   6     Female   Clerical       12
75,000   17    Male     Professional   20
40,000   4     Female   Technical      12
30,000   8     Male     Technical      14
22,000   15    Female   Clerical       12
23,000   16    Male     Clerical       12
45,000   2     Female   Professional   16

Practice Data Set 3
Participants who have phobias are given one of three treatments (CONDITION). Their anxiety level (1 to 10) is measured at three intervals: before treatment (ANXPRE), one hour after treatment (ANX1HR), and again four hours after treatment (ANX4HR). Note that you will have to code the variable CONDITION.


ANXPRE   ANX1HR   ANX4HR   CONDITION
8        7        7        Placebo
10       10       10       Placebo
9        7        8        Placebo
7        6        6        Placebo
7        7        7        Placebo
9        4        5        Valium™
10       6        1        Valium™
9        5        5        Valium™
8        3        5        Valium™
6        3        4        Valium™
8        5        3        Experimental Drug
6        5        2        Experimental Drug
9        8        4        Experimental Drug
10       9        4        Experimental Drug
7        6        3        Experimental Drug


Appendix C

Glossary
All Inclusive. A set of events that encompasses every possible outcome.

Alternative Hypothesis. The opposite of the null hypothesis, normally showing that there is a true difference. Generally, this is the statement that the researcher would like to support.

Case Processing Summary. A section of SPSS output that lists the number of subjects used in the analysis.

Coefficient of Determination. The value of the correlation, squared. It provides the proportion of variance accounted for by the relationship.

Cohen's d. A common and simple measure of effect size that standardizes the difference between groups.

Correlation Matrix. A section of SPSS output in which correlation coefficients are reported for all pairs of variables.

Covariate. A variable known to be related to the dependent variable, but not treated as an independent variable. Used in ANCOVA as a statistical control technique.

Data Window. The SPSS window that contains the data in a spreadsheet format. This is the window used for running most commands.

Dependent Variable. An outcome or response variable. The dependent variable is normally dependent on the independent variable.

Descriptive Statistics. Statistical procedures that organize and summarize data.

Dialog Box. A window that allows you to enter information that SPSS will use in a command.


Dichotomous Variables. Variables with only two levels (e.g., gender).

Discrete Variable. A variable that can have only certain values (i.e., values between which there is no score, like A, B, C, D, F).

Effect Size. A measure that allows one to judge the relative importance of a difference or relationship by reporting the size of a difference.

Eta Squared (η²). A measure of effect size used in Analysis of Variance models.


Grouping Variable. In SPSS, the variable used to represent group membership. SPSS often refers to independent variables as grouping variables; SPSS sometimes refers to grouping variables as independent variables.

Independent Events. Two events are independent if information about one event gives no information about the second event (e.g., two flips of a coin).

Independent Variable. The variable whose levels (values) determine the group to which a subject belongs. A true independent variable is manipulated by the researcher. See Grouping Variable.

Inferential Statistics. Statistical procedures designed to allow the researcher to draw inferences about a population on the basis of a sample.

Interaction. With more than one independent variable, an interaction occurs when a level of one independent variable affects the influence of another independent variable.

Internal Consistency. A reliability measure that assesses the extent to which all of the items in an instrument measure the same construct.

Interval Scale. A measurement scale where items are placed in mutually exclusive categories, with equal intervals between values. Appropriate transformations include counting, sorting, and addition/subtraction.

Levels. The values that a variable can have. A variable with three levels has three possible values.

Mean. A measure of central tendency where the sum of the deviation scores equals zero.

Median. A measure of central tendency representing the middle of a distribution when the data are sorted from low to high. Fifty percent of the cases are below the median.

Mode. A measure of central tendency representing the value (or values) with the most subjects (the score with the greatest frequency).

Mutually Exclusive. Two events are mutually exclusive when they cannot occur simultaneously.

Nominal Scale. A measurement scale where items are placed in mutually exclusive categories. Differentiation is by name only (e.g., race, sex). Appropriate categories include "same" or "different." Appropriate transformations include counting.

Normal Distribution. A symmetric, unimodal, bell-shaped curve.

Null Hypothesis. The hypothesis to be tested, normally in which there is no true difference. It is mutually exclusive of the alternative hypothesis.


Ordinal Scale. A measurement scale where items are placed in mutually exclusive categories, in order. Appropriate categories include "same," "less," and "more." Appropriate transformations include counting and sorting.

Outliers. Extreme scores in a distribution. Scores that are very distant from the mean and the rest of the scores in the distribution.

Output Window. The SPSS window that contains the results of an analysis. The left side summarizes the results in an outline. The right side contains the actual results.

Percentiles (Percentile Ranks). A relative score that gives the percentage of subjects who scored at the same value or lower.

Pooled Standard Deviation. A single value that represents the standard deviation of two groups of scores.

Protected Dependent t Tests. To prevent the inflation of a Type I error, the level needed to be significant is reduced when multiple tests are conducted.

Quartiles. The points that divide a distribution into four equal parts. The scores at the 25th, 50th, and 75th percentile ranks.

Random Assignment. A procedure for assigning subjects to conditions in which each subject has an equal chance of being assigned to any condition.

Range. A measure of dispersion representing the number of points from the highest score through the lowest score.

Ratio Scale. A measurement scale where items are placed in mutually exclusive categories, with equal intervals between values, and a true zero. Appropriate transformations include counting, sorting, addition/subtraction, and multiplication/division.

Reliability. An indication of the consistency of a scale. A reliable scale is internally consistent and stable over time.

Robust. A test is said to be robust if it continues to provide accurate results even after the violation of some assumptions.

Significance. A difference is said to be significant if the probability of making a Type I error is less than the accepted limit (normally 5%). If a difference is significant, the null hypothesis is rejected.

Skew. The extent to which a distribution is not symmetrical. Positive skew has outliers on the positive (right) side of the distribution. Negative skew has outliers on the negative (left) side of the distribution.

Standard Deviation. A measure of dispersion representing a special type of average deviation from the mean.


Standard Error of Estimate. The equivalent of the standard deviation for a regression line. The data points will be normally distributed around the regression line with a standard deviation equal to the standard error of the estimate.

Standard Normal Distribution. A normal distribution with a mean of 0.0 and a standard deviation of 1.0.

String Variable. A string variable can contain letters and numbers. Numeric variables can contain only numbers. Most SPSS commands will not function with string variables.

Temporal Stability. This is achieved when reliability measures have determined that scores remain stable over multiple administrations of the instrument.

Tukey's HSD. A post-hoc comparison purported to reveal an "honestly significant difference" (HSD).

Type I Error. A Type I error occurs when the researcher erroneously rejects the null hypothesis.

Type II Error. A Type II error occurs when the researcher erroneously fails to reject the null hypothesis.

Valid Data. Data that SPSS will use in its analyses.

Validity. An indication of the accuracy of a scale.

Variance. A measure of dispersion equal to the squared standard deviation.


Appendix D

Sample Data Files Used in Text

A variety of small data files are used in examples throughout this text. Here is a list of where each appears.

COINS.sav
Variables: COIN1, COIN2
Entered in Chapter 7

GRADES.sav
Variables: PRETEST, MIDTERM, FINAL, INSTRUCT, REQUIRED
Entered in Chapter 6

HEIGHT.sav
Variables: HEIGHT, WEIGHT, SEX
Entered in Chapter 4

QUESTIONS.sav
Variables: Q1, Q2 (recoded in Chapter 2), Q3, TOTAL (added in Chapter 2), GROUP (added in Chapter 2)
Entered in Chapter 2; Modified in Chapter 2

RACE.sav
Variables: SHORT, MEDIUM, LONG, EXPERIENCE
Entered in Chapter 7

SAMPLE.sav
Variables: ID, DAY, TIME, MORNING, GRADE, WORK, TRAINING (added in Chapter 1)
Entered in Chapter 1; Modified in Chapter 1

SAT.sav
Variables: SAT, GRE, GROUP
Entered in Chapter 6

Other Files
For some practice exercises, data sets that are not used in any other examples in the text are needed; see Appendix B.


Appendix E

Information for Users of Earlier Versions of SPSS


There are a number of differences between SPSS 15.0 and earlier versions of the software. Fortunately, most of them have very little impact on users of this text. In fact, most users of earlier versions will be able to successfully use this text without needing to reference this appendix.

Variable names were limited to eight characters.
Versions of SPSS older than 12.0 are limited to eight-character variable names. The other variable name rules still apply. If you are using an older version of SPSS, you need to make sure you use eight or fewer letters for your variable names.
The Data menu will look different.
The screenshots in the text where the Data menu is shown will look slightly different if you are using an older version of SPSS. These missing or renamed commands do not have any effect on this text, but the menus may look slightly different. If you are using a version of SPSS earlier than 10.0, the Analyze menu will be called Statistics instead.
Graphing functions.
Prior to SPSS 12.0, the graphing functions of SPSS were very limited. If you are using a version of SPSS older than version 12.0, third-party software like Excel or SigmaPlot is recommended for the construction of graphs. If you are using Version 14.0 of the software, use Appendix F as an alternative to Chapter 4, which discusses graphing.


Variable icons indicate measurement type.
In versions of SPSS earlier than 14.0, variables were represented in dialog boxes with their variable label and an icon that represented whether the variable was string or numeric. Starting with Version 14.0, SPSS shows additional information about each variable. Icons now represent not only whether a variable is numeric or not, but also what type of measurement scale it is: nominal variables, ordinal variables, and interval or ratio variables (SPSS refers to them as scale variables) are each shown with a distinct icon.

Several SPSS data files can now be open at once.
Versions of SPSS older than 14.0 could have only one data file open at a time. Copying data from one file to another entailed a tedious process of copying, opening files, pasting, etc. Starting with version 14.0, multiple data files can be open at the same time. When multiple files are open, you can select the one you want to work with using the Window command.


Appendix F

Graphing Data with SPSS 13.0 and 14.0


This appendix should be used as an alternative to Chapter 4 when you are using SPSS 13.0 or 14.0. These procedures may also be used in SPSS 15.0, if desired, by selecting Legacy Dialogs instead of Chart Builder.

Graphing Basics
In addition to the frequency distributions, the measures of central tendency, and the measures of dispersion discussed in Chapter 3, graphing is a useful way to summarize, organize, and reduce your data. It has been said that a picture is worth a thousand words. In the case of complicated data sets, that is certainly true.

With SPSS Version 13.0 and later, it is now possible to make publication-quality graphs using only SPSS. One important advantage of using SPSS instead of other software to create your graphs (e.g., Excel or SigmaPlot) is that the data have already been entered. Thus, duplication is eliminated, and the chance of making a transcription error is reduced.

Editing SPSS Graphs
Whatever command you use to create your graph, you will probably want to do some editing to make it look exactly the way you want. In SPSS, you do this in much the same way that you edit graphs in other software programs (e.g., Excel). In the output window, select your graph (thus creating handles around the outside of the entire object) and right-click. Then, click SPSS Chart Object, then click Open. Alternatively, you can double-click on the graph to open it for editing.

When you open the graph for editing, the Chart Editor window and the corresponding Properties window will appear.


Once Chart Editor is open, you can easily edit each element of the graph. To select an element, just click on the relevant spot on the graph. For example, to select the element representing the title of the graph, click somewhere on the title (the word "Histogram" in the example below).

Once you have selected an element, you can tell that the correct element is selected because it will have handles around it. If the item you have selected is a text element (e.g., the title of the graph), a cursor will be present and you can edit the text as you would in word processing programs. If you would like to change another attribute of the element (e.g., the color or font size), use the Properties box (Text properties are shown above).


With a little practice, you can make excellent graphs using SPSS. Once your graph is formatted the way you want it, simply select File, then Close.

Data Set
For the graphing examples, we will use a new set of data. Enter the data below and save the file as HEIGHT.sav. The data represent participants' HEIGHT (in inches), WEIGHT (in pounds), and SEX (1 = male, 2 = female).

HEIGHT   WEIGHT   SEX
66       150      1
69       155      1
73       160      1
72       160      1
68       150      1
63       140      1
74       165      1
70       150      1
66       110      2
64       100      2
60       95       2
67       110      2
64       105      2
63       100      2
67       110      2
65       105      2

Check that you have entered the data correctly by calculating a mean for each of the three variables (click Analyze, then Descriptive Statistics, then Descriptives). Compare your results with those in the table below.
Descriptive Statistics

                     N    Minimum   Maximum   Mean       Std. Deviation
HEIGHT               16   60.00     74.00     66.9375    3.9067
WEIGHT               16   95.00     165.00    129.0625   26.3451
SEX                  16   1.00      2.00      1.5000     .5164
Valid N (listwise)   16


Bar Charts, Pie Charts, and Histograms

Description


Bar charts, pie charts, and histograms represent the number of times each score occurs by varying the height of a bar or the size of a pie piece. They are graphical representations of the frequency distributions discussed in Chapter 3.

Drawing Conclusions
The Frequencies command produces output that indicates both the number of cases in the sample with a particular value and the percentage of cases with that value. Thus, conclusions drawn should relate only to describing the numbers or percentages for the sample. If the data are at least ordinal in nature, conclusions regarding the cumulative percentages and/or percentiles can also be drawn.

SPSS Data Format
You need only one variable to use this command.

Running the Command
The Frequencies command will produce graphical frequency distributions. Click Analyze, then Descriptive Statistics, then Frequencies. You will be presented with the main dialog box for the Frequencies command, where you can enter the variables for which you would like to create graphs or charts. (See Chapter 3 for other options available with this command.)

Click the Charts button at the bottom to produce frequency distributions. This will give you the Frequencies: Charts dialog box.

There are three types of charts available under this command: Bar charts, Pie charts, and Histograms. For each type, the Y axis can be either a frequency count or a percentage (selected through the Chart Values option). You will receive the charts for any variables selected in the main Frequencies command dialog box.
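The same charts can also be requested with Frequencies syntax. The sketch below assumes the HEIGHT.sav data are loaded; FREQ corresponds to choosing Frequencies under Chart Values, and PERCENT corresponds to choosing Percentages.

* Bar chart of HEIGHT with the Y axis as a frequency count.
FREQUENCIES VARIABLES=HEIGHT
  /BARCHART=FREQ.
* Pie chart of HEIGHT with slices expressed as percentages.
FREQUENCIES VARIABLES=HEIGHT
  /PIECHART=PERCENT.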

Output
The bar chart consists of a Y axis, representing the frequency, and an X axis, representing each score. Note that the only values represented on the X axis are those with nonzero frequencies (61, 62, and 71 are not represented).
[Bar chart of HEIGHT frequencies]

The pie chart shows the percentage of the whole that is represented by each value.

[Pie chart of HEIGHT, showing the percentage of cases at each value]

The Histogram command creates a grouped frequency distribution. The range of scores is split into evenly spaced groups. The midpoint of each group is plotted on the X axis, and the Y axis represents the number of scores for each group. If you select With Normal Curve, a normal curve will be superimposed over the distribution. This is very useful for helping you determine if the distribution you have is approximately normal.
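In syntax form (again a sketch assuming HEIGHT.sav is the active data file), the NORMAL keyword plays the role of the With Normal Curve check box:

* Histogram of HEIGHT with a superimposed normal curve.
FREQUENCIES VARIABLES=HEIGHT
  /HISTOGRAM=NORMAL.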

[Histogram of HEIGHT with a normal curve superimposed]

Practice Exercise

Use Practice Data Set 1 in Appendix B. After you have entered the data, construct a histogram that represents the mathematics skills scores and displays a normal curve, and a bar chart that represents the frequencies for the variable AGE.


Scatterplots

Description
Scatterplots (also called scattergrams or scatter diagrams) display two values for each case with a mark on the graph. The X axis represents the value for one variable. The Y axis represents the value for the second variable.

Assumptions

Both variables should be interval or ratio scales. If nominal or ordinal data are used, be cautious about your interpretation of the scattergram.

SPSS Data Format

You need two variables to perform this command.

Running the Command

You can produce scatterplots by clicking Graphs, then Scatter/Dot. This will give you the first Scatterplot dialog box. Select the desired scatterplot (normally, you will select Simple Scatter), then click Define.

This will give you the main Scatterplot dialog box. Enter one of your variables as the Y axis and the second as the X axis. For example, using the HEIGHT.sav data set, enter HEIGHT as the Y axis and WEIGHT as the X axis. Click OK.
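The equivalent syntax, sketched here under the assumption that HEIGHT.sav is the active data file, places the X-axis variable before WITH and the Y-axis variable after it:

* Simple scatterplot: WEIGHT on the X axis, HEIGHT on the Y axis.
GRAPH
  /SCATTERPLOT(BIVAR)=WEIGHT WITH HEIGHT.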


Output

The output will consist of a mark for each subject at the appropriate X and Y levels.
[Scatterplot of HEIGHT (Y axis) by WEIGHT (X axis)]

Adding a Third Variable

Even though the scatterplot is a two-dimensional graph, it can plot a third variable. To make it do so, enter the third variable in the Set Markers by field. In our example, we will enter the variable SEX in the Set Markers by space. Now our output will have two different sets of marks: one set represents the male participants, and the second set represents the female participants. These two sets will appear in different colors on your screen. You can use the SPSS chart editor to make them different shapes, as in the graph that follows.
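In syntax, the Set Markers by choice corresponds to adding BY and the grouping variable (a sketch under the same assumptions as above):

* Scatterplot with a separate marker set for each value of SEX.
GRAPH
  /SCATTERPLOT(BIVAR)=WEIGHT WITH HEIGHT BY SEX.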

Graph

[Scatterplot of HEIGHT by WEIGHT, with separate markers for male and female participants]

Practice Exercise

Use Practice Data Set 2 in Appendix B. Construct a scatterplot to examine the relationship between SALARY and EDUCATION.

Advanced Bar Charts

Description

You can produce bar charts with the Frequencies command (see Chapter 4, Section 4.3). Sometimes, however, we are interested in a bar chart where the Y axis is not a frequency. To produce such a chart, we need to use the Bar Charts command.

SPSS Data Format

At least two variables are needed to perform this command. There are two basic kinds of bar charts: those for between-subjects designs and those for repeated-measures designs. Use the between-subjects method if one variable is the independent variable and the other is the dependent variable. Use the repeated-measures method if you have a dependent variable for each value of the independent variable (e.g., you would have three variables for a design with three values of the independent variable). This normally occurs when you take multiple observations over time.


Running the Command

Click Graphs, then Bar for either type of bar chart. This will open the Bar Charts dialog box. If you have one independent variable, select Simple. If you have more than one, select Clustered. If you are using a between-subjects design, select Summaries for groups of cases. If you are using a repeated-measures design, select Summaries of separate variables.

If you are creating a repeated-measures graph, you will see the dialog box below. Move each variable over to the Bars Represent area, and SPSS will place it inside parentheses following Mean. This will give you a graph like the one below at right. Note that this example uses the GRADES.sav data entered in Section 6.4 (Chapter 6).
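As a rough syntax sketch of the two approaches: the between-subjects example below uses the HEIGHT.sav variables from this appendix, while the repeated-measures example uses placeholder names (grade1 to grade3) standing in for whatever your GRADES.sav variables are actually called.

* Between-subjects bar chart (Summaries for groups of cases): mean HEIGHT for each SEX.
GRAPH
  /BAR(SIMPLE)=MEAN(HEIGHT) BY SEX.
* Repeated-measures bar chart (Summaries of separate variables); grade1 to grade3 are
* placeholder variable names.
GRAPH
  /BAR(SIMPLE)=MEAN(grade1) MEAN(grade2) MEAN(grade3).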


Practice Exercise

Use Practice Data Set 1 in Appendix B. Construct a bar graph examining the relationship between mathematics skills scores and marital status. Hint: In the Bars Represent area, enter SKILL as the variable.
