Lab #3 Correlation

Defined: The measure of the strength and direction of the linear relationship between two variables.

Variables: IV is continuous, DV is continuous.

Relationship: Relationship amongst variables.

Example: Relationship between height and weight.

Assumptions: Normality. Linearity.

1. Graphing - Scatterplot

The first step of any statistical analysis is to plot the data graphically. In terms of correlation, graphical plots are called scatterplots. Scatterplots can show you visually the strength of the relationship between the variables, the direction of the relationship between the variables, and whether outliers exist.

Correlation is the measure of the strength and direction of the relationship between the variables.

a. Correlations can vary between -1 and +1.

b. Direction of the relationship can be either positive or negative. A positive relationship is indicated by a positive value (e.g., ranging from 0 to 1). A negative relationship is indicated by a negative value (e.g., ranging from 0 to -1). An example of a positive relationship is the relationship between height and weight. The higher the outcome on one variable, the higher the outcome on the other variable. An example of a negative relationship is the relationship between exercise and weight. The higher the outcome on one variable, the lower the outcome on the other variable.

c. Strength of the relationship is measured from 0 to +/-1. The farther the value is away from 0, the stronger the relationship. The approximate criteria for strength are 0 for no effect, .1 for a small effect, .3 for a medium effect, and .5 for a large effect. Notice those values can be either positive or negative, depending upon the direction of the relationship, so a .2 and a -.2 relationship indicate the same strength, but different direction.
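
The direction and strength criteria above can be sketched as a small helper. This is only an illustration: treating the benchmarks (.1, .3, .5) as cut-off points and calling anything below .1 "negligible" are assumptions made here, not rules from the lab, and the function name is hypothetical.

```python
def describe_correlation(r):
    """Label a correlation by direction and approximate strength."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and +1")

    size = abs(r)  # strength ignores the sign
    if size >= 0.5:
        strength = "large"
    elif size >= 0.3:
        strength = "medium"
    elif size >= 0.1:
        strength = "small"
    else:
        strength = "negligible"  # label assumed for values near 0

    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return direction, strength


for r in (0.2, -0.2, 0.35, -0.6):
    print(r, describe_correlation(r))
```

Note that .2 and -.2 receive the same strength label and differ only in direction.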

What does a scatterplot look like? Below are 9 scatterplots that show three examples of a positive relationship in the top row (perfect, strong, weak), three examples of a negative relationship in the middle row (perfect, strong, weak), and three examples of no relationship in the bottom row.

(graph taken from bla website)

(graph taken from Wikipedia)

How do I graph a scatterplot?

1. Select Graphs --> Legacy Dialogs --> Scatter

2. Click "Simple", and "Define"

3. Move appropriate variables into the "Y axis" and "X axis"

4. Click OK.

Output below is for two questions: "commit1" and "commit3". Notice that there is a positive relationship. From this scatterplot, I would anticipate that the correlation between the two variables is positive and medium in size.

What is the purpose of graphing the scatterplot? The purpose of graphing the scatterplot is to look at the relationship between the variables and determine if there are any problems/issues with the data or if the scatterplot indicates anything unique or interesting about the data, such as:

a. How are the data dispersed? For example, in the scatterplot above, it appears all the scores are grouped in the top right quadrant. What does this imply about the questions and/or data in your study? It appears that subjects answered both commit1 and commit3 on the higher part of the scale. In other words, most subjects feel that most people brought to trial did in fact commit the crime (commit1) and that most people convicted by juries did in fact commit the crime (commit3). Thus, when discussing these variables in your paper, just talking about the size and direction of the correlation does not tell the whole story. If you want to also discuss descriptive analysis of the data, you could talk about how the data are distributed at the high end of the scale. In other words, just presenting the correlational analysis (e.g., r = .35, p < .001) may mislead the reader about an interesting distribution of the data.

b. Are there outliers? A scatterplot is useful for "eyeballing" the presence of outliers. Just as a histogram is useful for "eyeballing" univariate outliers, the scatterplot is useful for "eyeballing" bivariate outliers. In a later section I describe how to statistically analyze whether or not bivariate outliers exist.

c. Is the relationship linear?

What is linearity? Linearity is a straight-line relationship between variables.

Why is linearity important? Correlation and regression tests rest upon the assumption of linearity because they only capture linear relationships. Not all relationships are linear. Just as not all variables are normally distributed in the real world, not all relationships are supposed to be linear. For example, there could be a non-linear relationship for USA immigrants between length of residence in the USA and depression. It is a U-shaped relationship. Depression levels start high during the first few years of initial resettlement, then decrease for a while as they adjust to the new environment, and then increase again later in life. Another example of a non-linear relationship is mortality and water consumption. Absence of water increases mortality, middle levels of water decrease mortality, but too much water increases mortality.
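
To see why a U-shaped relationship is a problem for correlation, here is a minimal sketch that computes Pearson's r by hand for one perfectly linear and one perfectly U-shaped set of made-up numbers. The data and the function name `pearson_r` are hypothetical and chosen only for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation from the sums of squares and cross-products."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x = list(range(11))                      # 0, 1, ..., 10
linear = [2 * v + 1 for v in x]          # straight line
u_shaped = [(v - 5) ** 2 for v in x]     # high, low, high again

print(round(pearson_r(x, linear), 3))
print(round(pearson_r(x, u_shaped), 3))
```

In the U-shaped case y is completely determined by x, yet r comes out at zero because the falling half and the rising half cancel each other. A scatterplot would reveal the pattern immediately, but the correlation coefficient would not.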

Correlation and regression only capture linear relationships. For example, all correlations below have the same size and direction, r = .81. BUT only the top-left graph is appropriate for correlational analysis because the other three graphs depict data that can NOT be captured by the formulas for correlation and regression.

(graph taken from Wikipedia)

2. Bivariate (and multivariate) outliers

How do I identify bivariate and multivariate outliers? The procedure for identifying bivariate outliers is the same as for identifying multivariate outliers. The procedure is called Mahalanobis Distances, and it calculates the distance of particular scores from the center cluster of remaining cases. The procedure creates a new column at the end of the data file containing a calculated score for each subject. The newly calculated score is based upon the specific variables entered into the analysis. Thus, you could calculate many different Mahalanobis Distances where you enter different sets of variables into the analysis. For each analysis, a separate score for each subject is created in a new column at the end of the data file. The Mahalanobis Distances score for each subject is considered an outlier if it exceeds a "critical value".

a. The critical value is determined by a table at the back of most textbooks. You can also find the table at this webpage - http://www.ento.vt.edu/~sharov/PopEcol/tables/chisq.html

b. The table involves the "Rejection Regions" for a Chi-Square test. Remember back to the first day of class when we talked about probability distributions and "Rejection Regions". The Rejection Regions for the chi-square test are based upon two factors: the probability level you set, and the degrees of freedom. We will talk later in-depth about these concepts, but for right now what you need to know is that:

c. degrees of freedom for this test is equal to the number of variables under investigation. Thus, if you are analyzing a bivariate relationship, then degrees of freedom = 2. If you are analyzing 3 variables, then degrees of freedom = 3, and so forth.

d. the probability level we set for this test is p < .001.

e. so, if you look at the table, you find the degrees of freedom, then scan to the right until you get to the column associated with 0.001. That is your critical value. For example, the critical value for a bivariate relationship is 13.82.

f. Any Mahalanobis Distances score above that critical value is an outlier.
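
The logic of steps a to f can be sketched outside SPSS for the bivariate case. The sketch below makes several assumptions: the data are made up, the function names are hypothetical, and the score is computed as the squared Mahalanobis distance using the sample covariance matrix, which is the quantity normally compared against a chi-square critical value. For 2 degrees of freedom the chi-square critical value has a closed form, so no table is needed.

```python
import math


def chi_square_critical_df2(alpha):
    """Critical value for df = 2, where P(X > x) = exp(-x / 2)."""
    return -2.0 * math.log(alpha)


def mahalanobis_squared(xs, ys):
    """Squared Mahalanobis distance of each (x, y) case from the centroid."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Sample variances and covariance (n - 1 in the denominator).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2
    scores = []
    for x, y in zip(xs, ys):
        dx, dy = x - mx, y - my
        # d' S^-1 d, written out for a 2 x 2 covariance matrix S
        scores.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return scores


# Hypothetical data: 29 cases close to the line y = x, plus one odd case.
xs = list(range(1, 30)) + [15]
ys = [x + (1 if x % 2 else -1) for x in range(1, 30)] + [44]

critical = chi_square_critical_df2(0.001)
scores = mahalanobis_squared(xs, ys)
flagged = [i for i, d in enumerate(scores) if d > critical]

print(round(critical, 2))   # 13.82
print(flagged)              # [29], the last case
```

The closed form reproduces the 13.82 read from the table, and only the deliberately odd case exceeds it.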

Here is how to calculate Mahalanobis Distances scores:

1. Select Analyze --> Regression --> Linear

2. Move all the variables under investigation into the "Independents" box. In the "Dependent" box, move the subject number variable (numb). For example, if you are interested in the bivariate outlier analysis for "commit1" and "commit3", you move both those variables into the "Independent" box, and move "numb" into the "Dependent" box.

3. Click "Save", and click "Mahalanobis"

4. Click OK.

The newly created variable is saved as "MAH_1"

Output below is for "commit1" and "commit3". In the "Residuals Statistics" box, look for "Mahal. Distance". Look at the "Maximum" score. If that number exceeds your critical value, then an outlier exists. In this case, with 2 variables, the critical value is 13.82. The "Maximum" is listed as a value above 30. Thus, we have at least one outlier.

You identify the outlier(s) by sorting the data by this new variable "MAH_1", and then scrolling to the bottom of the list to find the highest valued scores. You can sort by: Data --> Sort Cases. In this case, we find 11 subjects that have scores above 13.82.

Notice, however, that multivariate outlier analysis is just as arbitrary as univariate outlier analysis. The threshold level is arbitrarily determined, just as the threshold levels of 1.5*IQR and 3*IQR for univariate outliers are arbitrarily determined. Plus, the "eyeball" method of the scatterplot does show some differences when compared to the statistical method of using Mahalanobis Distances scores. For example, if you look at the scatterplot for our two variables (see above), can you identify which 11 subjects are the ones deemed outliers by the Mahalanobis Distances analysis?

3. Correlation

A correlation is easy to conduct:

1. Select Analyze --> Correlate --> Bivariate

2. Move all variables into the "Variable(s)" window.

3. Click OK.

Output below is for two questions "commit1" and "commit3". The "Correlations" box tells you three pieces of information: n = sample size, pearson = size and direction of the relationship, Sig. = significance level. In essence, the "Pearson Correlation" tells you size and direction of the hypothetical line that can be drawn through the data; and "Significance" tells you how likely it is that a line at least this steep would appear by chance alone if there were really no relationship. More specifically, the "Significance" represents a test of whether the line is different from a flat line (e.g. a flat line would be represented by a Pearson correlation = 0). For the data below, there is a positive and medium relationship between the variables, and the probability of a relationship this strong arising by chance alone is p < .001.

Another useful piece of information is R², the coefficient of determination. This is the amount of variance in one variable explained by another variable. R² is not provided in the output, but you can calculate R² by squaring the Pearson Correlation. In our example, therefore, .352 x .352 = .124. If you multiply this by 100, you convert the value into a percentage. Thus, in our example, commit1 explains 12.4% of the variance in commit3, and vice versa. This also means that 87.6% of the variance is unaccounted for because 100 - 12.4 = 87.6.
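
The arithmetic above is easy to check. The short sketch below simply repeats it with the Pearson correlation from the example (.352); the variable names are chosen here for illustration.

```python
r = 0.352                      # Pearson correlation between commit1 and commit3

r_squared = r ** 2             # coefficient of determination
explained_pct = 100 * r_squared
unexplained_pct = 100 - explained_pct

print(round(r_squared, 3))         # 0.124
print(round(explained_pct, 1))     # 12.4
print(round(unexplained_pct, 1))   # 87.6
```

The same squaring turns the size benchmarks for r (.1, .3, .5) into 1%, 9%, and 25% of variance explained.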

WRITE-UP: The report of a correlational study should include the strength of the relationship and the significance level. Some researchers also include the descriptive statistics of each variable. Some researchers also include the R².

a. "There was a positive correlation between the two variables, r = .35, p < .001."

b. "There was a positive correlation between the belief about what percent of people brought to trial did in fact commit the crime (M = 78.39%, SD = 16.33) and the belief about what percent of people convicted by juries did in fact commit the crime (M = 83.22%, SD = 1?.?4), r = .35, p < .001."

c. "There was a positive correlation between the two variables, r = .35, p < .001, with an R² = .124."

EVALUATION:

a. You evaluate correlational analysis by looking at the direction of the relationship between the variables. Is it in the same direction as the research hypothesis?

b. You then look at the significance level. Is the relationship significant? Remember that significance is related to sample size. In a small sample (n of about 30) you may have correlations that don't reach significance, but if the sample size was larger (n = 100), the same correlation would be significant. Also, remember that sample size does not typically affect the strength of the relationship, only the probability that the result was due to chance.

c. You then look at the size of the relationship. Is it strong or weak? Just because the hypothesis is confirmed in the predicted direction does not indicate if the relationship between the variables is strong or important. Strength of the relationship is measured from 0 to +/-1. The farther the value is away from 0, the stronger the relationship. The approximate criteria for strength are 0 for no effect, .1 for a small effect, .3 for a medium effect, and .5 for a large effect. Notice those values can be either positive or negative, depending upon the direction of the relationship, so a .2 and a -.2 relationship indicate the same strength, but different direction.

d. You can also look at R². In terms of percentage of variance explained, small is 1%, medium is 9%, and large is 25%.
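
Point b, that the same correlation can miss significance in a small sample and reach it in a larger one, can be illustrated with the usual t test for a correlation, t = r * sqrt(n - 2) / sqrt(1 - r²) with n - 2 degrees of freedom. In this sketch the correlation of .30 is a hypothetical value, the sample sizes of 30 and 100 are illustrative, and the two-tailed .05 critical values are copied from a standard t table rather than computed.

```python
import math


def t_for_r(r, n):
    """t statistic for testing whether a correlation differs from 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Two-tailed .05 critical t values for df = 28 and df = 98 (from a t table).
T_CRITICAL = {28: 2.048, 98: 1.984}

r = 0.30  # hypothetical medium-sized correlation
for n in (30, 100):
    t = t_for_r(r, n)
    significant = abs(t) > T_CRITICAL[n - 2]
    print(n, round(t, 2), significant)
```

The correlation is identical in both runs, and only the sample size changes. With n = 30 the t statistic falls short of the critical value, while with n = 100 it exceeds it.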

4. Correlation - Multiple

When you conduct correlations, you typically enter MANY variables simultaneously into the analysis, and the output provides all possible bivariate relationships. For example:

1. Select Analyze --> Correlate --> Bivariate

2. Move all variables into the "Variable(s)" window.

3. Click OK.

Output below is for the "forensic" items and "innocence" items. Notice the diagonal is always "1" because a variable is perfectly correlated with itself. Also notice that sample size is different for each bivariate relationship because the default in correlation is "pairwise" deletion. Also, notice that the matrix is a mirror of itself along the diagonal, so the information is depicted twice for each bivariate combination.
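
The three features just described (a diagonal of 1, a different n per pair, and a mirrored matrix) can be reproduced with a minimal sketch of pairwise deletion. The item names and responses below are made up, and `None` is used here to mark a missing answer.

```python
import math


def pearson_pairwise(x, y):
    """Pearson r using only the cases where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy), n


def correlation_matrix(data):
    """Map every (row, column) pair of variables to (r, n)."""
    names = list(data)
    return {(a, b): pearson_pairwise(data[a], data[b]) for a in names for b in names}


# Hypothetical responses from six subjects; None marks a missing answer.
data = {
    "item1": [1, 2, 3, 4, 5, 6],
    "item2": [2, 1, 4, 3, None, 6],
    "item3": [6, 5, None, 3, 2, None],
}

matrix = correlation_matrix(data)
for (row, col), (r, n) in matrix.items():
    print(row, col, round(r, 3), n)
```

Each pair of items keeps every subject who answered both, so item1 with item2 uses 5 cases while item2 with item3 uses only 3.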

Just as you can have correlational output of multiple variables simultaneously, you can have scatterplots of multiple variables simultaneously. The only limitation is that if there are more than 3 variables simultaneously the scatterplots get so small as to be relatively useless. You conduct multiple scatterplots simultaneously by:

1. Select Graphs --> Legacy Dialogs --> Scatter

2. Click "Matrix", and "Define"

3. Move appropriate variables into the "Matrix variables" box

4. Click "Options" and "exclude cases variable by variable"

5. Click OK.

Output below is for the first three "forensic items".

5. Correlation - Partial

Partial correlation is the relationship between two variables while controlling for a third variable. The purpose is to find the unique variance between two variables while eliminating the variance from a third variable. The diagram below from your textbook (page 13?) graphically represents the purpose of partial correlation.

You typically only conduct partial correlation when the third variable has shown a relationship to one or both of the primary variables. In other words, you typically first conduct correlational analysis on all variables so that you can see whether there are significant relationships amongst the variables, including any "third variables" that may have a significant relationship to the variables under investigation. In addition to this statistical pre-requisite, you also want some theoretical reason why the third variable would be impacting the results.

How to conduct partial correlation:

1. Select Analyze --> Correlate --> Partial

2. Move variables into the "Variable(s)" window.

3. Move the variable you want to control for into the "Controlling" box

4. Click "Options" and click "Zero Order correlations" and click "Exclude cases pairwise"

(by clicking "zero order correlations", the output will show both the relationships amongst the variables while controlling for the third variable, and ALSO the relationships amongst the variables without controlling for the third variable. This is useful so that you can easily see the difference between controlling for the variable and not controlling for the variable.)

5. Click OK.

Output below is for the relationship between "commit1" and "commit3" while controlling for "prosecutor1". I included "prosecutor1" as the controlling variable because: (1) statistically, it shows a significant relationship to both commit1 and commit3. You can see that significant relationship in the top part of the "Correlations" box below, which presents the correlations without controlling for a third variable; (2) theoretically, it is possible that the reason why there is a positive correlation between commit1 and commit3 is because prosecutor1 asks "whom do you trust more, defense attorneys or prosecutors", so it is possible the reason why subjects believe defendants brought to trial and convicted at trial are guilty (commit1 and commit3) is because they trust the prosecutor over the defense attorney.

Thus, given this plausible (statistical and theoretical) third-variable relationship, it is interesting to note that controlling for "prosecutor1" barely lowered the strength of the relationship between commit1 and commit3: the outcome while controlling for prosecutor1 was r = .341, p < .001. In other words, the relationship between commit1 and commit3 is NOT due to subjects trusting the prosecutor.

You can conduct Partial Correlation with more than just 1 third-variable. You can include as many third-variables as you wish.
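
For a single control variable, the partial correlation can be computed directly from the three ordinary (zero-order) correlations using the standard first-order formula. The sketch below assumes that formula and uses a hypothetical function name. The two correlations with the control variable (.20 and .15) are made-up values, since the lab does not report them, so the result is not meant to reproduce the .341 above.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y after controlling for one variable z."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# r_xy is the commit1-commit3 correlation from the lab; the other two are hypothetical.
print(round(partial_r(0.352, 0.20, 0.15), 3))

# If z is unrelated to both variables, controlling for it changes nothing.
print(round(partial_r(0.352, 0.0, 0.0), 3))   # 0.352
```

This sketch handles one control variable only. Controlling for several third variables at once requires applying the formula repeatedly or using regression residuals.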

6. Correlation - Point-biserial Correlation

Point-biserial Correlations are conducted when one of the variables is dichotomous, which means it's a categorical variable with only two categories, such as gender: male, female.

FYI: The Point-biserial Correlation is analogous to a "t-test", which we will cover in later weeks. A "t-test" is conducted when you are interested in the relationship between a categorical independent variable (such as gender: male, female) and a continuous dependent variable (such as belief in the death penalty on a 1-7 scale).

You conduct a Point-biserial Correlation the same way that you conduct a regular correlation:

1. Select Analyze --> Correlate --> Bivariate

2. Move all variables into the "Variable(s)" window.

3. Click OK.

The output, write-up, and interpretation are the same as for a regular correlation.
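
The analogy with the t-test can be made concrete. The sketch below uses made-up ratings on a 1-7 scale for two groups coded 0 and 1, and hypothetical function names. It computes an ordinary Pearson correlation between the group code and the rating, converts it to t, and compares that with an independent-samples t computed directly from the group means using a pooled variance.

```python
import math


def pearson_r(x, y):
    """Ordinary Pearson correlation; x may be a 0/1 group code."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pooled_t(group_a, group_b):
    """Independent-samples t statistic with a pooled variance estimate."""
    na, nb = len(group_a), len(group_b)
    ma, mb = sum(group_a) / na, sum(group_b) / nb
    ss_a = sum((v - ma) ** 2 for v in group_a)
    ss_b = sum((v - mb) ** 2 for v in group_b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / math.sqrt(pooled_var * (1 / na + 1 / nb))


# Hypothetical 1-7 ratings for two groups.
group0 = [2, 3, 3, 4, 5]
group1 = [4, 5, 5, 6, 7]

codes = [0] * len(group0) + [1] * len(group1)
scores = group0 + group1
n = len(scores)

r_pb = pearson_r(codes, scores)                              # point-biserial r
t_from_r = r_pb * math.sqrt(n - 2) / math.sqrt(1 - r_pb ** 2)
t_direct = pooled_t(group1, group0)

print(round(r_pb, 3), round(t_from_r, 3), round(t_direct, 3))
```

The two t values agree, which is why the point-biserial correlation and the pooled-variance t-test lead to the same significance decision.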

FYI - If you want to analyze the tests from your classes using the Point-biserial Correlation, you would need to first create a new dichotomous variable (e.g., 1=answered correctly, 2=answered incorrectly).

See "Lab2 Descriptives" for "6. Transforming categorical variables into other categorical variables".
