Correlation Analysis

CEIT 421

Introduction

Think about these cases:


A university decided to look at the relationship between
calcium intake and knowledge about calcium in sports
science students.
The local ice cream shop wants to examine the
correlation between the amount of ice cream they
sell and the temperature on summer days.
A researcher decides to determine whether there is a
relationship between high school grades and family
income.

Introduction
Correlation is a statistical technique that is used to measure
and describe the relationship between two variables.
In correlation analysis there is no attempt to control or
manipulate the variables.
A correlation requires two scores for each individual.
The most common correlation is the Pearson correlation (or
the Pearson product-moment correlation).
A correlation is expressed as a correlation coefficient, denoted r.

Introduction
Representing Correlation in a Graph: the Scatterplot
The scatterplot is the simplest of all the multiple-variable graphs.
Use scatterplots to determine the relationship between two
continuous variables and to discover whether two continuous
variables are correlated. Correlation indicates how closely two
variables are related.

The Characteristics of A Relationship


1- The Direction of the Relationship
The sign of the correlation, positive or negative, describes
the direction of the relationship.
In a positive correlation, the two variables tend to change
in the SAME direction.
In a negative correlation, the two variables tend to change
in OPPOSITE directions.

The Characteristics of A Relationship


Positive Correlation vs Negative Correlation

If a correlation is positive, it means that:
if one variable (x) increases, the other (y) increases,
or if one variable (x) decreases, the other (y) decreases.

If a correlation is negative, it means that:
if one variable (x) increases, the other (y) decreases,
or if one variable (x) decreases, the other (y) increases.
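
The direction rule can be sketched in a few lines of Python. The sign of r matches the sign of the sum of products of deviations from the means, so checking that sum is enough to tell the direction. The function name `direction` and the temperature and sales figures are made up for illustration and do not come from the slides.

```python
def direction(xs, ys):
    """Return the direction of the relationship between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations: positive when x and y tend to sit on the
    # same side of their means, negative when they sit on opposite sides.
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sp > 0:
        return "positive"
    if sp < 0:
        return "negative"
    return "none"

# Hypothetical data, invented for this sketch only.
temperature = [18, 22, 26, 30, 34]        # degrees Celsius
ice_cream_sales = [40, 55, 70, 90, 120]   # cones sold
hot_drink_sales = [80, 60, 45, 30, 10]    # cups sold

print(direction(temperature, ice_cream_sales))  # positive
print(direction(temperature, hot_drink_sales))  # negative
```

This only reports the direction; it says nothing yet about how strong the relationship is.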

The Characteristics of A Relationship


2- The Form of the Relationship
In correlation,
the relationship tends to have a linear form:
the points in the scatterplot tend to cluster around a
straight line.
[Figure: two scatterplots, one labeled Negative and one labeled Positive]

The Characteristics of A Relationship


3- The Strength of the Relationship
The following guidelines help interpret the strength of positive (+) or
negative (-) correlations. The value of r is always between -1 and +1 (-1 ≤ r ≤ +1).

When r =          Interpretation
+.70 or higher    Very strong positive relationship
+.40 to +.69      Strong positive relationship
+.30 to +.39      Moderate positive relationship
+.20 to +.29      Weak positive relationship
+.01 to +.19      No or negligible relationship
-.01 to -.19      No or negligible relationship
-.20 to -.29      Weak negative relationship
-.30 to -.39      Moderate negative relationship
-.40 to -.69      Strong negative relationship
-.70 or lower     Very strong negative relationship
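
As a rough illustration, the guidelines can be written as a small lookup function. The name `describe_strength` is hypothetical. The sketch also assumes that any value smaller than .20 in absolute size (including exactly 0) counts as negligible, and that values falling between two listed bands belong to the lower band; the slides do not state either point.

```python
def describe_strength(r):
    """Map a correlation coefficient to the verbal label from the guidelines."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size >= 0.70:
        strength = "very strong"
    elif size >= 0.40:
        strength = "strong"
    elif size >= 0.30:
        strength = "moderate"
    elif size >= 0.20:
        strength = "weak"
    else:
        # Assumption: anything below .20 in absolute size, including 0,
        # is treated as negligible.
        return "no or negligible relationship"
    sign = "positive" if r > 0 else "negative"
    return f"{strength} {sign} relationship"

print(describe_strength(0.94))    # very strong positive relationship
print(describe_strength(-0.35))   # moderate negative relationship
print(describe_strength(0.05))    # no or negligible relationship
```

The cut-offs are conventions rather than fixed rules, so the thresholds may need adjusting to match whichever guideline a course or journal prefers.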

The Characteristics of A Relationship

3- The Strength of the Relationship in a Scatterplot

The Kinds of Variables

Continuous Variables vs Categorical Variables

Continuous Variables: Interval, Ratio
Categorical Variables: Nominal, Dichotomous, Ordinal

Correlation Coefficient - r
How to calculate r by hand
Subject   Age (X)   Glucose Level (Y)   X*X     Y*Y     X*Y
1         43        99                  1849    9801    4257
2         21        65                  441     4225    1365
3         25        79                  625     6241    1975
4         42        75                  1764    5625    3150
5         57        87                  3249    7569    4959
6         59        81                  3481    6561    4779
Sum       247       486                 11409   40022   20485

Σx = 247
Σy = 486
Σxy = 20,485
Σ(x*x) = 11,409
Σ(y*y) = 40,022
n is the sample size; in our case n = 6

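The correlation coefficient (r):

$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$$

$$r = \frac{6(20{,}485) - (247)(486)}{\sqrt{\left[6(11{,}409) - 247^2\right]\left[6(40{,}022) - 486^2\right]}} = \frac{2{,}868}{\sqrt{(7{,}445)(3{,}936)}} \approx 0.5298$$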
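
The hand calculation can be checked with a short Python sketch that follows the same steps: build the five sums, then plug them into the formula. The function name `pearson_r` is chosen here for illustration; the age and glucose values are the six subjects from the table.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from raw sums, mirroring the hand method."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator

age = [43, 21, 25, 42, 57, 59]       # X, from the table
glucose = [99, 65, 79, 75, 87, 81]   # Y, from the table

r = pearson_r(age, glucose)
print(round(r, 4))  # 0.5298
```

By the strength guidelines given earlier, a value of about .53 would be read as a strong positive relationship.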

Limitation
Correlation is not causation
Correlation doesn't imply causation and there is no way
to determine or prove causation from a correlational
study.

Example
Research question: Is there a relationship between calcium intake and knowledge
about calcium in sports science students?
Variables: Continuous
1-) Knowledge score (Out of 50)
2-) Calcium intake (mg/day)
Hypotheses:
The 'null hypothesis' might be:
H0: There is no correlation between calcium intake and knowledge about calcium in
sports science students (equivalent to saying r = 0)
The 'alternative hypothesis' might be:
H1: There is a correlation between calcium intake and knowledge about calcium in
sports science students (equivalent to saying r ≠ 0).

Example
1- Open the IBM SPSS Statistics software
2- Click on the Variable View button and enter the names of your variables without any spaces

Example
3- Click on Data View button and enter the data corresponding to each
variable

Example
4- Go to Analyze > Correlate and click Bivariate

Example
5- Move the two variables from the left box to the right box using the arrow button in the middle
6- Ensure the Pearson box is ticked, then click the OK button

Example
7- Look at the first row, labeled Pearson Correlation
8- Analyze the correlation output in terms of the following aspects:
the correlation coefficient, the r value (r = 0.94)
the significance of the correlation, the p value (p < 0.05)
the sample size (N = 10)

Example
9- Interpret the results
The table is prepared based on APA style.

A Pearson product-moment correlation was conducted to investigate the
relationship between calcium intake and knowledge about calcium in sports
science students. The results, shown in Table 1, indicated a very strong,
statistically significant positive correlation between calcium intake and
knowledge about calcium in sports science students, r(8) = 0.94, p < 0.05.
It can be concluded that sports science students who know more about
calcium tend to have a higher calcium intake, and vice versa.
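
The 8 in r(8) is the degrees of freedom, N - 2 = 10 - 2. A common way to test the null hypothesis r = 0 is to convert r into a t statistic, t = r * sqrt(N - 2) / sqrt(1 - r*r), and compare it with a critical value. The sketch below assumes a two-tailed test at the .05 level and takes the critical value 2.306 for 8 degrees of freedom from a standard t table. The function name `t_from_r` is hypothetical, and this is not a reproduction of the software's own output.

```python
from math import sqrt

def t_from_r(r, n):
    """Convert a Pearson r from n paired scores into a t statistic and its df."""
    df = n - 2
    t = r * sqrt(df) / sqrt(1 - r ** 2)
    return t, df

# Values reported in the example output: r = 0.94, N = 10.
t, df = t_from_r(0.94, 10)

# Assumption: two-tailed test at alpha = .05; 2.306 is the tabled critical t for df = 8.
T_CRITICAL_DF8 = 2.306

print(df, round(t, 2))          # 8 7.79
print(abs(t) > T_CRITICAL_DF8)  # True
```

Because the t statistic is well above the critical value, the result agrees with the reported p < 0.05.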

Example
Draw a scatter plot
1- Go to Graphs > Legacy Dialogs and choose
Scatter/Dot

Example
Draw a scatter plot
2- Select Simple Scatter and then click Define button

Example
Draw a scatter plot
3- Drag one variable into Y Axis and the other variable into X
Axis, then click OK

Example
Draw a scatter plot
4- Interpret the scatter plot output

The scatter plot shows that the points lie reasonably
close to an underlying straight line (as opposed to
a curve or no pattern at all), so we say there is a
strong linear relationship between the two variables.
The scatter plot indicates that as the knowledge score
increases, calcium intake also increases.

Classroom Activity

Ice Cream Sales
