Correlation
Explanatory (Independent) Variable    Response (Dependent) Variable
Hours of Training                     Number of Accidents
Shoe Size                             Height
Cigarettes smoked per day             Lung Capacity
Score on SAT                          Grade Point Average
Height                                IQ
Correlation
Correlation measures and describes the strength and direction of the relationship between two variables.
Bivariate techniques require two variable scores from the same individuals (a dependent and an independent variable).
Multivariate techniques apply when there is more than one independent variable (e.g., the effect of advertising and prices on sales).
Variables must be measured on a ratio or interval scale.
[Figure: scatter plot of Accidents (y-axis, 0 to 60) against Hours of Training (x-axis, 0 to 20)]
[Figure: scatter plot of GPA (y-axis, 1.50 to 4.00) against Math SAT score (x-axis, 300 to 800); x = SAT score, y = GPA]
[Figure: scatter plot of IQ (y-axis, 80 to 160) against Height (x-axis, 60 to 80); x = height, y = IQ]
No linear correlation, but a non-linear relationship is still possible!
Correlation Coefficient r
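The correlation coefficient r ranges from -1 to 1.
If r is close to 0, there is no linear correlation.
If r is close to 1, there is a strong positive correlation.
If r is close to -1, there is a strong negative correlation.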
Outliers
Outliers are dangerous: a single extreme point can strongly change the value of r.
Application
[Figure: scatter plot of Final Grade (y-axis, 40 to 95) against Absences (x-axis, 0 to 16)]

Absences (x)    Final Grade (y)
 8              78
 2              92
 5              90
12              58
15              43
 9              74
 6              81
Computation of r
        x      y      xy     x²      y²
1       8     78     624     64    6084
2       2     92     184      4    8464
3       5     90     450     25    8100
4      12     58     696    144    3364
5      15     43     645    225    1849
6       9     74     666     81    5476
7       6     81     486     36    6561
Sum    57    516    3751    579   39898
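The column totals above feed the usual computational formula, r = (nΣxy - ΣxΣy) / (√(nΣx² - (Σx)²) · √(nΣy² - (Σy)²)). As a minimal sketch of that arithmetic on the absences and final-grade data (the function name `pearson_r` is illustrative, not from the slides):

```python
import math

# Data from the Application example: x = absences, y = final grade
x = [8, 2, 5, 12, 15, 9, 6]
y = [78, 92, 90, 58, 43, 74, 81]

def pearson_r(xs, ys):
    """Pearson correlation coefficient via the computational (sums) formula."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    sum_y2 = sum(b * b for b in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r = pearson_r(x, y)
print(round(r, 3))  # -0.975
```

The negative sign reflects that grades fall as absences rise.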
The population correlation coefficient is denoted ρ (rho).
Test of Significance
The correlation between the number of times absent and the final grade is r = -0.975. There were seven pairs of data. Test the significance of this correlation. Use α = 0.01.
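Rejection Regions
Critical values ±t0: with df = n - 2 = 5 and α = 0.01 (two-tailed), t0 = 4.032, so the rejection regions are t < -4.032 and t > 4.032.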
df\p   0.40       0.25       0.10       0.05       0.025      0.01       0.005      0.0005
1      0.324920   1.000000   3.077684   6.313752   12.70620   31.82052   63.65674   636.6192
2      0.288675   0.816497   1.885618   2.919986   4.30265    6.96456    9.92484    31.5991
3      0.276671   0.764892   1.637744   2.353363   3.18245    4.54070    5.84091    12.9240
4      0.270722   0.740697   1.533206   2.131847   2.77645    3.74695    4.60409    8.6103
5      0.267181   0.726687   1.475884   2.015048   2.57058    3.36493    4.03214    6.8688
[Figure: t-distribution with rejection regions beyond t = -4.032 and t = +4.032]
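The surviving text does not show the test statistic itself. Assuming the standard t-test for a correlation coefficient, with H0: ρ = 0 against Ha: ρ ≠ 0 and t = r / √((1 - r²) / (n - 2)), the decision can be sketched as follows (variable names are illustrative):

```python
import math

r = -0.975          # sample correlation between absences and final grade
n = 7               # number of data pairs
t_critical = 4.032  # table value for df = 5 at p = 0.005 (two-tailed alpha = 0.01)

df = n - 2
# Assumed standard test statistic for H0: rho = 0
t_stat = r / math.sqrt((1 - r ** 2) / df)

# Reject H0 when t falls in either rejection region
reject_null = abs(t_stat) > t_critical
print(round(t_stat, 2), reject_null)
```

Under that assumption t is roughly -9.8, well inside the left rejection region, so the correlation is significant at α = 0.01.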
(xi, yi) = a data point
(xi, ŷi) = a point on the line with the same x-value
yi - ŷi = a residual
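To make the residual definition concrete, here is a small sketch using the absences and final-grade data from the Application example. It fits the least-squares line from deviations about the means, lists the residuals, and compares the sum of squared residuals with that of a deliberately nudged line. The helper names and the nudged slope are hypothetical, chosen only for illustration.

```python
# Data from the Application example: x = absences, y = final grade
x = [8, 2, 5, 12, 15, 9, 6]
y = [78, 92, 90, 58, 43, 74, 81]

def least_squares_line(xs, ys):
    """Slope and intercept computed from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    m = sxy / sxx
    return m, y_bar - m * x_bar

def residuals(xs, ys, m, b):
    """Observed y minus the y on the line at the same x, for each point."""
    return [yi - (m * xi + b) for xi, yi in zip(xs, ys)]

def sum_of_squares(values):
    return sum(v * v for v in values)

m, b = least_squares_line(x, y)
fitted = residuals(x, y, m, b)
# Hypothetical alternative line (slope nudged by 0.5), for comparison only
nudged = residuals(x, y, m + 0.5, b)

print([round(d, 2) for d in fitted])
print(sum_of_squares(fitted) < sum_of_squares(nudged))  # True
```

The residuals of the least-squares line sum to zero (up to rounding), and any other line gives a larger sum of squared residuals.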
Best fitting straight line
[Figure: scatter plot of revenue (y-axis, 180 to 260) against Ad $ (x-axis, 1.5 to 3.0) with the best-fitting straight line]
Column totals for the absences (x) and final-grade (y) data, n = 7:
Σx = 57, Σy = 516, Σxy = 3751, Σx² = 579, Σy² = 39898
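Calculate m and b.
m = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) = (7(3751) - (57)(516)) / (7(579) - 57²) = -3155 / 804 ≈ -3.924
b = ȳ - m x̄ ≈ 73.714 - (-3.924)(8.143) ≈ 105.667
The regression line is ŷ = -3.924x + 105.667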
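As a quick check of the slope and intercept, the sketch below plugs the column totals into the least-squares formulas m = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) and b = ȳ - m x̄. Variable names are illustrative.

```python
# Column totals for the absences (x) and final-grade (y) data
n = 7
sum_x, sum_y = 57, 516
sum_xy, sum_x2 = 3751, 579

# Slope of the least-squares line
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
# Intercept: mean of y minus slope times mean of x
b = sum_y / n - m * sum_x / n

print(round(m, 3), round(b, 3))
```

At full precision the intercept comes out near 105.668. The 105.667 quoted in the text is consistent with rounding m and the two means to three decimals before computing b.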
[Figure: scatter plot of Final Grade against Absences (x-axis, 0 to 16) with the regression line drawn through the points]
Predicting y Values
The regression line can be used to predict values of y
for values of x falling within the range of the data.
The regression equation for number of times absent and final grade is:
ŷ = -3.924x + 105.667
Use this equation to predict the expected grade for a student with
(a) 3 absences
(b) 12 absences
(a) ŷ = -3.924(3) + 105.667 = 93.895
(b) ŷ = -3.924(12) + 105.667 = 58.579
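The two predictions can be reproduced with a small helper. This is a sketch only: the function name `predict_grade` and the decision to raise an error outside the observed range of absences (2 to 15) are assumptions made here to reflect the within-range caveat, not something the slides prescribe.

```python
# Regression line for absences (x) and final grade (y), as given in the text
SLOPE = -3.924
INTERCEPT = 105.667

# Smallest and largest number of absences in the data set
X_MIN, X_MAX = 2, 15

def predict_grade(absences):
    """Predict a final grade, refusing to extrapolate beyond the data."""
    if not X_MIN <= absences <= X_MAX:
        raise ValueError("absences outside the range of the data (2 to 15)")
    return SLOPE * absences + INTERCEPT

for absences in (3, 12):
    print(absences, round(predict_grade(absences), 3))
```

For 3 and 12 absences this gives predicted grades of about 93.895 and 58.579.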