Regression
OBJECTIVES
• Define what is meant by linear
regression and correlation
• Construct the line of regression on a
scattergram by calculating the
intercept and slope
• Interpret the relationships using
predictions, r and r²
When two or more variables are
measured on each individual, we can
estimate the relationship of one
variable with another by expressing
one as a linear function of the other.
[Figure: number of rings in a tree (Y) plotted against age in years (X)]
Y = X
[Figure: height of tree in mm (Y) plotted against age in days (X)]
Y = bX
b = 1/7 = 0.143
Hence Y = 0.143X
[Figure: Y plotted against mcg of drug/mm³ in blood (X)]
Y = a + bX
Regression Analysis
The regression equation is Y = a + bX, where:
• Y is the average predicted value of Y for
any X.
• a is the Y-intercept. It is the estimated Y
value when X = 0.
• b is the slope of the line, or the average
change in Y for each change of one unit in
X.
• The least squares principle is used to
obtain a and b.
Regression Analysis
• The least squares principle is used
to obtain a and b. The equations to
determine a and b are:
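$$b = \frac{n\left(\sum XY\right) - \left(\sum X\right)\left(\sum Y\right)}{n\left(\sum X^2\right) - \left(\sum X\right)^2}$$

$$a = \frac{\sum Y}{n} - b\,\frac{\sum X}{n}$$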
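As a minimal sketch, the formulas for a and b above can be written directly in Python. The function name `least_squares` and the four data points are hypothetical, chosen only so the result is easy to check by hand (the points lie exactly on Y = 1 + 2X).

```python
def least_squares(xs, ys):
    """Return (a, b) for the line Y = a + bX using the sum formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Slope: b = [n(ΣXY) - (ΣX)(ΣY)] / [n(ΣX²) - (ΣX)²]
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: a = ΣY/n - b(ΣX/n)
    a = sum_y / n - b * sum_x / n
    return a, b


# Hypothetical data lying exactly on Y = 1 + 2X (not from the slides)
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
a, b = least_squares(xs, ys)
print(f"a = {a}, b = {b}")
```

Because the illustrative points sit exactly on a line, the fitted intercept and slope recover that line; with real data the points scatter around the fitted line instead.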
EXAMPLE 1
• Nadeem Arshad, the student body president
at Punjab University, is concerned about the
cost to students of textbooks. He believes
there is a relationship between the number of
pages in the text and the selling price of the
book.
• To provide insight into the problem he
selects a sample of eight textbooks currently
on sale in the bookstore. Draw a scatter
diagram. Compute the correlation
coefficient.
EXAMPLE 1 continued
[Figure: scatter diagram of Price ($) against Page]
Example 1 continued
$$a = \frac{636}{8} - 0.05143\left(\frac{4{,}900}{8}\right) = 48.0$$
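The intercept calculation above can be checked numerically, and the resulting line used for a prediction. This is only a sketch: the sums (636 and 4,900), n = 8 and the slope 0.05143 are taken from the example, while the function name `predict_price` and the 800-page input are hypothetical.

```python
# Values taken from Example 1
n = 8
sum_y = 636      # sum of the eight prices
sum_x = 4900     # sum of the eight page counts
b = 0.05143      # slope used in the example

# Intercept: a = ΣY/n - b(ΣX/n), about 48.0
a = sum_y / n - b * sum_x / n


def predict_price(pages):
    """Predicted price from the fitted line Y = a + bX."""
    return a + b * pages


# Hypothetical page count, used only to illustrate a prediction
print(round(a, 1))
print(round(predict_price(800), 2))
```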
Example continued
[Figure: scatter plot with knee circumference (cm) on the X axis]
[Figures: scatter plots of y against x]
Perfect Negative Correlation
[Figure: scatter plot of Y against X, both axes 0 to 10]
Perfect Positive Correlation
[Figure: scatter plot of Y against X, both axes 0 to 10]
Zero Correlation
[Figure: scatter plot of Y against X, both axes 0 to 10]
Strong Positive Correlation
[Figure: scatter plot of Y against X, both axes 0 to 10]
The Coefficient of Correlation, r
The Coefficient of Correlation (r) is a
measure of the strength of the
relationship between two variables.
It requires interval or ratio-scaled data.
It can range from -1.00 to 1.00.
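$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{(n - 1)\, s_x s_y} = \frac{n\left(\sum XY\right) - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[n\left(\sum X^2\right) - \left(\sum X\right)^2\right]\left[n\left(\sum Y^2\right) - \left(\sum Y\right)^2\right]}}$$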
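A short sketch of both forms of the formula for r in Python, to show that they give the same value. The function names and the five data points are hypothetical and are not taken from the slides; s_x and s_y are computed as sample standard deviations (divisor n - 1).

```python
import math


def correlation_from_sums(xs, ys):
    """r from the computational (sum) form of the formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


def correlation_from_deviations(xs, ys):
    """r from the deviation form, using sample standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / ((n - 1) * s_x * s_y)


# Hypothetical data (not from the slides)
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r_sums = correlation_from_sums(xs, ys)
r_dev = correlation_from_deviations(xs, ys)
print(round(r_sums, 3), round(r_dev, 3))
```

For these illustrative points both functions give r of about 0.8, a fairly strong positive correlation; points lying exactly on a rising line would give r = 1.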
Coefficient of Determination
The coefficient of determination (r²) is
the proportion of the total variation in the
dependent variable (Y) that is explained
or accounted for by the variation in the
independent variable (X).
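One way to see the "proportion of variation explained" reading is to fit the least squares line and compare the unexplained variation (squared residuals) with the total variation of Y about its mean. The sketch below assumes simple linear regression with an intercept; the function names and data are hypothetical and are not taken from the slides.

```python
def fit_line(xs, ys):
    """Least squares intercept a and slope b for Y = a + bX."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


def r_squared(xs, ys):
    """Proportion of the total variation in Y explained by the line."""
    a, b = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    # Unexplained variation: squared distances from the fitted line
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation: squared distances from the mean of Y
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst


# Hypothetical data (not from the slides)
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r2 = r_squared(xs, ys)
print(round(r2, 2))
```

For these illustrative points the result is about 0.64, which matches the square of their correlation coefficient (about 0.8).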
[Figure: scatter diagram of Price ($) against Page]
Example 1 continued
r = 0.614
r² = 0.3769