Lesson Objectives
Residuals
A continuation of regression analysis
Department of ISM, University of Alabama, 1992-2003 M23- Residuals & Minitab 1
- Continue to build on regression analysis.
- Learn how residual plots help identify problems with the analysis.
Example 1:
[Figure: scatterplot of WEIGHT versus HEIGHT with the fitted line $\hat{Y} = -332.73 + 7.189X$]
Residuals = distance from point to line, measured parallel to the Y-axis.
Example 1, continued
Compute the fitted value and residual for the 4th person in the sample; i.e., X = 72 inches, Y = 207 lbs.
Residual: $e_i = y_i - \hat{y}_i$
fitted value $= \hat{y}_4 = -332.73 + 7.189(72) = 184.88$
residual $= e_4 = y_4 - \hat{y}_4 = 207 - 184.88 = +22.12$
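The calculation for the 4th person can be checked with a short Python sketch. The function names are illustrative and not part of the lesson; the coefficients are those of the Example 1 line.

```python
# Coefficients of the fitted line in Example 1 (as given in the lesson).
INTERCEPT = -332.73
SLOPE = 7.189


def fitted_value(x):
    """Predicted weight (lbs) for a height of x inches."""
    return INTERCEPT + SLOPE * x


def residual(x, y):
    """Observed value minus fitted value: e = y - y_hat."""
    return y - fitted_value(x)


# 4th person in the sample: X = 72 inches, Y = 207 lbs.
y_hat_4 = fitted_value(72)
e_4 = residual(72, 207)

print(f"fitted value = {y_hat_4:.2f}")
print(f"residual     = {e_4:+.2f}")
```

A positive residual means the observed point lies above the fitted line.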
Residual Plots
Scatterplot of residuals vs. the predicted means of Y, $\hat{Y}$; or an X-variable.
Example 1, continued
[Figure: Residual Plot, Residuals versus HEIGHT]
Residual Plot
Scatterplot of residuals versus the predicted means of Y, $\hat{Y}$; or an X-variable; or Time.
Expect random dispersion around a horizontal line at zero.
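As a rough sketch of what goes into such a plot, the points of a residual-versus-X plot can be built from the five (HEIGHT, WEIGHT) cases that appear in the lesson's residual table. The variable names are illustrative; the printout is a stand-in for the actual graph.

```python
# The five cases from the lesson (HEIGHT in inches, WEIGHT in lbs).
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]

# Fitted line from Example 1.
INTERCEPT, SLOPE = -332.73, 7.189

# Each plotted point is (x, residual), sorted here by x for readability.
points = sorted(
    (x, y - (INTERCEPT + SLOPE * x)) for x, y in zip(heights, weights)
)

for x, e in points:
    side = "above zero" if e > 0 else "below zero"
    print(f"HEIGHT = {x}  residual = {e:+7.2f}  ({side})")
```

With only five cases it is hard to judge a pattern, but the residuals do fall on both sides of the zero line.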
Residuals versus X
[Figures: residual plots of Residuals versus X, or time. Panel labels: Outliers?; Nonlinear relationship; Variance is increasing.]
Unusual cases:
- Outliers
- High leverage cases
- Influential cases
Property 1.
Y = Weight, X = Height
$\hat{Y} = -332.73 + 7.189X$
Residuals: $e = Y - \hat{Y}$

  X     Y     Y-hat     e = Y - Y-hat
  73    175   192.07    -17.07
  68    158   156.12      1.88
  67    140   ____       ____
  72    207   ____       ____
  62    115   ____       ____
  Sum                     0.01  (round-off error)

Find the sum of the residuals.
$\sum e_i = 0$
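Property 1 can be checked numerically. The sketch below, with illustrative function names, fits the least squares line to the five cases using the usual closed-form formulas and then sums the residuals twice: once with the unrounded coefficients and once with the rounded coefficients used in the lesson.

```python
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]


def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residuals(xs, ys, intercept, slope):
    """Residuals e = y - (intercept + slope * x) for every case."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


a, b = least_squares(heights, weights)

# Unrounded coefficients: the sum is zero up to floating-point error.
exact_sum = sum(residuals(heights, weights, a, b))

# Rounded coefficients from the lesson: a small round-off remainder is left.
rounded_sum = sum(residuals(heights, weights, -332.73, 7.189))

print(f"slope = {b:.3f}, intercept = {a:.2f}")
print(f"sum of residuals (unrounded coefficients) = {exact_sum:.6f}")
print(f"sum of residuals (rounded coefficients)   = {rounded_sum:.2f}")
```

The small nonzero sum with the rounded coefficients is the round-off error noted in the table.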
Property 2.
$\hat{Y} = -332.73 + 7.189X$

  e = Y - Y-hat     e^2
    -17.07          291.38
      1.88            3.53
     -8.93           79.74
     22.12          489.29
      2.01            4.04
  Sum  0.01         867.98

Properties of Least Squares Line
1. Residuals always sum to zero.
2. The least squares line produces a smaller sum of squared residuals than any other straight line can.
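Property 2 can be illustrated by comparing the sum of squared residuals of the least squares line with that of a few other lines. The alternative lines below are arbitrary, hypothetical choices made only for this comparison; they do not come from the lesson.

```python
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]


def sum_squared_residuals(intercept, slope):
    """Sum of squared residuals of the line y = intercept + slope * x."""
    return sum(
        (y - (intercept + slope * x)) ** 2 for x, y in zip(heights, weights)
    )


# Least squares line with the rounded coefficients from the lesson.
sse_least_squares = sum_squared_residuals(-332.73, 7.189)

# Hypothetical alternative lines (assumed values, for comparison only).
alternatives = [(-300.0, 6.7), (-350.0, 7.5), (-332.73, 7.0)]
sse_alternatives = [sum_squared_residuals(a, b) for a, b in alternatives]

print(f"least squares line: {sse_least_squares:.2f}")
for (a, b), sse in zip(alternatives, sse_alternatives):
    print(f"y = {a} + {b}x: {sse:.2f}")
```

The unrounded sum for the least squares line is about 868.04; the table's 867.98 comes from squaring residuals that were already rounded to two decimals.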
Property 3.
$\bar{X} = 68.4$, $\bar{Y} = 159$

Properties of Least Squares Line
1. Residuals always sum to zero.
2. The least squares line produces a smaller sum of squared residuals than any other straight line can.
3. The line always passes through the point $(\bar{x}, \bar{y})$.
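Property 3 can be checked by evaluating the fitted line at the mean height. This sketch uses the rounded coefficients from the lesson, so agreement is only up to round-off.

```python
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]

# Sample means of X and Y.
x_bar = sum(heights) / len(heights)
y_bar = sum(weights) / len(weights)

# Fitted line from the lesson, evaluated at the mean of X.
fitted_at_mean = -332.73 + 7.189 * x_bar

print(f"x_bar = {x_bar}, y_bar = {y_bar}")
print(f"fitted value at x_bar = {fitted_at_mean:.2f}")
```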
[Figure: scatterplot with an unusual point; label: outlier]
Unusual point does not follow the pattern. It is near the X-mean; the entire line is pulled toward it.

[Figure: scatterplot with an unusual point; labels: outlier, influential]
Unusual point is far from the X-mean and does not follow the pattern. Line really twists!

[Figure: scatterplot with an unusual point; label: High leverage]
Unusual point does not follow the pattern. The line is pulled down and twisted slightly.

Definitions:
Outlier: An unusual y-value relative to the pattern of the other cases.

Definitions: continued
Influential Case: has an ...

... & ...: not resistant to unusual cases.
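A small sketch of how a least squares fit reacts to one unusual case: the line is fitted to the five cases of Example 1, then refitted after adding one extra case. The added case (80, 110) is hypothetical, chosen to be far from the X-mean and off the pattern; it is not from the lesson.

```python
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]


def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Fit to the original five cases.
intercept_without, slope_without = least_squares(heights, weights)

# Refit with one hypothetical unusual case added (assumed, not lesson data).
intercept_with, slope_with = least_squares(heights + [80], weights + [110])

print(f"slope without the unusual case: {slope_without:.3f}")
print(f"slope with the unusual case:    {slope_with:.3f}")
```

A single added case changes the slope substantially, which is what it means for the fit not to be resistant.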
Lesson Objectives
- Learn two ways to use Minitab to run a regression analysis.
- Learn how to read output from Minitab.
Example 3, continued
DTDP
[Figure: Scatterplot of Height versus Shoe Size, with Female and Male cases marked separately. Jitter added in X-direction.]
The scatter for each subpopulation is about the same; i.e., there is constant variance.
Example 3, continued
Method 1
Stat > Regression > Regression
Y = a + bX

R-Sq = 79.1%    R-Sq(adj) = 79.0%

Analysis of Variance
Source       DF     SS       MS       F        P
Regression   1      3650.0   3650.0   963.26   0.000
Error        255    966.3    3.8
Total        256    4616.3
Example 3, continued
$\text{R-Sq} = \dfrac{SSR}{TSS} = \dfrac{3650.0}{4616.3} = 79.1\%$
where SSR = 3650.0 is the Regression SS and TSS = 4616.3 is the Total SS in the Analysis of Variance table.
R-Sq(adj) = 79.0%
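The R-Sq value can be reproduced from the Analysis of Variance numbers. The sketch below also computes R-Sq(adj) with the standard adjusted formula, 1 - (SSE/DF error)/(TSS/DF total); that formula is not shown in the lesson and is included here as an assumption about how the 79.0% was obtained.

```python
# Values from the Analysis of Variance table in the Minitab output.
ssr = 3650.0      # Regression SS
sse = 966.3       # Error SS
tss = 4616.3      # Total SS
df_error = 255
df_total = 256

# R-Sq: share of the total variation explained by the regression.
r_sq = ssr / tss

# Adjusted R-Sq (standard formula, assumed; not shown in the lesson).
r_sq_adj = 1 - (sse / df_error) / (tss / df_total)

print(f"R-Sq      = {100 * r_sq:.1f}%")
print(f"R-Sq(adj) = {100 * r_sq_adj:.1f}%")
```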
Example 3, continued
Predictor    Coef      SE Coef   T       P
Constant     50.5230   0.5912    85.45   0.000
Shoe Siz     1.87241   0.06033   31.04   0.000

S = 1.947    R-Sq = 79.1%    R-Sq(adj) = 79.0%

S is a measure of variation around the regression line: $S = \sqrt{MSE} = \sqrt{3.8}$. (The 3.8 is rounded; the unrounded MSE, 966.3/255 = 3.789, gives S = 1.947.)

Analysis of Variance
Source       DF     MS
Regression   1      3650.0
Error        255    3.8
Total        256

Error SS = sum of squared residuals; Error MS = MSE.
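As a quick check of how S relates to the Analysis of Variance table, the sketch below computes MSE as Error SS divided by Error DF and takes its square root. The variable names are illustrative.

```python
import math

# Error row of the Analysis of Variance table.
error_ss = 966.3   # sum of squared residuals
error_df = 255

# Mean squared error and its square root, S.
mse = error_ss / error_df
s = math.sqrt(mse)

# Using the rounded MSE of 3.8 instead gives a slightly different value.
s_from_rounded_mse = math.sqrt(3.8)

print(f"MSE = {mse:.4f}")
print(f"S   = {s:.3f}")
print(f"S from rounded MSE = {s_from_rounded_mse:.3f}")
```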
[Figure: residual plot of Residual versus Shoe Siz. No jitter added.]
Example 3, continued
Method 2
Regression
Y = a + bX
Example 3, continued
[Figure: plot with Shoe Size on the horizontal axis]
R-Sq = 79.1%    R-Sq(adj) = 79.0%

Analysis of Variance
Source       DF     SS       MS       F        P
Regression   1      3650.0   3650.0   963.26   0.000
Error        255    966.3    3.8
Total        256    4616.3
If the P-value IS SMALL (typically < 0.10), then conclude:
1. It is unlikely that the true coefficient is really zero, and therefore,
2. The X variable IS a useful predictor for the Y variable. Keep the variable!

If the P-value is NOT SMALL (i.e., > 0.10), then conclude:
1. For all practical purposes the true coefficient MAY BE ZERO; therefore
2. The X variable IS NOT a useful predictor of the Y variable. Don't use it.
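The rule of thumb above can be written as a small function. This is a minimal sketch: the function name and return strings are illustrative, and treating a P-value of exactly 0.10 as "not small" is an assumption, since the lesson does not cover that boundary case.

```python
def interpret_p_value(p_value, cutoff=0.10):
    """Apply the lesson's rule of thumb to one coefficient's P-value."""
    if p_value < cutoff:
        # Small P-value: unlikely that the true coefficient is zero.
        return "useful predictor: keep the variable"
    # Not small (exactly equal to the cutoff is treated as not small here,
    # which is an assumption): the true coefficient may be zero.
    return "not a useful predictor: do not use it"


# P-value reported for Shoe Size in the Minitab output.
shoe_size_verdict = interpret_p_value(0.000)
print(shoe_size_verdict)

# A hypothetical large P-value, for contrast.
print(interpret_p_value(0.45))
```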
Could shoe size have a true coefficient that is actually zero?

Regression Analysis: Height versus Shoe Size
The regression equation is
Height = 50.5 + 1.87 Shoe Size

Predictor    Coef      SE Coef   T       P
Constant     50.5230   0.5912    85.45   0.000
Shoe Siz     1.87241   0.06033   31.04   0.000

S = 1.947    R-Sq = 79.1%    R-Sq(adj) = 79.0%

Analysis of Variance
Source       DF     SS       MS       F        P
Regression   1      3650.0   3650.0   963.26   0.000
Error        255    966.3    3.8
Total        256    4616.3

T: t measures how many standard errors the estimated coefficient is from zero.
P-value: a measure of the likelihood that the true coefficient is zero (more precisely, the probability of seeing an estimate at least this far from zero if the true coefficient were zero).

Conclusion:
The shoe size coefficient is NOT zero!
Shoe size IS a useful predictor.
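The T value for Shoe Size can be reproduced from the Coef and SE Coef columns. The sketch also compares its square with the F value in the Analysis of Variance table; in a regression with one predictor the two agree, apart from rounding in the printed output.

```python
# Shoe Size row of the Minitab coefficient table.
coef = 1.87241
se_coef = 0.06033

# t = number of standard errors the estimated coefficient is from zero.
t_ratio = coef / se_coef

# F value reported in the Analysis of Variance table.
f_reported = 963.26

print(f"t   = {t_ratio:.2f}")
print(f"t^2 = {t_ratio ** 2:.2f} (reported F = {f_reported})")
```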
statistical inference.
This will be covered in more detail during the last three weeks of the course.