
ST 260, M23 Residuals & Minitab

Residuals
A continuation of regression analysis
Department of ISM, University of Alabama, 1992-2003

Lesson Objectives

- Continue to build on regression analysis.
- Learn how residual plots help identify problems with the analysis.

Example 1, continued

Sample of n = 5 students, Y = Weight in pounds, X = Height in inches.

Case   X (Height)   Y (Weight)
1      73           175
2      68           158
3      67           140
4      72           207
5      62           115

Prediction equation: predicted Wt = -332.73 + 7.189 Ht

r-square = ?   Std. error = ?   (To be found later.)

[Scatterplot: WEIGHT (100 to 220) versus HEIGHT (60 to 76), with the fitted line Y = -332.7 + 7.189X.]

Residuals = distance from point to line, measured parallel to the Y-axis.

Example 1, continued

Calculation: For each case,

residual = observed value - estimated mean

For the ith case,

$e_i = y_i - \hat{y}_i$

Compute the fitted value and residual for the 4th person in the sample; i.e., X = 72 inches, Y = 207 lbs.

fitted value = $\hat{y}_4 = -332.73 + 7.189(72) = 184.88$

residual = $e_4 = y_4 - \hat{y}_4 = 207 - 184.88 = 22.12$
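The fitted value and residual for the 4th case can be checked with a few lines of code. This is a minimal sketch in Python, which is not part of the course materials (the course uses Minitab); the names `fitted_value` and `residual` are illustrative, and the coefficients are the rounded values from the prediction equation.

```python
# Rounded coefficients from the prediction equation:
# predicted weight = -332.73 + 7.189 * height
INTERCEPT = -332.73
SLOPE = 7.189


def fitted_value(height_in: float) -> float:
    """Estimated mean weight (lb) for a given height (in)."""
    return INTERCEPT + SLOPE * height_in


def residual(height_in: float, weight_lb: float) -> float:
    """Observed value minus fitted value."""
    return weight_lb - fitted_value(height_in)


# 4th person in the sample: X = 72 inches, Y = 207 lbs.
y_hat_4 = fitted_value(72)
e_4 = residual(72, 207)
print(f"fitted value = {y_hat_4:.2f}, residual = {e_4:+.2f}")
# prints: fitted value = 184.88, residual = +22.12
```

A positive residual means the observed weight lies above the line.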

Residual Plots
Scatterplot of residuals vs. the predicted means of Y, $\hat{Y}$, or an X-variable.

Example 1, continued

[Scatterplot: WEIGHT versus HEIGHT with the fitted line Y = -332.7 + 7.189X; the residual for the 4th case is marked, e4 = +22.12.]

Residuals = distance from point to line, measured parallel to the Y-axis.

Example 1, continued

Residual Plot

[Residual plot: Residuals (-24 to 24) versus HEIGHT (60 to 76).]

Regression line from previous plot is rotated to horizontal.

e4 is the residual for the 4th case, = +22.12.

Residual Plot
Scatterplot of residuals versus the predicted means of Y, $\hat{Y}$, or an X-variable, or Time.
Expect random dispersion around a horizontal line at zero.

Problems occur if:
- Unusual patterns
- Unusual cases

Residuals versus X

[Four residual plots: Residuals versus X, or time.]

- Good random pattern.
- Outliers? Next step: ________ to determine if a recording error has occurred.
- Nonlinear relationship. Next step: Add a quadratic term, or use ______.
- Variance is increasing. Next step: Stabilize variance by using ________.

Residual Plots help identify

Unusual patterns:
- Possible curvature in the data.
- Variances that are not constant as X changes.

Unusual cases:
- Outliers
- High leverage cases
- Influential cases

Three properties of Residuals
illustrated with some computations.

Property 1.

Properties of Least Squares Line
1. Residuals always sum to zero.

Y = Weight, X = Height, $\hat{Y} = -332.73 + 7.189X$

Find the sum of the residuals.

X     Y     Fitted Y   e = Y - Fitted Y
73    175   192.07     -17.07
68    158   156.12       1.88
67    140   148.93      -8.93
72    207   184.88      22.12
62    115   112.99       2.01
            Sum:          0.01  (round-off error)

$\sum e_i = 0$.
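Property 1 and the round-off note can be reproduced numerically. The sketch below is an illustration in Python (not the course's Minitab workflow); `least_squares` and `residuals` are hypothetical helper names, and the slope and intercept use the standard closed-form least squares formulas. It compares the sum of residuals under the rounded coefficients with the sum under the unrounded least squares coefficients.

```python
heights = [73, 68, 67, 72, 62]       # X, inches
weights = [175, 158, 140, 207, 115]  # Y, pounds


def least_squares(x, y):
    """Return (intercept, slope) of the least squares line for paired data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residuals(x, y, intercept, slope):
    """Observed minus fitted value for every case."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Rounded coefficients: the sum is close to, but not exactly, zero.
sum_rounded = sum(residuals(heights, weights, -332.73, 7.189))

# Unrounded coefficients: the sum is zero up to floating-point error.
a, b = least_squares(heights, weights)
sum_exact = sum(residuals(heights, weights, a, b))

print(f"rounded coefficients: sum of residuals = {sum_rounded:.3f}")
print(f"exact coefficients:   sum of residuals = {sum_exact:.1e}")
```

The small nonzero sum in the first case comes only from rounding the intercept and slope.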


Property 2.

Properties of Least Squares Line
1. Residuals always sum to zero.
2. This least squares line produces a smaller sum of squared residuals than any other straight line can.

Y = Weight, X = Height, $\hat{Y} = -332.73 + 7.189X$

Find the sum of squares of the residuals.

X     Y     Fitted Y   e = Y - Fitted Y   e^2
73    175   192.07     -17.07             291.38
68    158   156.12       1.88               3.53
67    140   148.93      -8.93              79.74
72    207   184.88      22.12             489.29
62    115   112.99       2.01               4.04
            Sum:          0.01             867.98

$\sum e_i^2 = SSE = 867.98 <$ SSE for any other line.
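Property 2 can be spot-checked by comparing the least squares SSE with the SSE of a few other lines. This is a sketch in Python under stated assumptions: the competing lines are arbitrary nudges of the intercept and slope chosen for illustration, and `least_squares` and `sse` are hypothetical helper names. A handful of competitors does not prove the property; it only illustrates it.

```python
heights = [73, 68, 67, 72, 62]       # X, inches
weights = [175, 158, 140, 207, 115]  # Y, pounds


def least_squares(x, y):
    """Return (intercept, slope) of the least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def sse(x, y, intercept, slope):
    """Sum of squared residuals for the line y = intercept + slope * x."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


a, b = least_squares(heights, weights)
best = sse(heights, weights, a, b)

# Illustrative competing lines: shift the intercept and/or the slope a little.
competitors = [(a + da, b + db)
               for da in (-5.0, 0.0, 5.0)
               for db in (-0.5, 0.0, 0.5)
               if (da, db) != (0.0, 0.0)]
worse = [sse(heights, weights, a2, b2) for a2, b2 in competitors]

print(f"least squares SSE      = {best:.2f}")
print(f"smallest competing SSE = {min(worse):.2f}")
```

The unrounded SSE is about 868.04; the 867.98 in the table comes from squaring residuals that were first rounded to two decimals.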

Property 3.

Properties of Least Squares Line
1. Residuals always sum to zero.
2. This least squares line produces a smaller sum of squared residuals than any other straight line can.
3. Line always passes through the point $(\bar{x}, \bar{y})$.

[Scatterplot: WEIGHT versus HEIGHT with the fitted line passing through the point $\bar{X} = 68.4$, $\bar{Y} = 159$.]
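Property 3 can be confirmed by plugging the mean height into the prediction equation. A minimal Python sketch, using the rounded coefficients -332.73 and 7.189, so the result matches the mean weight only up to round-off:

```python
heights = [73, 68, 67, 72, 62]       # X, inches
weights = [175, 158, 140, 207, 115]  # Y, pounds

x_bar = sum(heights) / len(heights)  # mean height
y_bar = sum(weights) / len(weights)  # mean weight

# Fitted value at the mean height, using the rounded coefficients.
fitted_at_mean = -332.73 + 7.189 * x_bar

print(f"x-bar = {x_bar:.1f}, y-bar = {y_bar:.1f}")
print(f"fitted value at x-bar = {fitted_at_mean:.2f}")
```

The fitted value at the mean height agrees with the mean weight to within the rounding of the coefficients.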

Illustration of unusual cases:

- Outliers
- Leverage
- Influential

[Four scatterplots, each with a fitted line and one unusual point.]

- Outlier: Unusual point does not follow the pattern. It's near the X-mean; the entire line is pulled toward it.
- High leverage: Unusual point is far from the X-mean, but still follows the pattern.
- Outlier: Unusual point does not follow the pattern. The line is pulled down and twisted slightly.
- Leverage & outlier, influential: Unusual point is far from the X-mean, but does not follow the pattern. Line really twists!

Definitions:

Outlier:
An unusual y-value relative to the pattern of the other cases.
Usually has a large residual.

High Leverage Case:
An extreme X value relative to the other X values.

Influential Case
has an unusually large effect on the slope of the least squares line.

Conclusion:

High leverage => potentially influential.
High leverage & Outlier => influential!!
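The definitions above are qualitative. One common way to put numbers on them, which is an assumption here and not something the slides give, is the simple-regression leverage $h_i = 1/n + (x_i - \bar{x})^2 / \sum_j (x_j - \bar{x})^2$, together with the change in slope when a case is left out. The Python sketch below applies both to the five students from Example 1; the helper names are hypothetical, and with only five cases the numbers are purely illustrative.

```python
heights = [73, 68, 67, 72, 62]       # X, inches (Example 1)
weights = [175, 158, 140, 207, 115]  # Y, pounds (Example 1)


def least_squares(x, y):
    """Return (intercept, slope) of the least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def leverages(x):
    """Leverage of each case in simple regression: 1/n + (x - x_bar)^2 / Sxx."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]


def slope_without(x, y, i):
    """Slope of the least squares line refitted with case i left out."""
    return least_squares(x[:i] + x[i + 1:], y[:i] + y[i + 1:])[1]


_, full_slope = least_squares(heights, weights)
lev = leverages(heights)
slope_changes = [slope_without(heights, weights, i) - full_slope
                 for i in range(len(heights))]

for case, (h, change) in enumerate(zip(lev, slope_changes), start=1):
    print(f"case {case}: leverage = {h:.3f}, "
          f"slope change if dropped = {change:+.3f}")
```

Case 5 (62 inches) is farthest from the mean height of 68.4, so it has the largest leverage. Whether a case is influential shows up in how much the slope moves when that case is dropped.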

Why do we care about identifying unusual cases?

The least squares regression line is not resistant to unusual cases.

Regression Analysis in Minitab

Lesson Objectives

- Learn two ways to use Minitab to run a regression analysis.
- Learn how to read output from Minitab.

Example 3, continued

Can height be predicted using shoe size?

Step 1? DTDP


Example 3, continued

Can height be predicted using shoe size?

Graph > Plot

[Scatterplot: Height (60 to 84) versus Shoe Size (5 to 15), with Female and Male plotted separately. Jitter added in X-direction.]

The scatter for each subpopulation is about the same; i.e., there is constant variance.

Example 3, continued

Method 1

Stat > Regression > Regression

Y = a + bX


Example 3, continued

Can height be predicted using shoe size?

Copied from Session Window.

Regression Analysis: Height versus Shoe Size

The regression equation is
Height = 50.5 + 1.87 Shoe Size

Predictor      Coef     SE Coef       T       P
Constant    50.5230      0.5912   85.45   0.000
Shoe Siz    1.87241     0.06033   31.04   0.000

S = 1.947     R-Sq = 79.1%     R-Sq(adj) = 79.0%

Analysis of Variance

Source        DF        SS        MS        F       P
Regression     1    3650.0    3650.0   963.26   0.000
Error        255     966.3       3.8
Total        256    4616.3

Example 3, continued

Can height be predicted using shoe size?

Least squares estimated coefficients (the Coef column):

Predictor      Coef
Constant    50.5230
Shoe Siz    1.87241

Total Degrees of Freedom = Number of cases - 1

Source        DF
Regression     1
Error        255
Total        256


Example 3, continued

Can height be predicted using shoe size?

R-Sq = SSR / TSS = 3650.0 / 4616.3 = 79.1%

Source            SS
Regression    3650.0   (SSR)
Error          966.3
Total         4616.3   (TSS)
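The R-Sq figure can be recomputed from the sums of squares in the Analysis of Variance table. A short Python sketch follows (Python is used only for illustration; the variable names are not Minitab's):

```python
# Sums of squares copied from the Analysis of Variance table.
ss_regression = 3650.0  # SSR
ss_error = 966.3        # SSE
ss_total = 4616.3       # TSS

r_sq = ss_regression / ss_total

# Equivalent form, because SSR + SSE = TSS.
r_sq_alt = 1 - ss_error / ss_total

print(f"R-Sq = {r_sq:.1%}")
# prints: R-Sq = 79.1%
```

In other words, about 79% of the variation in height is accounted for by the regression on shoe size.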

Example 3, continued

Can height be predicted using shoe size?

S = 1.947: Standard Error of Regression. Measure of variation around the regression line.

$S = \sqrt{MSE} = \sqrt{3.8}$

Source        DF        SS        MS
Regression     1    3650.0    3650.0
Error        255     966.3       3.8
Total        256    4616.3

SS Error = 966.3 is the sum of squared residuals.
MS Error = 3.8 is the Mean Squared Error, MSE.
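The chain from the sum of squared residuals to S can be traced with the ANOVA numbers. A minimal Python sketch, with illustrative variable names:

```python
import math

# Values copied from the Analysis of Variance table.
ss_error = 966.3   # sum of squared residuals
df_error = 255     # error degrees of freedom
df_total = 256     # total degrees of freedom

mse = ss_error / df_error  # Mean Squared Error
s = math.sqrt(mse)         # Standard Error of Regression
n_cases = df_total + 1     # Total DF = number of cases - 1

print(f"MSE = {mse:.1f}, S = {s:.3f}, n = {n_cases}")
# prints: MSE = 3.8, S = 1.947, n = 257
```

Taking the square root of the rounded MSE, 3.8, gives about 1.949; the reported S = 1.947 comes from the unrounded MSE.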


Example 3, continued

Can height be predicted using shoe size?

Residuals Versus Shoe Siz
(response is Height)

[Residual plot: Residual versus Shoe Siz. No jitter added.]

Are there any problems visible in this plot? ___________

Example 3, continued

Can height be predicted using shoe size?

Least squares regression equation:

Height = 50.52 + 1.872 Shoe

r-square = 79.1%, Std. error = 1.947 inches

The two summary measures that should always be given with the equation.
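Once the equation is reported, using it is a one-line calculation. The Python sketch below is an illustration only: `predict_height` is a hypothetical name and the shoe sizes are arbitrary examples, not cases from the data.

```python
def predict_height(shoe_size: float) -> float:
    """Estimated mean height (inches) from Height = 50.52 + 1.872 Shoe."""
    return 50.52 + 1.872 * shoe_size


for size in (8, 10, 12):
    print(f"shoe size {size}: estimated mean height = "
          f"{predict_height(size):.2f} inches")
```

Each estimate is a mean height for that shoe size; the standard error of 1.947 inches describes how much individual heights vary around the line.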


Example 3, continued

Can height be predicted using shoe size?

Method 2

Stat > Regression > Fitted Line Plot

This program gives a scatterplot with the regression line superimposed on it.

Y = a + bX

Example 3, continued

Can height be predicted using shoe size?

Regression Plot
Height = 50.5230 + 1.87241 Shoe Size
S = 1.94659   R-Sq = 79.1 %   R-Sq(adj) = 79.0 %

[Fitted line plot: Height versus Shoe Size with the regression line superimposed.]

The fit looks ________.


Example 3, continued

Can height be predicted using shoe size?

What information do these values provide?

Predictor       T       P
Constant    85.45   0.000
Shoe Siz    31.04   0.000

How do you determine if the X-variable is a useful predictor? (1)

Use the t-statistic or the F-stat.

t measures how many standard errors the estimated coefficient is from zero.

$F = t^2$ for simple regression.
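Both statements can be checked against the Minitab output for Example 3. A minimal Python sketch, using the Coef and SE Coef values for Shoe Size (variable names are illustrative):

```python
# Values copied from the Minitab output for the Shoe Size predictor.
coef = 1.87241
se_coef = 0.06033
f_reported = 963.26  # F from the Analysis of Variance table

t_stat = coef / se_coef  # standard errors between the estimate and zero
f_from_t = t_stat ** 2   # F = t^2 in simple regression

print(f"t = {t_stat:.2f}")
print(f"t^2 = {f_from_t:.2f} (reported F = {f_reported})")
```

The squared t agrees with the reported F only up to rounding, because Coef and SE Coef are themselves rounded in the output.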

How do you determine if the X-variable is a useful predictor? (2)

A P-value is associated with t and F. The further t is from zero, in either direction (or the larger F is), the smaller the corresponding P-value will be.

P-value: the probability of getting a t (or F) at least this extreme if the true coefficient WERE ZERO. A small P-value is evidence that the true coefficient is not zero.

How do you determine if the X-variable is a useful predictor? (3)

If the P-value IS SMALL (typically < 0.10), then conclude:
1. It is unlikely that the true coefficient is really zero, and therefore,
2. The X variable IS a useful predictor for the Y variable. Keep the variable!

If the P-value is NOT SMALL (i.e., > 0.10), then conclude:
1. For all practical purposes the true coefficient MAY BE ZERO; therefore
2. The X variable IS NOT a useful predictor of the Y variable. Don't use it.
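The decision rule can be written as a small function. The Python sketch below rests on a loud assumption: it approximates the two-sided P-value with the standard normal distribution instead of the t distribution that Minitab uses, which is reasonable only when the error degrees of freedom are large (255 in Example 3). The function names and the second t value are hypothetical.

```python
import math


def approx_two_sided_p(t_stat: float) -> float:
    """Two-sided P-value from a standard normal approximation (assumption:
    large error degrees of freedom, so t is close to normal)."""
    return math.erfc(abs(t_stat) / math.sqrt(2))


def is_useful_predictor(p_value: float, cutoff: float = 0.10) -> bool:
    """Decision rule from the slides: keep the variable if the P-value is small."""
    return p_value < cutoff


p_shoe = approx_two_sided_p(31.04)  # t for Shoe Size in Example 3
p_weak = approx_two_sided_p(1.0)    # hypothetical weak predictor

print(f"Shoe Size: P = {p_shoe:.3f}, keep = {is_useful_predictor(p_shoe)}")
print(f"t = 1.0:   P = {p_weak:.3f}, keep = {is_useful_predictor(p_weak)}")
```

For Shoe Size the approximate P-value is far below 0.10, which matches the 0.000 that Minitab prints.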


Example 3, continued

Can height be predicted using shoe size?

Could shoe size have a true coefficient that is actually zero?

Predictor      Coef     SE Coef       T       P
Shoe Siz    1.87241     0.06033   31.04   0.000

t measures how many standard errors the estimated coefficient is from zero.

P-value: the probability of getting a t at least this extreme if the true coefficient were zero.

The P-value for Shoe Size IS SMALL (< 0.10).

Conclusion:
The shoe size coefficient is NOT zero!
Shoe size IS a useful predictor of the mean of height.

The logic just explained is statistical inference.
This will be covered in more detail during the last three weeks of the course.
