
INFERENCE ON REGRESSION COEFFICIENTS

F. Chiaromonte 1
NORMAL SIMPLE LINEAR REGRESSION MODEL:

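$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad i = 1, \dots, n$$

$x_i$, $i = 1, \dots, n$: fixed (or conditioned on)

$\varepsilon_i$, $i = 1, \dots, n$: random errors such that
$\varepsilon_i \sim N(0, \sigma^2)$ for all $i$, independent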

Under this scenario, we consider inference (standard errors, confidence intervals, and tests) for:

$\beta_1$ = slope

$\beta_0$ = intercept

INFERENCE FOR THE SLOPE:
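$$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, y_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \sum_{i=1}^{n} k_i y_i$$

where

$$k_i = \frac{x_i - \bar{x}}{\sum_{j=1}^{n} (x_j - \bar{x})^2}, \qquad y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$$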
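As a quick numerical check of the identity above, the following Python sketch computes the slope both as a ratio of sums and as the linear combination with weights k_i. The data are hypothetical toy values, not taken from the slides, and the variable names are illustrative. The sketch also checks three properties of the weights which, together with the independence and normality of the y_i, give the mean and variance of b1.

```python
# Hypothetical toy data (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)  # sum of squared deviations of x

# Slope as a ratio of sums.
b1_ratio = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx

# Slope as a linear combination of the y values with weights k_i.
k = [(xi - x_bar) / sxx for xi in x]
b1_linear = sum(ki * yi for ki, yi in zip(k, y))

# Properties of the weights:
sum_k = sum(k)                                  # 0, so beta_0 drops out of E[b1]
sum_kx = sum(ki * xi for ki, xi in zip(k, x))   # 1, so E[b1] = beta_1
sum_k2 = sum(ki ** 2 for ki in k)               # 1 / sxx, so var(b1) = sigma^2 / sxx

print(b1_ratio, b1_linear)
print(sum_k, sum_kx, sum_k2, 1 / sxx)
```

Because b1 is a linear combination of independent normal variables, it is itself normal; the three sums determine its mean and variance.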

It follows that

$$b_1 \sim N\left(\beta_1,\; \frac{\sigma^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}\right)$$
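Estimated variance and standard error (with $s^2$, the residual mean square, in place of $\sigma^2$):

$$\widehat{\mathrm{var}}(b_1) = \frac{s^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}, \qquad \mathrm{se}(b_1) = \frac{s}{\sqrt{\sum_{j=1}^{n} (x_j - \bar{x})^2}}$$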

$$\frac{b_1 - \beta_1}{\mathrm{se}(b_1)} \sim t_{n-2}$$

… basis for confidence interval and tests
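The computation of the standard error and the t ratio for the slope can be sketched as follows. This is a minimal illustration on hypothetical toy data (not from the slides); it assumes that s^2 is the residual mean square SSE/(n - 2), the usual estimate of sigma^2 in this model, and the hypothesized value `beta1_null` is an illustrative choice.

```python
import math

# Hypothetical toy data (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Least squares estimates.
b1 = sum((xi - x_bar) * yi for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

# Residual sum of squares and s (assumed here: s^2 = SSE / (n - 2)).
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)
s = math.sqrt(sse / (n - 2))

# Standard error of the slope and the t ratio for a hypothesized slope.
se_b1 = s / math.sqrt(sxx)
beta1_null = 0.0  # illustrative hypothesized value of beta_1
t_ratio = (b1 - beta1_null) / se_b1

print(f"b1 = {b1:.4f}, s = {s:.4f}, se(b1) = {se_b1:.4f}, t = {t_ratio:.2f}")
```

The ratio is then referred to a t distribution with n - 2 degrees of freedom (here 3).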

For instance:

1-α level confidence interval for the slope: $b_1 \pm t_{n-2}(1 - \alpha/2)\, \mathrm{se}(b_1)$

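Testing $H_0: \beta_1 = 0$ vs $H_a: \beta_1 \neq 0$

Test statistic:

$$t = \frac{b_1}{\mathrm{se}(b_1)} \sim t_{n-2} \quad \text{under } H_0$$

[Figure: $t_{n-2}$ density, with $-t_{obs}$ and $+t_{obs}$ marked]

P-value = area in the two tails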

INFERENCE FOR THE INTERCEPT:

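$$b_0 = \bar{y} - b_1 \bar{x} = \sum_{i=1}^{n} \frac{1}{n}\, y_i - \sum_{i=1}^{n} k_i \bar{x}\, y_i = \sum_{i=1}^{n} \tilde{k}_i y_i$$

where

$$\tilde{k}_i = \frac{1}{n} - \bar{x}\, \frac{x_i - \bar{x}}{\sum_{j=1}^{n} (x_j - \bar{x})^2}, \qquad y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$$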

It follows that

$$b_0 \sim N\left(\beta_0,\; \sigma^2 \left[\frac{1}{n} + \frac{\bar{x}^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}\right]\right)$$
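Estimated variance and standard error:

$$\widehat{\mathrm{var}}(b_0) = s^2 \left[\frac{1}{n} + \frac{\bar{x}^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}\right], \qquad \mathrm{se}(b_0) = s \left[\frac{1}{n} + \frac{\bar{x}^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}\right]^{1/2}$$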
$$\frac{b_0 - \beta_0}{\mathrm{se}(b_0)} \sim t_{n-2}$$

… basis for confidence interval and tests
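The intercept formulas can be checked numerically in the same way as for the slope. The sketch below uses hypothetical toy data (not from the slides) and assumes s^2 = SSE/(n - 2); the name `k_tilde` for the intercept weights is illustrative. It verifies that the weights sum to 1, are orthogonal to the x values, and have sum of squares 1/n + x̄²/Σ(x_j − x̄)², which is the bracketed factor in the variance of b0.

```python
import math

# Hypothetical toy data (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Slope weights and intercept weights.
k = [(xi - x_bar) / sxx for xi in x]
k_tilde = [1 / n - x_bar * ki for ki in k]

b1 = sum(ki * yi for ki, yi in zip(k, y))
b0_direct = y_bar - b1 * x_bar
b0_linear = sum(kt * yi for kt, yi in zip(k_tilde, y))

# Properties of the intercept weights.
sum_kt = sum(k_tilde)                                   # 1
sum_ktx = sum(kt * xi for kt, xi in zip(k_tilde, x))    # 0
sum_kt2 = sum(kt ** 2 for kt in k_tilde)                # variance factor
var_factor = 1 / n + x_bar ** 2 / sxx

# Standard error of the intercept (assumed here: s^2 = SSE / (n - 2)).
sse = sum((yi - (b0_direct + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))
se_b0 = s * math.sqrt(var_factor)

print(f"b0 = {b0_direct:.4f}, se(b0) = {se_b0:.4f}")
```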

For instance:

1-α level confidence interval for the intercept: $b_0 \pm t_{n-2}(1 - \alpha/2)\, \mathrm{se}(b_0)$

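Testing $H_0: \beta_0 = 0$ vs $H_a: \beta_0 \neq 0$

Test statistic:

$$t = \frac{b_0}{\mathrm{se}(b_0)} \sim t_{n-2} \quad \text{under } H_0$$

[Figure: $t_{n-2}$ density, with $-t_{obs}$ and $+t_{obs}$ marked]

P-value = area in the two tails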

SCORES.MTW data set: Regression Analysis: Second versus First

The regression equation is
Second = 22.5 + 0.755 First

Predictor    Coef      SE Coef    T       P
Constant     22.47     10.22      2.20    0.036
First        0.7546    0.1417     5.32    0.000

S = 11.5131   R-Sq = 49.4%   R-Sq(adj) = 47.7%

[Figure: Fitted Line Plot, Second = 22.47 + 0.7546 First; Second on the vertical axis, First on the horizontal axis; S = 11.5131, R-Sq = 49.4%, R-Sq(adj) = 47.7%]

• Standard errors for the estimates of intercept and slope

• Observed values of the t test statistics (coefficient over se)

• Corresponding p-values for Ho: β=0 vs Ha: β≠0, obtained under a t distribution with dof = n-2
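The T and P columns of the output can be approximately reproduced from the rounded Coef and SE Coef values. The sketch below is one way to do this with the Python standard library only: it integrates the Student's t density numerically with Simpson's rule, which is a choice made here for self-containment and not how Minitab computes its p-values. The function names are illustrative, and n = 31 is taken from the confidence-interval slide that follows.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def two_sided_p(t_obs, df, steps=2000):
    """Area in the two tails beyond -|t_obs| and +|t_obs|.

    Computed as 1 minus the central area, using Simpson's rule on
    [0, |t_obs|] (steps must be even) and the symmetry of the density.
    """
    a = abs(t_obs)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3
    return 1.0 - central_area

df = 31 - 2  # n - 2, with n = 31 observations
for name, coef, se in [("Constant", 22.47, 10.22), ("First", 0.7546, 0.1417)]:
    t_obs = coef / se
    print(f"{name}: t = {t_obs:.2f}, p = {two_sided_p(t_obs, df):.3f}")
```

Since the inputs are rounded, the ratios may differ from Minitab's T column in the last digit. With SciPy available, `2 * scipy.stats.t.sf(abs(t_obs), df)` would be the more usual way to get the same tail area.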

95% confidence intervals in Minitab:
Calc > Probability Distributions > t
Inverse cumulative probability
(do not worry about the noncentrality parameter; it is 0 by default)
Degrees of freedom: n-2 = 31-2 = 29
Input constant: 1-0.025 = 0.975

Inverse Cumulative Distribution Function

Student's t distribution with 29 DF
P( X <= x )    x
0.975          2.04523

$$b_0 \pm t_{n-2}(1 - 0.025)\, \mathrm{se}(b_0) = 22.47 \pm 2.045 \times 10.22$$

$$b_1 \pm t_{n-2}(1 - 0.025)\, \mathrm{se}(b_1) = 0.755 \pm 2.045 \times 0.142$$
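The arithmetic for the two intervals can be carried through with a short Python sketch. It takes the critical value 2.04523 from the Minitab output above and, as an assumption made here, uses the four-digit table values 0.7546 and 0.1417 for the slope in place of the rounded 0.755 and 0.142; the dictionary names are illustrative.

```python
# Critical value t_{29}(0.975) as reported by Minitab above.
t_crit = 2.04523

# (estimate, standard error) pairs from the regression output.
estimates = {
    "Constant": (22.47, 10.22),
    "First": (0.7546, 0.1417),
}

intervals = {}
for name, (coef, se) in estimates.items():
    margin = t_crit * se  # half-width of the 95% interval
    intervals[name] = (coef - margin, coef + margin)
    lower, upper = intervals[name]
    print(f"{name}: {coef} ± {margin:.4f} -> ({lower:.4f}, {upper:.4f})")
```

The intervals come out at roughly (1.57, 43.37) for the intercept and (0.46, 1.04) for the slope. Neither contains 0, which agrees with both p-values in the regression output being below 0.05.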


Remarks:

• If the errors (and hence the y values at given x's) depart from normality, Student's t distributions are not, strictly speaking, the right reference distributions for inference. Some departure is tolerated, however, and if n is large, asymptotic normality holds.

• Interpretation of inferences (confidence intervals, p-values) is conditional on the x levels.

• The spread of the x levels affects the standard errors of the slope and intercept estimates: both formulas have $\sum_{j} (x_j - \bar{x})^2$ in a denominator, so, other things equal, more spread-out x levels give smaller standard errors.
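To illustrate the last remark, the sketch below compares two hypothetical designs with the same number of points and the same mean of x but different spread. It evaluates the standard deviations of b1 and b0 from the variance formulas given earlier, using an assumed error standard deviation `sigma` (standard errors replace sigma with s, so the comparison carries over). All values and names are illustrative.

```python
import math

def sd_slope(x, sigma):
    """Standard deviation of b1: sigma / sqrt(sum of squared deviations of x)."""
    x_bar = sum(x) / len(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sigma / math.sqrt(sxx)

def sd_intercept(x, sigma):
    """Standard deviation of b0: sigma * sqrt(1/n + x_bar^2 / sxx)."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sigma * math.sqrt(1 / n + x_bar ** 2 / sxx)

sigma = 1.0  # hypothetical error standard deviation

# Two hypothetical designs: same n, same mean (5), different spread.
x_narrow = [4.0, 4.5, 5.0, 5.5, 6.0]
x_wide = [1.0, 3.0, 5.0, 7.0, 9.0]

for label, x in [("narrow", x_narrow), ("wide", x_wide)]:
    print(f"{label}: sd(b1) = {sd_slope(x, sigma):.4f}, "
          f"sd(b0) = {sd_intercept(x, sigma):.4f}")
```

In this example the sum of squared deviations is 2.5 for the narrow design and 40 for the wide one, so the slope is estimated four times less precisely with the narrow design.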
