
Coefficients of determination

Jean-Marie Dufour
McGill University
First version: March 1983
Revised: February 2002, July 2011
This version: July 2011
Compiled: November 21, 2011, 11:05

This work was supported by the William Dow Chair in Political Economy (McGill University), the Bank of Canada (Research Fellowship), a Guggenheim Fellowship, a Konrad-Adenauer Fellowship (Alexander-von-Humboldt Foundation, Germany), the Canadian Network of Centres of Excellence [program on Mathematics of Information Technology and Complex Systems (MITACS)], the Natural Sciences and Engineering Research Council of Canada, the Social Sciences and Humanities Research Council of Canada, and the Fonds de recherche sur la société et la culture (Québec).

William Dow Professor of Economics, McGill University, Centre interuniversitaire de recherche en analyse des organisations (CIRANO), and Centre interuniversitaire de recherche en économie quantitative (CIREQ). Mailing address: Department of Economics, McGill University, Leacock Building, Room 519, 855 Sherbrooke Street West, Montréal, Québec H3A 2T7, Canada. TEL: (1) 514 398 8879; FAX: (1) 514 398 4938; e-mail: jean-marie.dufour@mcgill.ca. Web page: http://www.jeanmariedufour.com

Contents

1. Coefficient of determination: $R^2$
2. Significance tests and $R^2$
   2.1. Relation of $R^2$ with a Fisher statistic
   2.2. General relation between $R^2$ and Fisher tests
3. Uncentered coefficient of determination: $\tilde{R}^2$
4. Adjusted coefficient of determination: $\bar{R}^2$
   4.1. Definition and basic properties
   4.2. Criterion for $\bar{R}^2$ increase through the omission of an explanatory variable
   4.3. Generalized criterion for $\bar{R}^2$ increase through the imposition of linear constraints
5. Notes on bibliography
6. Chronological list of references

1. Coefficient of determination: $R^2$

Let $y = X\beta + \varepsilon$ be a model that satisfies the assumptions of the classical linear model, where $y$ and $\varepsilon$ are $T \times 1$ vectors, $X$ is a $T \times k$ matrix and $\beta$ is a $k \times 1$ coefficient vector. We wish to characterize the extent to which the variables included in $X$ (excluding the constant, if there is one) explain $y$.
A first method consists in computing $R^2$, the coefficient of determination, or $R = \sqrt{R^2}$, the coefficient of multiple correlation. Let
$$\hat{y} = X\hat{\beta}, \qquad \hat{\varepsilon} = y - \hat{y}, \qquad \bar{y} = \sum_{t=1}^{T} y_t / T = i'y/T, \tag{1.1}$$
$$i = (1, 1, \dots, 1)' \ \text{the unit vector of dimension } T, \tag{1.2}$$
$$SST = \sum_{t=1}^{T} (y_t - \bar{y})^2 = (y - i\bar{y})'(y - i\bar{y}) \quad \text{(total sum of squares),} \tag{1.3}$$
$$SSR = \sum_{t=1}^{T} (\hat{y}_t - \bar{y})^2 = (\hat{y} - i\bar{y})'(\hat{y} - i\bar{y}) \quad \text{(regression sum of squares),} \tag{1.4}$$
$$SSE = \sum_{t=1}^{T} (y_t - \hat{y}_t)^2 = (y - \hat{y})'(y - \hat{y}) = \hat{\varepsilon}'\hat{\varepsilon} \quad \text{(error sum of squares).} \tag{1.5}$$
We can then define variance estimators as follows:
$$\hat{V}(y) = SST/T, \tag{1.6}$$
$$\hat{V}(\hat{y}) = SSR/T, \tag{1.7}$$
$$\hat{V}(\hat{\varepsilon}) = SSE/T. \tag{1.8}$$

1.1 Definition $R^2 = 1 - \hat{V}(\hat{\varepsilon})/\hat{V}(y) = 1 - (SSE/SST)$.

1.2 Proposition $R^2 \le 1$.

PROOF This result is immediate on observing that $SSE/SST \ge 0$.

1.3 Lemma $y'y = \hat{y}'\hat{y} + \hat{\varepsilon}'\hat{\varepsilon}$.

PROOF We have
$$y = \hat{y} + \hat{\varepsilon} \qquad \text{and} \qquad \hat{y}'\hat{\varepsilon} = \hat{\beta}'X'\hat{\varepsilon} = 0,$$
hence
$$y'y = (\hat{y} + \hat{\varepsilon})'(\hat{y} + \hat{\varepsilon}) = \hat{y}'\hat{y} + \hat{y}'\hat{\varepsilon} + \hat{\varepsilon}'\hat{y} + \hat{\varepsilon}'\hat{\varepsilon} = \hat{y}'\hat{y} + \hat{\varepsilon}'\hat{\varepsilon}. \tag{1.9}$$

1.4 Proposition If one of the regressors is a constant, then
$$SST = SSR + SSE, \qquad \hat{V}(y) = \hat{V}(\hat{y}) + \hat{V}(\hat{\varepsilon}).$$

PROOF Let $A = I_T - i(i'i)^{-1}i' = I_T - \frac{1}{T}ii'$. Then $A'A = A$ and
$$Ay = \left(I_T - \frac{1}{T}ii'\right)y = y - i\bar{y}.$$
If one of the regressors is a constant, we have
$$i'\hat{\varepsilon} = \sum_{t=1}^{T} \hat{\varepsilon}_t = 0,$$
hence
$$\frac{1}{T}i'\hat{y} = \frac{1}{T}i'(y - \hat{\varepsilon}) = \frac{1}{T}i'y = \bar{y},$$
$$A\hat{\varepsilon} = \hat{\varepsilon} - \frac{1}{T}ii'\hat{\varepsilon} = \hat{\varepsilon}, \qquad A\hat{y} = \hat{y} - \frac{1}{T}ii'\hat{y} = \hat{y} - i\bar{y},$$
and, using the fact that $A\hat{\varepsilon} = \hat{\varepsilon}$ and $\hat{y}'\hat{\varepsilon} = 0$,
$$SST = (y - i\bar{y})'(y - i\bar{y}) = y'A'Ay = y'Ay = (\hat{y} + \hat{\varepsilon})'A(\hat{y} + \hat{\varepsilon})$$
$$= \hat{y}'A\hat{y} + \hat{y}'A\hat{\varepsilon} + \hat{\varepsilon}'A\hat{y} + \hat{\varepsilon}'A\hat{\varepsilon} = \hat{y}'A\hat{y} + \hat{\varepsilon}'\hat{\varepsilon} = (A\hat{y})'(A\hat{y}) + \hat{\varepsilon}'\hat{\varepsilon} = SSR + SSE.$$
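As a numerical sanity check of Propositions 1.4 and 1.5, the following sketch (not part of the original notes; it uses numpy and an arbitrary simulated model) fits a regression that includes a constant and verifies the decomposition $SST = SSR + SSE$:

```python
import numpy as np

rng = np.random.default_rng(0)
T, k = 50, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])  # first column: constant
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(size=T)

b_hat = np.linalg.lstsq(X, y, rcond=None)[0]    # OLS coefficients
y_hat = X @ b_hat                               # fitted values
e_hat = y - y_hat                               # residuals
y_bar = y.mean()

SST = np.sum((y - y_bar) ** 2)
SSR = np.sum((y_hat - y_bar) ** 2)
SSE = np.sum(e_hat ** 2)

print(SST, SSR + SSE)            # equal (up to rounding) since X contains a constant
print(1 - SSE / SST, SSR / SST)  # two equivalent expressions for R^2 (Proposition 1.5)
```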

1.5 Proposition If one of the regressors is a constant, then
$$R^2 = \frac{\hat{V}(\hat{y})}{\hat{V}(y)} = \frac{SSR}{SST}$$
and $0 \le R^2 \le 1$.

PROOF By the definition of $R^2$, we have $R^2 \le 1$ and
$$R^2 = 1 - \frac{\hat{V}(\hat{\varepsilon})}{\hat{V}(y)} = \frac{\hat{V}(y) - \hat{V}(\hat{\varepsilon})}{\hat{V}(y)} = \frac{\hat{V}(\hat{y})}{\hat{V}(y)} = \frac{SSR}{SST},$$
hence $R^2 \ge 0$.

1.6 Proposition If one of the regressors is a constant, the empirical correlation between $y$ and $\hat{y}$ is non-negative and equal to $\sqrt{R^2}$.

PROOF The empirical correlation between $y$ and $\hat{y}$ is defined by
$$\hat{\rho}(y, \hat{y}) = \frac{\hat{C}(y, \hat{y})}{\left[\hat{V}(y)\,\hat{V}(\hat{y})\right]^{1/2}},$$
where
$$\hat{C}(y, \hat{y}) = \frac{1}{T}\sum_{t=1}^{T}(y_t - \bar{y})(\hat{y}_t - \bar{y}) = \frac{1}{T}(Ay)'(A\hat{y})$$
and $A = I_T - \frac{1}{T}ii'$. Since one of the regressors is a constant,
$$A\hat{\varepsilon} = \hat{\varepsilon}, \qquad Ay = A\hat{y} + \hat{\varepsilon}, \qquad (A\hat{y})'\hat{\varepsilon} = \hat{y}'\hat{\varepsilon} = 0,$$
and
$$\hat{C}(y, \hat{y}) = \frac{1}{T}(A\hat{y} + \hat{\varepsilon})'(A\hat{y}) = \frac{1}{T}(A\hat{y})'(A\hat{y}) = \hat{V}(\hat{y}),$$
hence
$$\hat{\rho}(y, \hat{y}) = \frac{\hat{V}(\hat{y})}{\left[\hat{V}(y)\,\hat{V}(\hat{y})\right]^{1/2}} = \left[\frac{\hat{V}(\hat{y})}{\hat{V}(y)}\right]^{1/2} = \sqrt{R^2} \ge 0.$$
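Proposition 1.6 is easy to verify numerically. The sketch below (a hypothetical illustration with simulated data, not from the original notes) compares the empirical correlation between $y$ and $\hat{y}$ with $\sqrt{R^2}$:

```python
import numpy as np

rng = np.random.default_rng(7)
T = 45
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])  # includes a constant
y = X @ np.array([2.0, 1.0, -1.0]) + rng.normal(size=T)

y_hat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
e_hat = y - y_hat

R2 = 1 - (e_hat @ e_hat) / np.sum((y - y.mean()) ** 2)
corr = np.corrcoef(y, y_hat)[0, 1]   # empirical correlation between y and fitted values
print(corr, np.sqrt(R2))             # equal, as Proposition 1.6 states
```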

2. Significance tests and $R^2$

2.1. Relation of $R^2$ with a Fisher statistic

$R^2$ is a descriptive statistic which measures the proportion of the variance of the dependent variable $y$ explained by the suggested explanatory variables (excluding the constant). However, $R^2$ can be related to a significance test (under the assumptions of the Gaussian classical linear model).

Consider the model
$$y_t = \beta_1 + \beta_2 X_{t2} + \dots + \beta_k X_{tk} + \varepsilon_t, \quad t = 1, \dots, T.$$
We wish to test the hypothesis that none of these variables (excluding the constant) should appear in the equation:
$$H_0 : \beta_2 = \beta_3 = \dots = \beta_k = 0.$$

The Fisher statistic for $H_0$ is
$$F = \frac{(S_0 - S_1)/q}{S_1/(T-k)} \sim F(q, T-k),$$
where $q = k - 1$, $S_1$ is the error sum of squares from the estimation of the unconstrained model
$$y = X\beta + \varepsilon, \qquad X = [i, X_2, \dots, X_k],$$
and $S_0$ is the error sum of squares from the estimation of the constrained model
$$y = i\beta_1 + \varepsilon,$$
where $i = (1, 1, \dots, 1)'$. We see easily that
$$S_1 = (y - X\hat{\beta})'(y - X\hat{\beta}) = SSE,$$
$$\hat{\beta}_1 = (i'i)^{-1}i'y = \frac{1}{T}\sum_{t=1}^{T} y_t = \bar{y} \quad (\text{under the constrained model}),$$
$$S_0 = (y - i\bar{y})'(y - i\bar{y}) = SST,$$
and
$$F = \frac{(SST - SSE)/(k-1)}{SSE/(T-k)} = \frac{\left[1 - \frac{SSE}{SST}\right]/(k-1)}{\frac{SSE}{SST}/(T-k)} = \frac{R^2/(k-1)}{(1 - R^2)/(T-k)} \sim F(k-1, T-k).$$
As $R^2$ increases, $F$ increases.
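The equivalence of the two expressions for $F$ can be checked directly. The following sketch (again with arbitrary simulated data, not part of the original notes) computes $F$ once from the sums of squares and once from $R^2$ alone:

```python
import numpy as np

rng = np.random.default_rng(1)
T, k = 60, 4
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.array([0.5, 1.0, 0.0, -1.0]) + rng.normal(size=T)

e_hat = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
SSE = e_hat @ e_hat
SST = np.sum((y - y.mean()) ** 2)
R2 = 1 - SSE / SST

F_from_ss = ((SST - SSE) / (k - 1)) / (SSE / (T - k))
F_from_R2 = (R2 / (k - 1)) / ((1 - R2) / (T - k))
print(F_from_ss, F_from_R2)  # identical: the two formulas are algebraically the same
```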

2.2. General relation between $R^2$ and Fisher tests

Consider the general linear hypothesis
$$H_0 : C\beta = r$$
where $C : q \times k$, $\beta : k \times 1$, $r : q \times 1$ and $\mathrm{rank}(C) = q$. The values of $R^2$ for the constrained and unconstrained models are respectively
$$R_0^2 = 1 - \frac{S_0}{SST}, \qquad R_1^2 = 1 - \frac{S_1}{SST},$$
hence
$$S_0 = \left(1 - R_0^2\right) SST, \qquad S_1 = \left(1 - R_1^2\right) SST.$$
The Fisher statistic for testing $H_0$ may thus be written
$$F = \frac{(S_0 - S_1)/q}{S_1/(T-k)} = \frac{\left(R_1^2 - R_0^2\right)/q}{\left(1 - R_1^2\right)/(T-k)} = \frac{T-k}{q}\,\frac{R_1^2 - R_0^2}{1 - R_1^2}.$$
If $R_1^2 - R_0^2$ is large, we tend to reject $H_0$. If $H_0 : \beta_2 = \beta_3 = \dots = \beta_k = 0$, then
$$q = k - 1, \qquad S_0 = SST, \qquad R_0^2 = 0,$$
and the formula for $F$ above reduces to the one given in Section 2.1.
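As an illustration, the sketch below (simulated data, not from the original notes) tests a hypothesis that sets the last $q$ coefficients to zero, a special case of $C\beta = r$ chosen for simplicity, and forms $F$ from the constrained and unconstrained $R^2$ values:

```python
import numpy as np

rng = np.random.default_rng(2)
T, k, q = 80, 5, 2
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.array([1.0, 0.8, -0.6, 0.1, 0.0]) + rng.normal(size=T)

def r_squared(X, y):
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return 1 - (e @ e) / np.sum((y - y.mean()) ** 2)

R2_1 = r_squared(X, y)             # unconstrained model
R2_0 = r_squared(X[:, :k - q], y)  # constrained: last q coefficients set to zero
F = ((T - k) / q) * (R2_1 - R2_0) / (1 - R2_1)
print(F)  # compare with the F(q, T - k) critical value
```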

3. Uncentered coefficient of determination: $\tilde{R}^2$

Since $R^2$ can take negative values when the model does not contain a constant, $R^2$ has little meaning in this case. In such situations, we can instead use a coefficient where the values of $y_t$ are not centered around the mean.

3.1 Definition $\tilde{R}^2 = 1 - \hat{\varepsilon}'\hat{\varepsilon}/y'y$.

$\tilde{R}^2$ is called the uncentered coefficient of determination (or uncentered $R^2$) and $\tilde{R} = \sqrt{\tilde{R}^2}$ the uncentered coefficient of multiple correlation.

3.2 Proposition $0 \le \tilde{R}^2 \le 1$.

PROOF This follows directly from Lemma 1.3: $y'y = \hat{y}'\hat{y} + \hat{\varepsilon}'\hat{\varepsilon}$.
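A short sketch (hypothetical, simulated data, not from the original notes) computes $\tilde{R}^2$ for a model without a constant, together with the equivalent form $\hat{y}'\hat{y}/y'y$ implied by Lemma 1.3:

```python
import numpy as np

rng = np.random.default_rng(3)
T = 40
X = rng.normal(size=(T, 2))          # no constant column
y = X @ np.array([1.5, -0.7]) + rng.normal(size=T)

y_hat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
e_hat = y - y_hat

R2_uncentered = 1 - (e_hat @ e_hat) / (y @ y)
print(R2_uncentered)                  # always in [0, 1], by Proposition 3.2
print((y_hat @ y_hat) / (y @ y))      # equivalent form, since y'y = yhat'yhat + e'e
```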

4. Adjusted coefficient of determination: $\bar{R}^2$

4.1. Definition and basic properties

An unattractive property of the $R^2$ coefficient comes from the fact that $R^2$ cannot decrease when explanatory variables are added to the model, even if these have no relevance. Consequently, choosing a model to maximize $R^2$ can be misleading. It seems desirable to penalize models that contain too many variables.
Since
$$R^2 = 1 - \frac{\hat{V}(\hat{\varepsilon})}{\hat{V}(y)},$$
where
$$\hat{V}(\hat{\varepsilon}) = \frac{SSE}{T} = \frac{1}{T}\sum_{t=1}^{T}\hat{\varepsilon}_t^2, \qquad \hat{V}(y) = \frac{SST}{T} = \frac{1}{T}\sum_{t=1}^{T}(y_t - \bar{y})^2,$$
Theil (1961, p. 213) suggested to replace $\hat{V}(\hat{\varepsilon})$ and $\hat{V}(y)$ by unbiased estimators:
$$s^2 = \frac{SSE}{T-k} = \frac{1}{T-k}\sum_{t=1}^{T}\hat{\varepsilon}_t^2, \qquad s_y^2 = \frac{SST}{T-1} = \frac{1}{T-1}\sum_{t=1}^{T}(y_t - \bar{y})^2.$$

4.1 Definition $R^2$ adjusted for degrees of freedom is defined by
$$\bar{R}^2 = 1 - \frac{s^2}{s_y^2} = 1 - \frac{T-1}{T-k}\,\frac{SSE}{SST}.$$
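The definition translates directly into code. The following sketch (simulated data, arbitrary parameters, not part of the original notes) computes $R^2$ and $\bar{R}^2$ side by side:

```python
import numpy as np

rng = np.random.default_rng(4)
T, k = 30, 5
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.array([1.0, 0.5, 0.0, 0.0, 0.0]) + rng.normal(size=T)

e_hat = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
SSE, SST = e_hat @ e_hat, np.sum((y - y.mean()) ** 2)

R2 = 1 - SSE / SST
R2_adj = 1 - (T - 1) / (T - k) * (SSE / SST)   # Definition 4.1
print(R2, R2_adj)                              # R2_adj <= R2 (Proposition 4.3)
```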
4.2 Proposition $\bar{R}^2 = 1 - \frac{T-1}{T-k}\left(1 - R^2\right) = R^2 - \frac{k-1}{T-k}\left(1 - R^2\right)$.

PROOF
$$\bar{R}^2 = 1 - \frac{T-1}{T-k}\,\frac{SSE}{SST} = 1 - \frac{T-1}{T-k}\left(1 - R^2\right) = 1 - \left(1 + \frac{k-1}{T-k}\right)\left(1 - R^2\right),$$
since $\frac{T-1}{T-k} = \frac{T-k+k-1}{T-k} = 1 + \frac{k-1}{T-k}$. Hence
$$\bar{R}^2 = 1 - \left(1 - R^2\right) - \frac{k-1}{T-k}\left(1 - R^2\right) = R^2 - \frac{k-1}{T-k}\left(1 - R^2\right). \quad \text{Q.E.D.}$$

4.3 Proposition $\bar{R}^2 \le R^2 \le 1$.

PROOF The result follows from the fact that $1 - R^2 \ge 0$ and Proposition 4.2.

4.4 Proposition $\bar{R}^2 = R^2$ iff ($k = 1$ or $R^2 = 1$).

4.5 Proposition $\bar{R}^2 \ge 0$ iff $R^2 \ge \frac{k-1}{T-1}$.

$\bar{R}^2$ can be negative even if $R^2 \ge 0$. If the number of explanatory variables is increased, $R^2$ and $k$ both increase, so that $\bar{R}^2$ can increase or decrease.

4.6 Remark When several models are compared on the basis of $R^2$ or $\bar{R}^2$, it is important to have the same dependent variable. When the dependent variable ($y$) is the same, maximizing $\bar{R}^2$ is equivalent to minimizing the standard error of the regression
$$s = \left[\frac{1}{T-k}\sum_{t=1}^{T}\hat{\varepsilon}_t^2\right]^{1/2}.$$

4.2. Criterion for $\bar{R}^2$ increase through the omission of an explanatory variable

Consider the two models:
$$y_t = \beta_1 X_{t1} + \dots + \beta_{k-1} X_{t(k-1)} + \varepsilon_t, \quad t = 1, \dots, T, \tag{4.1}$$
$$y_t = \beta_1 X_{t1} + \dots + \beta_{k-1} X_{t(k-1)} + \beta_k X_{tk} + \varepsilon_t, \quad t = 1, \dots, T. \tag{4.2}$$
We can then show that the value of $\bar{R}^2$ associated with the restricted model (4.1) is larger than the one of model (4.2) if the $t$ statistic for testing $\beta_k = 0$ is smaller than 1 (in absolute value).

4.7 Proposition If $\bar{R}_{k-1}^2$ and $\bar{R}_k^2$ are the values of $\bar{R}^2$ for models (4.1) and (4.2), then
$$\bar{R}_k^2 - \bar{R}_{k-1}^2 = \frac{\left(1 - \bar{R}_k^2\right)\left(t_k^2 - 1\right)}{T-k+1} \tag{4.3}$$
where $t_k$ is the Student $t$ statistic for testing $\beta_k = 0$ in model (4.2), and
$$\bar{R}_k^2 \ge \bar{R}_{k-1}^2 \quad \text{iff} \quad t_k^2 \ge 1 \quad \text{iff} \quad |t_k| \ge 1.$$
If furthermore $\bar{R}_k^2 < 1$, then
$$\bar{R}_k^2 \lessgtr \bar{R}_{k-1}^2 \quad \text{iff} \quad |t_k| \lessgtr 1.$$

PROOF By definition,
$$\bar{R}_k^2 = 1 - \frac{s_k^2}{s_y^2} \qquad \text{and} \qquad \bar{R}_{k-1}^2 = 1 - \frac{s_{k-1}^2}{s_y^2},$$
where $s_k^2 = SS_k/(T-k)$ and $s_{k-1}^2 = SS_{k-1}/(T-k+1)$; $SS_k$ and $SS_{k-1}$ are the sums of squared errors for the models with $k$ and $k-1$ explanatory variables. Since $t_k^2$ is the Fisher statistic for testing $\beta_k = 0$, we have
$$t_k^2 = \frac{SS_{k-1} - SS_k}{SS_k/(T-k)} = \frac{(T-k+1)\,s_{k-1}^2 - (T-k)\,s_k^2}{s_k^2} = \frac{(T-k+1)\left(1 - \bar{R}_{k-1}^2\right) - (T-k)\left(1 - \bar{R}_k^2\right)}{1 - \bar{R}_k^2},$$
for $s_{k-1}^2 = s_y^2\left(1 - \bar{R}_{k-1}^2\right)$ and $s_k^2 = s_y^2\left(1 - \bar{R}_k^2\right)$. Consequently,
$$1 - \bar{R}_{k-1}^2 = \left(1 - \bar{R}_k^2\right)\frac{t_k^2 + (T-k)}{T-k+1}$$
and
$$\bar{R}_k^2 - \bar{R}_{k-1}^2 = \left(1 - \bar{R}_{k-1}^2\right) - \left(1 - \bar{R}_k^2\right) = \left(1 - \bar{R}_k^2\right)\left[\frac{t_k^2 + (T-k)}{T-k+1} - 1\right] = \left(1 - \bar{R}_k^2\right)\frac{t_k^2 - 1}{T-k+1}.$$
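Proposition 4.7 can be verified numerically. The sketch below (simulated data, not from the original notes; the last regressor is given a nearly null coefficient so that $|t_k|$ tends to be small) compares the change in $\bar{R}^2$ with the right-hand side of (4.3):

```python
import numpy as np

rng = np.random.default_rng(5)
T, k = 50, 4
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.array([1.0, 0.7, -0.4, 0.02]) + rng.normal(size=T)  # last coefficient nearly 0

def adj_r2_and_ss(X, y):
    p = X.shape[1]
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    ss = e @ e
    sst = np.sum((y - y.mean()) ** 2)
    return 1 - (T - 1) / (T - p) * ss / sst, ss

R2a_k, SS_k = adj_r2_and_ss(X, y)               # model with k regressors
R2a_km1, SS_km1 = adj_r2_and_ss(X[:, :-1], y)   # model omitting the k-th regressor

t2_k = (SS_km1 - SS_k) / (SS_k / (T - k))       # squared t statistic for beta_k = 0
lhs = R2a_k - R2a_km1
rhs = (1 - R2a_k) * (t2_k - 1) / (T - k + 1)    # Proposition 4.7, equation (4.3)
print(lhs, rhs)                                 # equal up to rounding
print("omit the k-th variable?", np.sqrt(t2_k) < 1)  # |t_k| < 1: omission raises adj. R^2
```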

4.3. Generalized criterion for $\bar{R}^2$ increase through the imposition of linear constraints

We will now study when the imposition of $q$ linearly independent constraints
$$H_0 : C\beta = r$$
will raise or lower $\bar{R}^2$, where $C : q \times k$, $r : q \times 1$ and $\mathrm{rank}(C) = q$. Let $\bar{R}_{H_0}^2$ and $\bar{R}^2$ be the values of $\bar{R}^2$ for the constrained (by $H_0$) and unconstrained models; similarly, let $s_0^2$ and $s^2$ be the values of the corresponding unbiased estimators of the error variance.
4.8 Proposition Let $F$ be the Fisher statistic for testing $H_0$. Then
$$s_0^2 - s^2 = \frac{q\,s^2}{T-k+q}\,(F - 1)$$
and
$$s_0^2 \lessgtr s^2 \quad \text{iff} \quad F \lessgtr 1.$$

PROOF If $SS_0$ and $SS$ are the sums of squared errors for the constrained and unconstrained models, we have
$$s_0^2 = \frac{SS_0}{T-k+q} \qquad \text{and} \qquad s^2 = \frac{SS}{T-k}.$$
The $F$ statistic may then be written
$$F = \frac{(SS_0 - SS)/q}{SS/(T-k)} = \frac{(T-k+q)\,s_0^2 - (T-k)\,s^2}{q\,s^2} = \frac{T-k+q}{q}\,\frac{s_0^2}{s^2} - \frac{T-k}{q},$$
hence
$$s_0^2 = s^2\,\frac{qF + (T-k)}{(T-k)+q}, \qquad s_0^2 - s^2 = s^2\,\frac{q\,(F-1)}{(T-k)+q},$$
and
$$s_0^2 \lessgtr s^2 \quad \text{iff} \quad F \lessgtr 1.$$

4.9 Proposition Let $F$ be the Fisher statistic for testing $H_0$. Then
$$\bar{R}^2 - \bar{R}_{H_0}^2 = \frac{q\left(1 - \bar{R}^2\right)}{T-k+q}\,(F - 1)$$
and
$$\bar{R}_{H_0}^2 \gtrless \bar{R}^2 \quad \text{iff} \quad F \lessgtr 1.$$

PROOF By definition,
$$\bar{R}_{H_0}^2 = 1 - \frac{s_0^2}{s_y^2}, \qquad \bar{R}^2 = 1 - \frac{s^2}{s_y^2}.$$
Thus,
$$\bar{R}^2 - \bar{R}_{H_0}^2 = \frac{s_0^2 - s^2}{s_y^2} = \frac{q}{T-k+q}\,\frac{s^2}{s_y^2}\,(F-1) = \frac{q\left(1 - \bar{R}^2\right)}{T-k+q}\,(F-1),$$
hence
$$\bar{R}_{H_0}^2 \gtrless \bar{R}^2 \quad \text{iff} \quad F \lessgtr 1.$$

On taking $q = 1$, we get property (4.3). If we test a hypothesis of the type
$$H_0 : \beta_k = \beta_{k+1} = \dots = \beta_{k+l} = 0,$$
it is possible that $F > 1$ while all the statistics $|t_i|$, $i = k, \dots, k+l$, are smaller than 1. This means that $\bar{R}^2$ increases when we omit one explanatory variable at a time, but decreases when they are all excluded from the regression. Further, it is also possible that $F < 1$ but $|t_i| > 1$ for all $i$: $\bar{R}^2$ increases when all the explanatory variables are simultaneously excluded, but decreases when only one is excluded.
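Finally, a sketch checking Proposition 4.8 on simulated data (not from the original notes), with the constraint again taken to be that the last $q$ coefficients vanish:

```python
import numpy as np

rng = np.random.default_rng(6)
T, k, q = 60, 5, 2
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.array([1.0, 0.6, -0.5, 0.05, -0.05]) + rng.normal(size=T)

def sse(X, y):
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return e @ e

SS  = sse(X, y)                # unconstrained
SS0 = sse(X[:, :k - q], y)     # constrained: last q coefficients set to zero

s2  = SS / (T - k)             # unbiased error-variance estimator, unconstrained
s20 = SS0 / (T - k + q)        # same, constrained
F   = ((SS0 - SS) / q) / (SS / (T - k))

print(s20 - s2, q * s2 * (F - 1) / (T - k + q))  # Proposition 4.8: identical
# Proposition 4.9: the constrained model has the higher adjusted R^2 iff F < 1
```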

5. Notes on bibliography

The notion of $\bar{R}^2$ was proposed by Theil (1961, p. 213). Several authors have presented detailed discussions of the different concepts of multiple correlation: for example, Theil (1971, Chap. 4), Schmidt (1976) and Maddala (1977, Sections 8.1, 8.2, 8.3, 8.9). The $\bar{R}^2$ concept is criticized by Pesaran (1974). The mean and bias of $R^2$ were studied by Cramer (1987) in the Gaussian case, and by Srivastava, Srivastava and Ullah (1995) in some non-Gaussian cases.

6. Chronological list of references

1. Theil (1961, p. 213): the $\bar{R}^2$ notion was proposed in this book.
2. Theil (1971, Chap. 4): detailed discussion of $R^2$, $\bar{R}^2$ and partial correlation.
3. Pesaran (1974): critique of $\bar{R}^2$.
4. Schmidt (1976)
5. Maddala (1977, Sections 8.1, 8.2, 8.3, 8.9): discussion of $R^2$ and $\bar{R}^2$ along with their relation with hypothesis tests.
6. Hendry and Marshall (1983)
7. Cramer (1987)
8. Ohtani and Hasegawa (1993)
9. Srivastava et al. (1995)

References

Cramer, J. S. (1987), "Mean and variance of R2 in small and moderate samples", Journal of Econometrics 35, 253-266.

Hendry, D. F. and Marshall, R. C. (1983), "On high and low R2 contributions", Oxford Bulletin of Economics and Statistics 45, 313-316.

Maddala, G. S. (1977), Econometrics, McGraw-Hill, New York.

Ohtani, K. and Hasegawa, H. (1993), "On small-sample properties of R2 in a linear regression model with multivariate t errors and proxy variables", Econometric Theory 9, 504-515.

Pesaran, M. H. (1974), "On the general problem of model selection", Review of Economic Studies 41, 153-171.

Schmidt, P. (1976), Econometrics, Marcel Dekker, New York.

Srivastava, A. K., Srivastava, V. K. and Ullah, A. (1995), "The coefficient of determination and its adjusted version in linear regression models", Econometric Reviews 14, 229-240.

Theil, H. (1961), Economic Forecasts and Policy, 2nd Edition, North-Holland, Amsterdam.

Theil, H. (1971), Principles of Econometrics, John Wiley & Sons, New York.

