Coefficients of determination

Jean-Marie Dufour
McGill University
First version: March 1983
Revised: February 2002, July 2011
This version: July 2011
Compiled: November 21, 2011, 11:05
This work was supported by the William Dow Chair in Political Economy (McGill University),
Contents

1. Coefficient of determination: $R^2$
2. Relation between $R^2$ and $F$ tests
3. Coefficient of determination without centering
4. Adjusted coefficient of determination: $\bar{R}^2$
5. Notes on bibliography
6. Suggested readings
1. Coefficient of determination: $R^2$

Let $y = X\beta + \varepsilon$ be a model that satisfies the assumptions of the classical linear model, where $y$ and $\varepsilon$ are $T \times 1$ vectors, $X$ is a $T \times k$ matrix and $\beta$ is a $k \times 1$ coefficient vector. We wish to characterize the extent to which the variables included in $X$ (excluding the constant, if there is one) explain $y$.

A first method consists in computing $R^2$, the coefficient of determination, or $R = \sqrt{R^2}$, the coefficient of multiple correlation. Let
$$ \hat{y} = X\hat\beta, \qquad \hat\varepsilon = y - \hat{y}, \qquad \bar{y} = \frac{1}{T}\sum_{t=1}^{T} y_t = i'y/T, \tag{1.1} $$
$$ SST = \sum_{t=1}^{T} (y_t - \bar{y})^2, \tag{1.2} $$
$$ SSR = \sum_{t=1}^{T} (\hat{y}_t - \bar{y})^2, \tag{1.3} $$
$$ SSE = \sum_{t=1}^{T} \hat\varepsilon_t^2 = \hat\varepsilon'\hat\varepsilon, \tag{1.4} $$
where $i = (1, 1, \ldots, 1)'$ is the $T \times 1$ vector of ones.

1.1 Definition $R^2 = 1 - \hat{V}(\hat\varepsilon)/\hat{V}(y) = 1 - (SSE/SST)$, where $\hat{V}(\hat\varepsilon) = SSE/T$ and $\hat{V}(y) = SST/T$.
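As a purely illustrative check of Definition 1.1 (the simulated data, the seed, and the use of NumPy's `lstsq` are my own choices, not part of the original notes), the following sketch computes $SST$, $SSE$ and $R^2$ for an artificial regression:

```python
import numpy as np

rng = np.random.default_rng(0)
T, k = 50, 3                      # T observations, k coefficients (intercept included)
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(size=T)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)   # OLS estimate of beta
y_hat = X @ beta_hat                               # fitted values
eps_hat = y - y_hat                                # residuals

SST = np.sum((y - y.mean()) ** 2)
SSE = np.sum(eps_hat ** 2)
R2 = 1 - SSE / SST                                 # Definition 1.1
print(R2)
```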
1.2 Proposition $R^2 \le 1$ (since $SSE/SST \ge 0$).

1.3 Lemma $y'y = \hat{y}'\hat{y} + \hat\varepsilon'\hat\varepsilon$.

PROOF We have
$$ y = \hat{y} + \hat\varepsilon \quad \text{and} \quad \hat{y}'\hat\varepsilon = \hat\beta' X'\hat\varepsilon = 0, $$
hence
$$ y'y = (\hat{y} + \hat\varepsilon)'(\hat{y} + \hat\varepsilon) = \hat{y}'\hat{y} + \hat{y}'\hat\varepsilon + \hat\varepsilon'\hat{y} + \hat\varepsilon'\hat\varepsilon = \hat{y}'\hat{y} + \hat\varepsilon'\hat\varepsilon. \quad \text{Q.E.D.} $$

When the model contains a constant term (so that $i$ belongs to the column space of $X$), the residuals sum to zero,
$$ i'\hat\varepsilon = \sum_{t=1}^{T} \hat\varepsilon_t = 0, $$
hence the mean of the fitted values equals the mean of $y$:
$$ \frac{1}{T}\sum_{t=1}^{T} \hat{y}_t = \frac{1}{T}\, i'\hat{y} = \frac{1}{T}\, i'(y - \hat\varepsilon) = \frac{1}{T}\, i'y = \bar{y}. $$
Let
$$ A = I_T - \frac{1}{T}\, ii', \qquad Ay = y - \frac{1}{T}\, ii'y = y - i\bar{y}, $$
where $A$ is symmetric and idempotent ($A' = A$, $A'A = A$) and $A\hat\varepsilon = \hat\varepsilon$ because $i'\hat\varepsilon = 0$. Then
$$
\begin{aligned}
SST &= (y - i\bar{y})'(y - i\bar{y}) = y'A'Ay = y'Ay \\
    &= (\hat{y} + \hat\varepsilon)'A(\hat{y} + \hat\varepsilon) \\
    &= \hat{y}'A\hat{y} + \hat{y}'A\hat\varepsilon + \hat\varepsilon'A\hat{y} + \hat\varepsilon'A\hat\varepsilon \\
    &= \hat{y}'A\hat{y} + \hat\varepsilon'\hat\varepsilon \\
    &= (A\hat{y})'(A\hat{y}) + \hat\varepsilon'\hat\varepsilon = SSR + SSE,
\end{aligned}
$$
where the cross terms vanish because $\hat{y}'A\hat\varepsilon = \hat{y}'\hat\varepsilon = 0$. Consequently,
$$ 1 - \frac{\hat{V}(\hat\varepsilon)}{\hat{V}(y)} = \frac{\hat{V}(y) - \hat{V}(\hat\varepsilon)}{\hat{V}(y)} = \frac{SST - SSE}{SST} = \frac{SSR}{SST} \ge 0, $$
hence $R^2 \ge 0$ and $0 \le R^2 \le 1$.
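A minimal numerical sketch of the decomposition $SST = SSR + SSE$ (the data, seed and centering-matrix construction below are illustrative choices of mine, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
T, k = 40, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.array([0.5, 1.0, -1.0]) + rng.normal(size=T)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat, eps_hat = X @ beta_hat, y - X @ beta_hat

i = np.ones(T)
A = np.eye(T) - np.outer(i, i) / T        # centering matrix A = I - ii'/T
SST = y @ A @ y                           # sum of (y_t - ybar)^2
SSR = (A @ y_hat) @ (A @ y_hat)           # sum of (yhat_t - ybar)^2
SSE = eps_hat @ eps_hat

print(np.isclose(SST, SSR + SSE))         # decomposition SST = SSR + SSE
print(0.0 <= 1 - SSE / SST <= 1.0)        # hence 0 <= R^2 <= 1
```

With an intercept in $X$, both printed checks should be True up to rounding error.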
We can also interpret $R^2$ as the square of the sample correlation between $y$ and $\hat{y}$. Let
$$ \hat\rho(y, \hat{y}) = \frac{\hat{C}(y, \hat{y})}{\left[\hat{V}(y)\,\hat{V}(\hat{y})\right]^{1/2}}, $$
where
$$ \hat{C}(y, \hat{y}) = \frac{1}{T}\sum_{t=1}^{T} (y_t - \bar{y})(\hat{y}_t - \bar{y}) = \frac{1}{T}(Ay)'(A\hat{y}), \qquad \hat{V}(\hat{y}) = \frac{1}{T}(A\hat{y})'(A\hat{y}). $$
Since $Ay = A\hat{y} + \hat\varepsilon$ and $\hat\varepsilon' A\hat{y} = \hat\varepsilon'\hat{y} = 0$,
$$ \hat{C}(y, \hat{y}) = \frac{1}{T}(A\hat{y} + \hat\varepsilon)'(A\hat{y}) = \frac{1}{T}(A\hat{y})'(A\hat{y}) = \hat{V}(\hat{y}), $$
hence
$$ \hat\rho(y, \hat{y})^2 = \frac{\hat{V}(\hat{y})^2}{\hat{V}(y)\,\hat{V}(\hat{y})} = \frac{\hat{V}(\hat{y})}{\hat{V}(y)} = \frac{SSR}{SST} = R^2 \ge 0. $$
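A quick numerical confirmation that $R^2$ equals the squared sample correlation between $y$ and $\hat{y}$ (again with simulated data of my own choosing):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 30
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
y = X @ np.array([1.0, 0.8, -0.3]) + rng.normal(size=T)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat

R2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
rho = np.corrcoef(y, y_hat)[0, 1]          # sample correlation between y and y_hat
print(np.isclose(R2, rho ** 2))            # R^2 equals the squared correlation
```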
2. Relation between $R^2$ and $F$ tests

2.1. Testing the joint significance of all the explanatory variables
$R^2$ is a descriptive statistic which measures the proportion of the variance of the dependent variable $y$ explained by the suggested explanatory variables (excluding the constant). However, $R^2$ can be related to a significance test (under the assumptions of the Gaussian classical linear model).

Consider the model
$$ y_t = \beta_1 + \beta_2 X_{t2} + \cdots + \beta_k X_{tk} + \varepsilon_t, \qquad t = 1, \ldots, T. $$
We wish to test the hypothesis that none of these variables (excluding the constant) should appear in the equation:
$$ H_0 : \beta_2 = \beta_3 = \cdots = \beta_k = 0. $$
The standard $F$ statistic for testing $H_0$ is
$$ F = \frac{(S_0 - S_1)/q}{S_1/(T - k)}, $$
where $q = k - 1$, $S_1$ is the error sum of squares from the estimation of the unconstrained model
$$ H_1 : y = X\beta + \varepsilon, $$
where $X = [i, X_2, \ldots, X_k]$, and $S_0$ is the error sum of squares from the estimation of the constrained model
$$ H_0 : y = i\beta_1 + \varepsilon, $$
where $i = (1, 1, \ldots, 1)'$. We see easily that
$$ S_1 = (y - X\hat\beta)'(y - X\hat\beta) = SSE, $$
while the least squares estimator of $\beta_1$ is
$$ \tilde\beta_1 = (i'i)^{-1} i'y = \frac{1}{T}\sum_{t=1}^{T} y_t = \bar{y} \quad (\text{under } H_0), $$
so that
$$ S_0 = (y - i\bar{y})'(y - i\bar{y}) = SST, $$
and
$$ F = \frac{(SST - SSE)/(k-1)}{SSE/(T-k)} = \frac{\left(1 - \dfrac{SSE}{SST}\right)\Big/(k-1)}{\dfrac{SSE}{SST}\Big/(T-k)} = \frac{R^2/(k-1)}{(1 - R^2)/(T-k)} \sim F(k-1,\, T-k) \quad \text{under } H_0. $$
As $R^2$ increases, $F$ increases.
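The equivalence of the two expressions for $F$ can be checked numerically. In this sketch (simulated data and parameter values are illustrative assumptions of mine), both formulas are evaluated and compared:

```python
import numpy as np

rng = np.random.default_rng(3)
T, k = 60, 4
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.array([1.0, 0.5, 0.0, -0.4]) + rng.normal(size=T)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
SSE = np.sum((y - X @ beta_hat) ** 2)      # unrestricted model: S1
SST = np.sum((y - y.mean()) ** 2)          # model with the constant alone: S0 = SST
R2 = 1 - SSE / SST

F_from_sums = ((SST - SSE) / (k - 1)) / (SSE / (T - k))
F_from_R2 = (R2 / (k - 1)) / ((1 - R2) / (T - k))
print(np.isclose(F_from_sums, F_from_R2))  # the two expressions for F coincide
```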
2.2. Tests of linear restrictions in terms of $R^2$

More generally, consider a test of $q$ linearly independent restrictions on $\beta$, and let $S_0$ and $S_1$ denote the error sums of squares from the constrained and unconstrained models. If we set
$$ R_0^2 = 1 - \frac{S_0}{SST}, \qquad R_1^2 = 1 - \frac{S_1}{SST}, $$
then
$$ S_0 = \left(1 - R_0^2\right) SST, \qquad S_1 = \left(1 - R_1^2\right) SST, $$
so the $F$ statistic can be written
$$ F = \frac{(S_0 - S_1)/q}{S_1/(T-k)} = \frac{\left(R_1^2 - R_0^2\right)/q}{\left(1 - R_1^2\right)/(T-k)}. $$
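A numerical illustration of the $R^2$ form of the restriction test (the choice of restricting the last two coefficients to zero, and the simulated data, are assumptions of mine):

```python
import numpy as np

rng = np.random.default_rng(4)
T, k, q = 60, 4, 2
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.array([1.0, 0.5, 0.2, 0.0]) + rng.normal(size=T)

def sse(Z, y):
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return np.sum((y - Z @ b) ** 2)

SST = np.sum((y - y.mean()) ** 2)
S1 = sse(X, y)                 # unconstrained model (all k regressors)
S0 = sse(X[:, :k - q], y)      # constrained model: last q coefficients set to zero
R2_1, R2_0 = 1 - S1 / SST, 1 - S0 / SST

F_from_sums = ((S0 - S1) / q) / (S1 / (T - k))
F_from_R2 = ((R2_1 - R2_0) / q) / ((1 - R2_1) / (T - k))
print(np.isclose(F_from_sums, F_from_R2))
```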
3. Coefficient of determination without centering

Since $R^2$ can take negative values when the model does not contain a constant, $R^2$ has little meaning in this case. In such situations, we can instead use a coefficient where the values of $y_t$ are not centered around their mean.

3.1 Definition $\tilde{R}^2 = 1 - \hat\varepsilon'\hat\varepsilon / y'y$.
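A short sketch comparing the uncentered coefficient of Definition 3.1 with the usual centered one for a model without a constant (simulated data of my own choosing):

```python
import numpy as np

rng = np.random.default_rng(5)
T = 40
X = rng.normal(size=(T, 2))                 # no constant in the model
y = X @ np.array([1.5, -0.7]) + rng.normal(size=T)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
eps_hat = y - X @ beta_hat

R2_uncentered = 1 - (eps_hat @ eps_hat) / (y @ y)                    # Definition 3.1
R2_centered = 1 - (eps_hat @ eps_hat) / np.sum((y - y.mean()) ** 2)  # not guaranteed to lie in [0, 1] here
print(R2_uncentered, R2_centered)
```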
4. Adjusted coefficient of determination: $\bar{R}^2$

4.1. Definition and basic properties

An unattractive property of the $R^2$ coefficient comes from the fact that $R^2$ cannot decrease when explanatory variables are added to the model, even if these have no relevance. Consequently, choosing a model so as to maximize $R^2$ can be misleading. It seems desirable to penalize models that contain too many variables.

Since
$$ R^2 = 1 - \frac{\hat{V}(\hat\varepsilon)}{\hat{V}(y)}, $$
where
$$ \hat{V}(\hat\varepsilon) = \frac{SSE}{T} = \frac{1}{T}\sum_{t=1}^{T}\hat\varepsilon_t^2, \qquad \hat{V}(y) = \frac{SST}{T} = \frac{1}{T}\sum_{t=1}^{T}(y_t - \bar{y})^2, $$
a natural modification consists in replacing $\hat{V}(\hat\varepsilon)$ and $\hat{V}(y)$ by the corresponding unbiased estimators
$$ s^2 = \frac{SSE}{T-k} = \frac{1}{T-k}\sum_{t=1}^{T}\hat\varepsilon_t^2, \qquad s_y^2 = \frac{SST}{T-1} = \frac{1}{T-1}\sum_{t=1}^{T}(y_t - \bar{y})^2. $$
This yields $\bar{R}^2$, the adjusted coefficient of determination.

4.1 Definition $\bar{R}^2 = 1 - s^2/s_y^2 = 1 - \dfrac{SSE/(T-k)}{SST/(T-1)}$.

4.2 Remark Unlike $R^2$, it is not always true that $0 \le \bar{R}^2 \le 1$: we still have $\bar{R}^2 \le 1$, but $\bar{R}^2$ can be negative (see Proposition 4.5).
4.3 Proposition
$$ \bar{R}^2 = 1 - \frac{T-1}{T-k}\left(1 - R^2\right) = R^2 - \frac{k-1}{T-k}\left(1 - R^2\right). $$

PROOF We have
$$ \bar{R}^2 = 1 - \frac{T-1}{T-k}\,\frac{SSE}{SST} = 1 - \frac{T-1}{T-k}\left(1 - R^2\right), $$
hence
$$ 1 - \bar{R}^2 = \frac{T-1}{T-k}\left(1 - R^2\right) = \frac{(T-k) + (k-1)}{T-k}\left(1 - R^2\right) = \left(1 + \frac{k-1}{T-k}\right)\left(1 - R^2\right) $$
and
$$ \bar{R}^2 = 1 - \left(1 - R^2\right) - \frac{k-1}{T-k}\left(1 - R^2\right) = R^2 - \frac{k-1}{T-k}\left(1 - R^2\right). \quad \text{Q.E.D.} $$
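The definition of $\bar{R}^2$ and the two identities of Proposition 4.3 can be verified numerically; the sketch below is illustrative only (data and seed are my own assumptions):

```python
import numpy as np

rng = np.random.default_rng(6)
T, k = 50, 4
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.array([1.0, 0.3, -0.2, 0.0]) + rng.normal(size=T)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
SSE = np.sum((y - X @ beta_hat) ** 2)
SST = np.sum((y - y.mean()) ** 2)
R2 = 1 - SSE / SST

s2 = SSE / (T - k)                  # unbiased estimator of the error variance
s2_y = SST / (T - 1)                # unbiased estimator of the variance of y
R2_bar_def = 1 - s2 / s2_y          # adjusted coefficient, by definition
R2_bar_id1 = 1 - (T - 1) / (T - k) * (1 - R2)     # Proposition 4.3, first form
R2_bar_id2 = R2 - (k - 1) / (T - k) * (1 - R2)    # Proposition 4.3, second form
print(np.isclose(R2_bar_def, R2_bar_id1), np.isclose(R2_bar_def, R2_bar_id2))
```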
4.4 Proposition $\bar{R}^2 \le R^2 \le 1$, with $\bar{R}^2 = R^2$ if and only if ($k = 1$ or $R^2 = 1$).

PROOF The result follows from Proposition 4.3 and the fact that $1 - R^2 \ge 0$. Q.E.D.

4.5 Proposition $\bar{R}^2 \ge 0$ if and only if $R^2 \ge \dfrac{k-1}{T-1}$.
4.6 Remark When several models are compared on the basis of $R^2$ or $\bar{R}^2$, it is important to have the same dependent variable. When the dependent variable ($y$) is the same, maximizing $\bar{R}^2$ is equivalent to minimizing the standard error of the regression
$$ s = \left[\frac{1}{T-k}\sum_{t=1}^{T}\hat\varepsilon_t^2\right]^{1/2}. $$

4.2. Effect of adding one explanatory variable

Consider the two nested models
$$ y_t = \beta_1 + \beta_2 X_{t2} + \cdots + \beta_{k-1} X_{t,k-1} + \varepsilon_t, \qquad t = 1, \ldots, T, \tag{4.1} $$
$$ y_t = \beta_1 + \beta_2 X_{t2} + \cdots + \beta_k X_{tk} + \varepsilon_t, \qquad t = 1, \ldots, T, \tag{4.2} $$
which differ only by the inclusion of the variable $X_k$. We can then show that the value of $\bar{R}^2$ associated with the restricted model (4.1) is larger than the one of model (4.2) if the $t$ statistic for testing $\beta_k = 0$ is smaller than 1 (in absolute value).
4.7 Proposition If $\bar{R}^2_{k-1}$ and $\bar{R}^2_k$ are the values of $\bar{R}^2$ for models (4.1) and (4.2), then
$$ \bar{R}^2_k - \bar{R}^2_{k-1} = \frac{\left(1 - \bar{R}^2_k\right)\left(t_k^2 - 1\right)}{T - k + 1} \tag{4.3} $$
and
$$ \bar{R}^2_k \gtreqless \bar{R}^2_{k-1} \iff |t_k| \gtreqless 1. $$
PROOF By definition,
$$ \bar{R}^2_k = 1 - \frac{s_k^2}{s_y^2} \quad \text{and} \quad \bar{R}^2_{k-1} = 1 - \frac{s_{k-1}^2}{s_y^2}, $$
where $s_k^2 = SS_k/(T-k)$ and $s_{k-1}^2 = SS_{k-1}/(T-k+1)$. $SS_k$ and $SS_{k-1}$ are the sums of squared errors for the models with $k$ and $k-1$ explanatory variables. Since $t_k^2$ is the Fisher statistic for testing $\beta_k = 0$, we have
$$ t_k^2 = \frac{SS_{k-1} - SS_k}{SS_k/(T-k)} = \frac{(T-k+1)\, s_{k-1}^2 - (T-k)\, s_k^2}{s_k^2} = \frac{(T-k+1)\left(1 - \bar{R}^2_{k-1}\right) - (T-k)\left(1 - \bar{R}^2_k\right)}{1 - \bar{R}^2_k}, $$
for $s_{k-1}^2 = s_y^2\left(1 - \bar{R}^2_{k-1}\right)$ and $s_k^2 = s_y^2\left(1 - \bar{R}^2_k\right)$. Consequently,
$$ 1 - \bar{R}^2_{k-1} = \left(1 - \bar{R}^2_k\right)\frac{t_k^2 + (T-k)}{T-k+1} $$
and
$$ \bar{R}^2_k - \bar{R}^2_{k-1} = \left(1 - \bar{R}^2_{k-1}\right) - \left(1 - \bar{R}^2_k\right) = \left(1 - \bar{R}^2_k\right)\left[\frac{t_k^2 + (T-k)}{T-k+1} - 1\right] = \left(1 - \bar{R}^2_k\right)\frac{t_k^2 - 1}{T-k+1}. \quad \text{Q.E.D.} $$
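The identity (4.3) and the $|t_k| \gtreqless 1$ criterion can be illustrated as follows; the simulated data (with a deliberately weak last regressor) and the direct computation of $t_k$ from $s^2 (X'X)^{-1}$ are assumptions of mine, not part of the notes:

```python
import numpy as np

rng = np.random.default_rng(7)
T, k = 50, 4
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.array([1.0, 0.5, -0.3, 0.05]) + rng.normal(size=T)   # last coefficient nearly irrelevant

def fit(Z, y):
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return b, np.sum((y - Z @ b) ** 2)

SST = np.sum((y - y.mean()) ** 2)
_, SS_km1 = fit(X[:, :k - 1], y)             # model (4.1): k-1 regressors
b, SS_k = fit(X, y)                          # model (4.2): k regressors

R2bar_km1 = 1 - (SS_km1 / (T - k + 1)) / (SST / (T - 1))
R2bar_k = 1 - (SS_k / (T - k)) / (SST / (T - 1))

s2 = SS_k / (T - k)
se_k = np.sqrt(s2 * np.linalg.inv(X.T @ X)[k - 1, k - 1])
t_k = b[k - 1] / se_k                        # t statistic for beta_k = 0

lhs = R2bar_k - R2bar_km1
rhs = (1 - R2bar_k) * (t_k ** 2 - 1) / (T - k + 1)   # identity (4.3)
print(np.isclose(lhs, rhs), (R2bar_k > R2bar_km1) == (abs(t_k) > 1))
```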
4.3. Effect of imposing linear restrictions

More generally, we may ask whether imposing a set of $q$ linear restrictions $H_0 : C\beta = r$ will raise or decrease $\bar{R}^2$, where $C : q \times k$, $r : q \times 1$ and $\mathrm{rank}(C) = q$. Let $\bar{R}^2_{H_0}$ and $\bar{R}^2$ be the values of $\bar{R}^2$ for the constrained (by $H_0$) and unconstrained models; similarly, $s_0^2$ and $s^2$ are the values of the corresponding unbiased estimators of the error variance.
4.8 Proposition Let $F$ be the Fisher statistic for testing $H_0$. Then
$$ s_0^2 - s^2 = \frac{q\, s^2}{T-k+q}\,(F - 1) $$
and
$$ s_0^2 \gtreqless s^2 \iff F \gtreqless 1. $$

PROOF If $SS_0$ and $SS$ are the sums of squared errors for the constrained and unconstrained models, we have
$$ s_0^2 = \frac{SS_0}{T-k+q} \quad \text{and} \quad s^2 = \frac{SS}{T-k}, $$
so that
$$ F = \frac{(SS_0 - SS)/q}{SS/(T-k)} = \frac{(T-k+q)\, s_0^2 - (T-k)\, s^2}{q\, s^2} = \frac{T-k+q}{q}\,\frac{s_0^2}{s^2} - \frac{T-k}{q}, $$
hence
$$ s_0^2 = s^2\,\frac{qF + (T-k)}{(T-k) + q}, \qquad s_0^2 - s^2 = s^2\,\frac{q(F-1)}{(T-k) + q}, $$
and
$$ s_0^2 \gtreqless s^2 \iff F \gtreqless 1. \quad \text{Q.E.D.} $$
4.9 Proposition $\displaystyle \bar{R}^2 - \bar{R}^2_{H_0} = \frac{q\left(1 - \bar{R}^2\right)}{T-k+q}\,(F - 1)$ and $\bar{R}^2_{H_0} \lesseqgtr \bar{R}^2 \iff F \gtreqless 1$.

PROOF By definition,
$$ \bar{R}^2_{H_0} = 1 - \frac{s_0^2}{s_y^2} \quad \text{and} \quad \bar{R}^2 = 1 - \frac{s^2}{s_y^2}. $$
Thus, using Proposition 4.8,
$$ \bar{R}^2 - \bar{R}^2_{H_0} = \frac{s_0^2 - s^2}{s_y^2} = \frac{q}{T-k+q}\,\frac{s^2}{s_y^2}\,(F-1) = \frac{q\left(1 - \bar{R}^2\right)}{T-k+q}\,(F-1), $$
hence
$$ \bar{R}^2_{H_0} \lesseqgtr \bar{R}^2 \iff F \gtreqless 1. \quad \text{Q.E.D.} $$
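A joint numerical check of Propositions 4.8 and 4.9, imposing the (illustrative) restriction that the last $q$ coefficients are zero; the data, seed and restriction are my own choices, not from the notes:

```python
import numpy as np

rng = np.random.default_rng(8)
T, k, q = 60, 5, 2
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.array([1.0, 0.4, -0.3, 0.05, 0.0]) + rng.normal(size=T)

def sse(Z, y):
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return np.sum((y - Z @ b) ** 2)

SST = np.sum((y - y.mean()) ** 2)
SS = sse(X, y)                       # unconstrained model
SS0 = sse(X[:, :k - q], y)           # constrained model: last q coefficients set to zero
s2, s2_0 = SS / (T - k), SS0 / (T - k + q)
s2_y = SST / (T - 1)
R2bar, R2bar_0 = 1 - s2 / s2_y, 1 - s2_0 / s2_y

F = ((SS0 - SS) / q) / (SS / (T - k))
print(np.isclose(s2_0 - s2, q * s2 * (F - 1) / (T - k + q)))                   # Proposition 4.8
print(np.isclose(R2bar - R2bar_0, q * (1 - R2bar) * (F - 1) / (T - k + q)))    # Proposition 4.9
print((s2_0 > s2) == (F > 1), (R2bar_0 < R2bar) == (F > 1))
```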
5. Notes on bibliography

The notion of $\bar{R}^2$ was proposed by Theil (1961, p. 213). Several authors have presented detailed discussions of the different concepts of multiple correlation: for example, Theil (1971, Chap. 4), Schmidt (1976) and Maddala (1977, Sections 8.1, 8.2, 8.3, 8.9). The $\bar{R}^2$ concept is criticized by Pesaran (1974). The mean and bias of $R^2$ were studied by Cramer (1987) in the Gaussian case, and by Srivastava, Srivastava and Ullah (1995) in some non-Gaussian cases.
6. Suggested readings
5. Maddala (1977, Sections 8.1, 8.2, 8.3, 8.9): discussion of $R^2$ and $\bar{R}^2$ along with their relation with hypothesis tests.
6. Hendry and Marshall (1983)
7. Cramer (1987)
8. Ohtani and Hasegawa (1993)
References
Cramer, J. S. (1987), Mean and variance of $R^2$ in small and moderate samples, Journal of Econometrics 35, 253–266.

Hendry, D. F. and Marshall, R. C. (1983), On high and low $R^2$ contributions, Oxford Bulletin of Economics and Statistics 45, 313–316.

Maddala, G. S. (1977), Econometrics, McGraw-Hill, New York.

Ohtani, K. and Hasegawa, H. (1993), On small-sample properties of $R^2$ in a linear regression model with multivariate t errors and proxy variables, Econometric Theory 9, 504–515.

Pesaran, M. H. (1974), On the general problem of model selection, Review of Economic Studies 41, 153–171.

Schmidt, P. (1976), Econometrics, Marcel Dekker, New York.

Srivastava, A. K., Srivastava, V. K. and Ullah, A. (1995), The coefficient of determination and its adjusted version in linear regression models, Econometric Reviews 14, 229–240.

Theil, H. (1961), Economic Forecasts and Policy, 2nd Edition, North-Holland, Amsterdam.

Theil, H. (1971), Principles of Econometrics, John Wiley & Sons, New York.