
Introductory Econometrics for Finance

Key concepts
The key terms to be able to define and explain from this chapter are
• financial econometrics
• continuously compounded returns
• time series
• cross-sectional data
• panel data
• pooled data
• continuous data
• discrete data

Appendix: Econometric software package suppliers

Package    Contact information

EViews     QMS Software, Suite 336, 4521 Campus Drive #336, Irvine, CA 92612-2621, USA. Tel: (+1) 949 856 3368; Fax: (+1) 949 856 2044; Web: www.eviews.com
GAUSS      Aptech Systems Inc, PO Box 250, Black Diamond, WA 98010, USA. Tel: (+1) 425 432 7855; Fax: (+1) 425 432 7832; Web: www.aptech.com
LIMDEP     Econometric Software, 15 Gloria Place, Plainview, NY 11803, USA. Tel: (+1) 516 938 5254; Fax: (+1) 516 938 2441; Web: www.limdep.com
MATLAB     The MathWorks Inc., 3 Apple Hill Drive, Natick, MA 01760-2098, USA. Tel: (+1) 508 647 7000; Fax: (+1) 508 647 7001; Web: www.mathworks.com
RATS       Estima, 1560 Sherman Avenue, Evanston, IL 60201, USA. Tel: (+1) 847 864 8772; Fax: (+1) 847 864 6221; Web: www.estima.com
SAS        SAS Institute, 100 Campus Drive, Cary NC 27513-2414, USA. Tel: (+1) 919 677 8000; Fax: (+1) 919 677 4444; Web: www.sas.com
SHAZAM     Northwest Econometrics Ltd., 277 Arbutus Reach, Gibsons, B.C. V0N 1V8, Canada. Tel: -; Fax: (+1) 707 317 5364; Web: shazam.econ.ubc.ca
S-PLUS     Insightful Corporation, 1700 Westlake Avenue North, Suite 500, Seattle, WA 98109-3044, USA. Tel: (+1) 206 283 8802; Fax: (+1) 206 283 8691; Web: www.splus.com
SPSS       SPSS Inc, 233 S. Wacker Drive, 11th Floor, Chicago, IL 60606-6307, USA. Tel: (+1) 800 543 2185; Fax: (+1) 800 841 0064; Web: www.spss.com
TSP        TSP International, PO Box 61015 Station A, Palo Alto, CA 94306, USA. Tel: (+1) 650 326 1927; Fax: (+1) 650 328 4163; Web: www.tspintl.com

2 A brief overview of the classical linear regression model

Learning Outcomes
In this chapter, you will learn how to
• Derive the OLS formulae for estimating parameters and their standard errors
• Explain the desirable properties that a good estimator should have
• Discuss the factors that affect the sizes of standard errors
• Test hypotheses using the test of significance and confidence interval approaches
• Interpret p-values
• Estimate regression models and test single hypotheses in EViews

2.1 What is a regression model?

Regression analysis is almost certainly the most important tool at the econometrician's disposal. But what is regression analysis? In very general terms, regression is concerned with describing and evaluating the relationship between a given variable and one or more other variables. More specifically, regression is an attempt to explain movements in a variable by reference to movements in one or more other variables.

To make this more concrete, denote the variable whose movements the regression seeks to explain by y and the variables which are used to explain those variations by $x_1, x_2, \ldots, x_k$. Hence, in this relatively simple setup, it would be said that variations in k variables (the xs) cause changes in some other variable, y. This chapter will be limited to the case where the model seeks to explain changes in only one variable y (although this restriction will be removed in chapter 6).


Box 2.1 Names for y and the xs in regression models

Names for y             Names for the xs
Dependent variable      Independent variables
Regressand              Regressors
Effect variable         Causal variables
Explained variable      Explanatory variables

There are various completely interchangeable names for y and the xs, and all of these terms will be used synonymously in this book (see box 2.1).

Figure 2.1 Scatter plot of two variables, y and x

2.2 Regression versus correlation

All readers will be aware of the notion and definition of correlation. The correlation between two variables measures the degree of linear association between them. If it is stated that y and x are correlated, it means that y and x are being treated in a completely symmetrical way. Thus, it is not implied that changes in x cause changes in y, or indeed that changes in y cause changes in x. Rather, it is simply stated that there is evidence for a linear relationship between the two variables, and that movements in the two are on average related to an extent given by the correlation coefficient.

In regression, the dependent variable (y) and the independent variable(s) (xs) are treated very differently. The y variable is assumed to be random or 'stochastic' in some way, i.e. to have a probability distribution. The x variables are, however, assumed to have fixed ('non-stochastic') values in repeated samples.1 Regression as a tool is more flexible and more powerful than correlation.

1 Strictly, the assumption that the xs are non-stochastic is stronger than required, an issue that will be discussed in more detail in chapter 4.

2.3 Simple regression

For simplicity, suppose for now that it is believed that y depends on only one x variable. Again, this is of course a severely restricted case, but the case of more explanatory variables will be considered in the next chapter. Three examples of the kind of relationship that may be of interest include:
• How asset returns vary with their level of market risk
• Measuring the long-term relationship between stock prices and dividends
• Constructing an optimal hedge ratio.

Suppose that a researcher has some idea that there should be a relationship between two variables y and x, and that financial theory suggests that an increase in x will lead to an increase in y. A sensible first step to testing whether there is indeed an association between the variables would be to form a scatter plot of them. Suppose that the outcome of this plot is figure 2.1.

In this case, it appears that there is an approximate positive linear relationship between x and y, which means that increases in x are usually accompanied by increases in y, and that the relationship between them can be described approximately by a straight line. It would be possible to draw by hand onto the graph a line that appears to fit the data. The intercept and slope of the line fitted by eye could then be measured from the graph. However, in practice such a method is likely to be laborious and inaccurate.

It would therefore be of interest to determine to what extent this relationship can be described by an equation that can be estimated using a defined procedure. It is possible to use the general equation for a straight line

$$y = \alpha + \beta x \qquad (2.1)$$

to get the line that best 'fits' the data. The researcher would then be seeking to find the values of the parameters or coefficients, $\alpha$ and $\beta$, which would place the line as close as possible to all of the data points taken together.

However, this equation ($y = \alpha + \beta x$) is an exact one. Assuming that this equation is appropriate, if the values of $\alpha$ and $\beta$ had been calculated, then given a value of x, it would be possible to determine with certainty what the value of y would be. Imagine - a model which says with complete certainty what the value of one variable will be given any value of the other! Clearly this model is not realistic. Statistically, it would correspond to the case where the model fitted the data perfectly - that is, all of the data points lay exactly on a straight line. To make the model more realistic, a random disturbance term, denoted by u, is added to the equation, thus

$$y_t = \alpha + \beta x_t + u_t \qquad (2.2)$$

where the subscript t (= 1, 2, 3, ...) denotes the observation number. The disturbance term can capture a number of features (see box 2.2).

Box 2.2 Reasons for the inclusion of the disturbance term
• Even in the general case where there is more than one explanatory variable, some determinants of $y_t$ will always in practice be omitted from the model. This might, for example, arise because the number of influences on y is too large to place in a single model, or because some determinants of y may be unobservable or not measurable.
• There may be errors in the way that y is measured which cannot be modelled.
• There are bound to be random outside influences on y that again cannot be modelled. For example, a terrorist attack, a hurricane or a computer failure could all affect financial asset returns in a way that cannot be captured in a model and cannot be forecast reliably. Similarly, many researchers would argue that human behaviour has an inherent randomness and unpredictability!

So how are the appropriate values of $\alpha$ and $\beta$ determined? $\alpha$ and $\beta$ are chosen so that the (vertical) distances from the data points to the fitted line are minimised (so that the line fits the data as closely as possible). The parameters are thus chosen to minimise collectively the (vertical) distances from the data points to the fitted line. This could be done by 'eye-balling' the data and, for each set of variables y and x, one could form a scatter plot and draw on a line that looks as if it fits the data well by hand, as in figure 2.2.

Figure 2.2 Scatter plot of two variables with a line of best fit chosen by eye

Note that the vertical distances are usually minimised rather than the horizontal distances or those taken perpendicular to the line. This arises as a result of the assumption that x is fixed in repeated samples, so that the problem becomes one of determining the appropriate model for y given (or conditional upon) the observed values of x.

This 'eye-balling' procedure may be acceptable if only indicative results are required, but of course this method, as well as being tedious, is likely to be imprecise. The most common method used to fit a line to the data is known as ordinary least squares (OLS). This approach forms the workhorse of econometric model estimation, and will be discussed in detail in this and subsequent chapters.

Two alternative estimation methods (for determining the appropriate values of the coefficients $\alpha$ and $\beta$) are the method of moments and the method of maximum likelihood. A generalised version of the method of moments, due to Hansen (1982), is popular, but beyond the scope of this book. The method of maximum likelihood is also widely employed, and will be discussed in detail in chapter 8.

Suppose now, for ease of exposition, that the sample of data contains only five observations. The method of OLS entails taking each vertical distance from the point to the line, squaring it and then minimising the total sum of the areas of squares (hence 'least squares'), as shown in figure 2.3. This can be viewed as equivalent to minimising the sum of the areas of the squares drawn from the points to the line.

Tightening up the notation, let $y_t$ denote the actual data point for observation t and let $\hat{y}_t$ denote the fitted value from the regression line.

In other words, for the given value of x of this observation t, $\hat{y}_t$ is the value for y which the model would have predicted. Note that a hat (^) over a variable or parameter is used to denote a value estimated by a model. Finally, let $\hat{u}_t$ denote the residual, which is the difference between the actual value of y and the value fitted by the model for this data point - i.e. $(y_t - \hat{y}_t)$. This is shown for just one observation t in figure 2.4.

Figure 2.3 Method of OLS fitting a line to the data by minimising the sum of squared residuals

Figure 2.4 Plot of a single observation, together with the line of best fit, the residual and the fitted value

What is done is to minimise the sum of the $\hat{u}_t^2$. The reason that the sum of the squared distances is minimised rather than, for example, finding the sum of $\hat{u}_t$ that is as close to zero as possible, is that in the latter case some points will lie above the line while others lie below it. Then, when the sum to be made as close to zero as possible is formed, the points above the line would count as positive values, while those below would count as negatives. So these distances will in large part cancel each other out, which would mean that one could fit virtually any line to the data, so long as the sum of the distances of the points above the line and the sum of the distances of the points below the line were the same. In that case, there would not be a unique solution for the estimated coefficients. In fact, any fitted line that goes through the mean of the observations (i.e. $\bar{x}, \bar{y}$) would set the sum of the $\hat{u}_t$ to zero. However, taking the squared distances ensures that all deviations that enter the calculation are positive and therefore do not cancel out.

So minimising the sum of squared distances is given by minimising $(\hat{u}_1^2 + \hat{u}_2^2 + \hat{u}_3^2 + \hat{u}_4^2 + \hat{u}_5^2)$, or minimising

$$\left(\sum_{t=1}^{5} \hat{u}_t^2\right)$$

This sum is known as the residual sum of squares (RSS) or the sum of squared residuals. But what is $\hat{u}_t$? Again, it is the difference between the actual point and the line, $y_t - \hat{y}_t$. So minimising $\sum_t \hat{u}_t^2$ is equivalent to minimising $\sum_t (y_t - \hat{y}_t)^2$.

Letting $\hat{\alpha}$ and $\hat{\beta}$ denote the values of $\alpha$ and $\beta$ selected by minimising the RSS, respectively, the equation for the fitted line is given by $\hat{y}_t = \hat{\alpha} + \hat{\beta} x_t$. Now let L denote the RSS, which is also known as a loss function. Take the summation over all of the observations, i.e. from t = 1 to T, where T is the number of observations

$$L = \sum_{t=1}^{T} (y_t - \hat{y}_t)^2 = \sum_{t=1}^{T} (y_t - \hat{\alpha} - \hat{\beta} x_t)^2 \qquad (2.3)$$

L is minimised with respect to (w.r.t.) $\hat{\alpha}$ and $\hat{\beta}$, to find the values of $\alpha$ and $\beta$ which minimise the residual sum of squares to give the line that is closest to the data. So L is differentiated w.r.t. $\hat{\alpha}$ and $\hat{\beta}$, setting the first derivatives to zero. A derivation of the ordinary least squares (OLS) estimator is given in the appendix to this chapter. The coefficient estimators for the slope and the intercept are given by

$$\hat{\beta} = \frac{\sum x_t y_t - T\bar{x}\bar{y}}{\sum x_t^2 - T\bar{x}^2} \qquad (2.4)$$

$$\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} \qquad (2.5)$$
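The full derivation is left to the chapter appendix, but the two first-order conditions are short enough to sketch here. What follows is only an outline, assuming nothing beyond the loss function in (2.3) and the definitions $\bar{x} = \frac{1}{T}\sum x_t$ and $\bar{y} = \frac{1}{T}\sum y_t$; it is not a substitute for the appendix.

```latex
\begin{align*}
\frac{\partial L}{\partial \hat{\alpha}}
  &= -2\sum_{t=1}^{T}\left(y_t - \hat{\alpha} - \hat{\beta}x_t\right) = 0
  \;\Rightarrow\; T\bar{y} - T\hat{\alpha} - \hat{\beta}T\bar{x} = 0
  \;\Rightarrow\; \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} \\[4pt]
\frac{\partial L}{\partial \hat{\beta}}
  &= -2\sum_{t=1}^{T}x_t\left(y_t - \hat{\alpha} - \hat{\beta}x_t\right) = 0
  \;\Rightarrow\; \sum_{t=1}^{T}x_t y_t - \hat{\alpha}T\bar{x} - \hat{\beta}\sum_{t=1}^{T}x_t^2 = 0 \\[4pt]
\text{substituting } \hat{\alpha}:\quad
  & \sum_{t=1}^{T}x_t y_t - T\bar{x}\bar{y} + \hat{\beta}T\bar{x}^2 - \hat{\beta}\sum_{t=1}^{T}x_t^2 = 0
  \;\Rightarrow\; \hat{\beta} = \frac{\sum x_t y_t - T\bar{x}\bar{y}}{\sum x_t^2 - T\bar{x}^2}
\end{align*}
```

The first condition also shows why the OLS residuals always sum to zero when the regression includes an intercept: it is exactly the statement $\sum_t \hat{u}_t = 0$.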
Equations (2.4) and (2.5) state that, given only the sets of observations $x_t$ and $y_t$, it is always possible to calculate the values of the two parameters, $\hat{\alpha}$ and $\hat{\beta}$, that best fit the set of data. Equation (2.4) is the easiest formula to use to calculate the slope estimate, but the formula can also be written, more intuitively, as

$$\hat{\beta} = \frac{\sum (x_t - \bar{x})(y_t - \bar{y})}{\sum (x_t - \bar{x})^2} \qquad (2.6)$$

which is equivalent to the sample covariance between x and y divided by the sample variance of x.

To reiterate, this method of finding the optimum is known as OLS. It is also worth noting that it is obvious from the equation for $\hat{\alpha}$ that the regression line will go through the mean of the observations - i.e. that the point $(\bar{x}, \bar{y})$ lies on the regression line.

Example 2.1
Suppose that some data have been collected on the excess returns on a fund manager's portfolio ('fund XXX') together with the excess returns on a market index as shown in table 2.1.

Table 2.1 Sample data on fund XXX to motivate OLS estimation

Year, t    Excess return on fund XXX      Excess return on market index
           = r_XXX,t - rf_t               = rm_t - rf_t
1          17.8                           13.7
2          39.0                           23.2
3          12.8                            6.9
4          24.2                           16.8
5          17.2                           12.3

The fund manager has some intuition that the beta (in the CAPM framework) on this fund is positive, and she therefore wants to find whether there appears to be a relationship between x and y given the data. Again, the first stage could be to form a scatter plot of the two variables (figure 2.5).

Figure 2.5 Scatter plot of excess returns on fund XXX versus excess returns on the market portfolio

Clearly, there appears to be a positive, approximately linear relationship between x and y, although there is not much data on which to base this conclusion! Plugging the five observations in to make up the formulae given in (2.4) and (2.5) would lead to the estimates $\hat{\alpha} = -1.74$ and $\hat{\beta} = 1.64$. The fitted line would be written as

$$\hat{y}_t = -1.74 + 1.64 x_t \qquad (2.7)$$

where $x_t$ is the excess return of the market portfolio over the risk free rate (i.e. rm - rf), also known as the market risk premium.

2.3.1 What are $\hat{\alpha}$ and $\hat{\beta}$ used for?

This question is probably best answered by posing another question. If an analyst tells you that she expects the market to yield a return 20% higher than the risk-free rate next year, what would you expect the return on fund XXX to be?

The expected value of y = '-1.74 + 1.64 × value of x', so plug x = 20 into (2.7)

$$\hat{y}_t = -1.74 + 1.64 \times 20 = 31.06 \qquad (2.8)$$

Thus, for a given expected market risk premium of 20%, and given its riskiness, fund XXX would be expected to earn an excess over the risk-free rate of approximately 31%. In this setup, the regression beta is also the CAPM beta, so that fund XXX has an estimated beta of 1.64, suggesting that the fund is rather risky. In this case, the residual sum of squares reaches its minimum value of 30.33 with these OLS coefficient values.

Although it may be obvious, it is worth stating that it is not advisable to conduct a regression analysis using only five observations! Thus the results presented here can be considered indicative and for illustration of the technique only. Some further discussions on appropriate sample sizes for regression analysis are given in chapter 4.

The coefficient estimate of 1.64 for $\beta$ is interpreted as saying that, 'if x increases by 1 unit, y will be expected, everything else being equal, to increase by 1.64 units'. Of course, if $\hat{\beta}$ had been negative, a rise in x would on average cause a fall in y.
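The arithmetic of example 2.1 can be checked directly from the formulae. The following is a minimal sketch in plain Python, assuming the five observations of table 2.1; the function names (`ols_fit`, `slope_from_deviations`, `residuals`) are illustrative choices made here and do not come from any package discussed in this book.

```python
# Excess returns (%) from table 2.1: x is the market index, y is fund XXX.
market = [13.7, 23.2, 6.9, 16.8, 12.3]
fund = [17.8, 39.0, 12.8, 24.2, 17.2]


def ols_fit(x, y):
    """Return (alpha_hat, beta_hat) using equations (2.4) and (2.5)."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    sum_xy = sum(xt * yt for xt, yt in zip(x, y))
    sum_xx = sum(xt ** 2 for xt in x)
    beta_hat = (sum_xy - T * x_bar * y_bar) / (sum_xx - T * x_bar ** 2)
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


def slope_from_deviations(x, y):
    """Slope via equation (2.6): deviations from the sample means."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    numerator = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
    denominator = sum((xt - x_bar) ** 2 for xt in x)
    return numerator / denominator


def residuals(x, y, alpha_hat, beta_hat):
    """Actual value minus fitted value for each observation."""
    return [yt - (alpha_hat + beta_hat * xt) for xt, yt in zip(x, y)]


alpha_hat, beta_hat = ols_fit(market, fund)
u_hat = residuals(market, fund, alpha_hat, beta_hat)
rss = sum(u ** 2 for u in u_hat)

# Forecast for a market risk premium of 20%, using the rounded estimates
# so that the figure matches the hand calculation in the text.
forecast = round(alpha_hat, 2) + round(beta_hat, 2) * 20

print(f"alpha_hat = {alpha_hat:.2f}, beta_hat = {beta_hat:.2f}")
print(f"RSS = {rss:.2f}, forecast = {forecast:.2f}")
```

Run as written, this should reproduce the estimates of -1.74 and 1.64, the RSS of 30.33 and the forecast of 31.06 quoted above. Two side checks are worth making: `slope_from_deviations(market, fund)` agrees with `beta_hat`, confirming that (2.4) and (2.6) are the same estimator, and `sum(u_hat)` is zero up to rounding error, as it must be for a fitted line that passes through the point of means.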
I

$\hat{\alpha}$, the intercept coefficient estimate, is interpreted as the value that would be taken by the dependent variable y if the independent variable x took a value of zero. 'Units' here refer to the units of measurement of $x_t$ and $y_t$. So, for example, suppose that $\hat{\beta} = 1.64$, x is measured in per cent and y is measured in thousands of US dollars. Then it would be said that if x rises by 1%, y will be expected to rise on average by $1.64 thousand (or $1,640). Note that changing the scale of y or x will make no difference to the overall results since the coefficient estimates will change by an off-setting factor to leave the overall relationship between y and x unchanged (see Gujarati, 2003, pp. 169-173 for a proof). Thus, if the units of measurement of y were hundreds of dollars instead of thousands, and everything else remains unchanged, the slope coefficient estimate would be 16.4, so that a 1% increase in x would lead to an increase in y of $16.4 hundreds (or $1,640) as before. All other properties of the OLS estimator discussed below are also invariant to changes in the scaling of the data.

A word of caution is, however, in order concerning the reliability of estimates of the constant term. Although the strict interpretation of the intercept is indeed as stated above, in practice, it is often the case that there are no values of x close to zero in the sample. In such instances, estimates of the value of the intercept will be unreliable. For example, consider figure 2.6, which demonstrates a situation where no points are close to the y-axis.

Figure 2.6 No observations close to the y-axis

In such cases, one could not expect to obtain robust estimates of the value of y when x is zero as all of the information in the sample pertains to the case where x is considerably larger than zero.

A similar caution should be exercised when producing predictions for y using values of x that are a long way outside the range of values in the sample. In example 2.1, x takes values between 7% and 23% in the available data. So, it would not be advisable to use this model to determine the expected excess return on the fund if the expected excess return on the market were, say 1% or 30% or -5% (i.e. the market was expected to fall).

2.4 Some further terminology

2.4.1 The population and the sample

The population is the total collection of all objects or people to be studied. For example, in the context of determining the relationship between risk and return for UK equities, the population of interest would be all time series observations on all stocks traded on the London Stock Exchange (LSE).

The population may be either finite or infinite, while a sample is a selection of just some items from the population. In general, either all of the observations for the entire population will not be available, or they may be so many in number that it is infeasible to work with them, in which case a sample of data is taken for analysis. The sample is usually random, and it should be representative of the population of interest. A random sample is a sample in which each individual item in the population is equally likely to be drawn. The size of the sample is the number of observations that are available, or that it is decided to use, in estimating the regression equation.

2.4.2 The data generating process, the population regression function and the sample regression function

The population regression function (PRF) is a description of the model that is thought to be generating the actual data and it represents the true relationship between the variables. The population regression function is also known as the data generating process (DGP). The PRF embodies the true values of $\alpha$ and $\beta$, and is expressed as

$$y_t = \alpha + \beta x_t + u_t \qquad (2.9)$$

Note that there is a disturbance term in this equation, so that even if one had at one's disposal the entire population of observations on x and y,

it would still in general not be possible to obtain a perfect fit of the line to the data. In some textbooks, a distinction is drawn between the PRF (the underlying true relationship between y and x) and the DGP (the process describing the way that the actual observations on y come about), although in this book, the two terms will be used synonymously.

The sample regression function, SRF, is the relationship that has been estimated using the sample observations, and is often written as

$$\hat{y}_t = \hat{\alpha} + \hat{\beta} x_t \qquad (2.10)$$

Notice that there is no error or residual term in (2.10); all this equation states is that given a particular value of x, multiplying it by $\hat{\beta}$ and adding $\hat{\alpha}$ will give the model fitted or expected value for y, denoted $\hat{y}$. It is also possible to write

$$y_t = \hat{\alpha} + \hat{\beta} x_t + \hat{u}_t \qquad (2.11)$$

Equation (2.11) splits the observed value of y into two components: the fitted value from the model, and a residual term.

The SRF is used to infer likely values of the PRF. That is, the estimates $\hat{\alpha}$ and $\hat{\beta}$ are constructed, for the sample of data at hand, but what is really of interest is the true relationship between x and y - in other words, the PRF is what is really wanted, but all that is ever available is the SRF! However, what can be said is how likely it is, given the figures calculated for $\hat{\alpha}$ and $\hat{\beta}$, that the corresponding population parameters take on certain values.

2.4.3 Linearity and possible forms for the regression function

In order to use OLS, a model that is linear is required. This means that, in the simple bivariate case, the relationship between x and y must be capable of being expressed diagrammatically using a straight line. More specifically, the model must be linear in the parameters ($\alpha$ and $\beta$), but it does not necessarily have to be linear in the variables (y and x). By 'linear in the parameters', it is meant that the parameters are not multiplied together, divided, squared, or cubed, etc.

Models that are not linear in the variables can often be made to take a linear form by applying a suitable transformation or manipulation. For example, consider the following exponential regression model

$$Y_t = A X_t^{\beta} e^{u_t} \qquad (2.12)$$

Taking logarithms of both sides, applying the laws of logs and rearranging the right-hand side (RHS)

$$\ln Y_t = \ln(A) + \beta \ln X_t + u_t \qquad (2.13)$$

where A and $\beta$ are parameters to be estimated. Now let $\alpha = \ln(A)$, $y_t = \ln Y_t$ and $x_t = \ln X_t$

$$y_t = \alpha + \beta x_t + u_t \qquad (2.14)$$

This is known as an exponential regression model since Y varies according to some exponent (power) function of X. In fact, when a regression equation is expressed in 'double logarithmic form', which means that both the dependent and the independent variables are natural logarithms, the coefficient estimates are interpreted as elasticities (strictly, they are unit changes on a logarithmic scale). Thus a coefficient estimate of 1.2 for $\hat{\beta}$ in (2.13) or (2.14) is interpreted as stating that 'a rise in X of 1% will lead on average, everything else being equal, to a rise in Y of 1.2%'. Conversely, for y and x in levels rather than logarithmic form (e.g. (2.9)), the coefficients denote unit changes as described above.

Similarly, if theory suggests that x should be inversely related to y according to a model of the form

$$y_t = \alpha + \frac{\beta}{x_t} + u_t$$

the regression can be estimated using OLS by setting

$$z_t = \frac{1}{x_t}$$

and regressing y on a constant and z. Clearly, then, a surprisingly varied array of models can be estimated using OLS by making suitable transformations to the variables. On the other hand, some models are intrinsically non-linear, e.g.

$$y_t = \alpha + \beta x_t^{\gamma} + u_t$$

Such models cannot be estimated using OLS, but might be estimable using a non-linear estimation method (see chapter 8).

2.4.4 Estimator or estimate?

Estimators are the formulae used to calculate the coefficients - for example, the expressions given in (2.4) and (2.5) above, while the estimates, on the other hand, are the actual numerical values for the coefficients that are obtained from the sample.
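The last two subsections can be illustrated together. The sketch below simulates data from the exponential model (2.12), takes logs as in (2.13) and (2.14), and applies OLS to the transformed variables. Everything specific in it is an assumption made for illustration: the 'true' values A = 2.5 and $\beta$ = 1.2, the noise level, the sample size, the random seed and the function names. In the terminology just introduced, the function `ols_fit` plays the role of the estimator, while the numbers it returns for this particular sample are the estimates.

```python
import math
import random


def ols_fit(x, y):
    """The estimator: formulae (2.4) and (2.5) applied to any sample."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    sum_xy = sum(xt * yt for xt, yt in zip(x, y))
    sum_xx = sum(xt ** 2 for xt in x)
    beta_hat = (sum_xy - T * x_bar * y_bar) / (sum_xx - T * x_bar ** 2)
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


def simulate_power_law(A, beta, n, noise_sd, seed):
    """Draw (X, Y) pairs from Y = A * X**beta * exp(u), u ~ N(0, noise_sd**2)."""
    rng = random.Random(seed)
    X = [rng.uniform(1.0, 10.0) for _ in range(n)]
    Y = [A * x ** beta * math.exp(rng.gauss(0.0, noise_sd)) for x in X]
    return X, Y


# Hypothetical 'true' parameters, chosen only for this illustration.
X, Y = simulate_power_law(A=2.5, beta=1.2, n=200, noise_sd=0.05, seed=0)

# The model is non-linear in the variables but linear in the parameters
# once logs are taken: ln Y = ln(A) + beta * ln X + u.
x = [math.log(v) for v in X]
y = [math.log(v) for v in Y]

# The estimates: numerical values obtained from this one sample.
alpha_hat, beta_hat = ols_fit(x, y)
A_hat = math.exp(alpha_hat)

print(f"estimated elasticity beta_hat = {beta_hat:.3f}")
print(f"estimated scale A_hat = {A_hat:.3f}")
```

The slope estimate should land close to the assumed elasticity of 1.2, and exponentiating the intercept recovers a value close to the assumed A. A different seed gives different estimates from the same estimator, which is precisely the distinction drawn in the text.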

2.5 Simple linear regression in EViews - estimation of an optimal hedge ratio

This section shows how to run a bivariate regression using EViews. The example considers the situation where an investor wishes to hedge a long position in the S&P500 (or its constituent stocks) using a short position in futures contracts. Many academic studies assume that the objective of hedging is to minimise the variance of the hedged portfolio returns. If this is the case, then the appropriate hedge ratio (the number of units of the futures asset to sell per unit of the spot asset held) will be the slope estimate (i.e. $\hat{\beta}$) in a regression where the dependent variable is a time series of spot returns and the independent variable is a time series of futures returns.2

This regression will be run using the file 'sandphedge.xls', which contains monthly returns for the S&P500 index (in column 2) and S&P500 futures (in column 3). As described in chapter 1, the first step is to open an appropriately dimensioned workfile. Open EViews and click on File/New/Workfile; choose Dated - regular frequency and Monthly frequency data. The start date is 2002:02 and the end date is 2007:07. Then import the Excel file by clicking Import and Read Text-Lotus-Excel. The data start in B2 and as for the previous example in chapter 1, the first column contains only dates which we do not need to read in. In 'Names for series or Number if named in file', we can write Spot Futures. The two imported series will now appear as objects in the workfile and can be verified by checking a couple of entries at random against the original Excel file.

The first step is to transform the levels of the two series into percentage returns. It is common in academic research to use continuously compounded returns rather than simple returns. To achieve this (i.e. to produce continuously compounded returns), click on Genr and in the 'Enter Equation' dialog box, enter rfutures=100*dlog(futures). Then click Genr again and do the same for the spot series: rspot=100*dlog(spot). Do not forget to Save the workfile. Continue to re-save it at regular intervals to ensure that no work is lost!

Before proceeding to estimate the regression, now that we have imported more than one series, we can examine a number of descriptive statistics together and measures of association between the series. For example, click Quick and Group Statistics. From there you will see that it is possible to calculate the covariances or correlations between series and a number of other measures that will be discussed later in the book. For now, click on Descriptive Statistics and Common Sample.3 In the dialog box that appears, type rspot rfutures and click OK. Some summary statistics for the spot and futures are presented, as displayed in screenshot 2.1, and these are quite similar across the two series, as one would expect.

Screenshot 2.1 Summary statistics for spot and futures

Note that the number of observations has reduced from 66 for the levels of the series to 65 when we computed the returns (as one observation is 'lost' in constructing the t - 1 value of the prices in the returns formula). If you want to save the summary statistics, you must name them by clicking Name and then choose a name, e.g. Descstats. The default name is 'group01', which could have also been used. Click OK.

We can now proceed to estimate the regression. There are several ways to do this, but the easiest is to select Quick and then Estimate Equation.

2 See chapter 8 for a detailed discussion of why this is the appropriate hedge ratio.
3 'Common sample' will use only the part of the sample that is available for all the series selected, whereas 'Individual sample' will use all available observations for each individual series. In this case, the number of observations is the same for both series and so identical results would be observed for both options.

nI,odu(tory Econotnctrics far Finoncc

A brief ovcrttew

0J

lhc ri0sslcal linror regression ntodel

E@@[il estimation
Equation window
Equdtion specrfrcation Dependent variable follovied by list of regressofs includrng AR[lA and PDL terns, OR an explicit equation like Y=c(1)+c(2)-X.

r7fttf,f{il:rrrEstimation resu|rs Dependent Variable RSPOT hlethod Least Squares Date.08/09/07 Tinre 10 17 Sanrple (adjusted) 20021'103 2007 tll07

lncluded observations. 65 after adJustnrents

fspot c rfutures

Coeflicient Std

Enor t-statistic Prob

Estinration settings
Nlethod:
I

t5 -

Least Squares (i',lLS and ARMA)

sampte, i zoozr,roz

ioozuoz -..1

0363302 0444369 0817569 041S7 c RFUTURES oiiiaoo 0133790 0925781 03581 0 421 203 O 013422 llean dependent var i-rquur.O 3 542992 S D dependent var Adltrsted R-squareo u 002238 , ;;695' Akaike info crrterion 5 400342 S E of regressrorr , criterion 5 467246 Sunr squarec resro , J2.4960 Schtvarz 5 426740 Hannan'Ournn criler -'r i3 5i i i Log likelihood 2 16689 Durbin-Watson stat u 857070 F-statistic Prob(F-statistic) 0 358093
1

l--o-_lfc.^*a
will be presented with a dialog will look like screenshot 2.2.
box, which, when

it

has been completed,

In the 'Equation Specification' window, you insert the list of variables to be used, with the dependent variable ($y$) first, and including a constant (c), so type rspot c rfutures. Note that it would have been possible to write this in an equation format as rspot = c(1) + c(2)*rfutures, but this is more cumbersome.

In the 'Estimation settings' box, the default estimation method is OLS and the default sample is the whole sample, and these need not be modified. Click OK and the regression results will appear, as in screenshot 2.3. The parameter estimates for the intercept ($\hat{\alpha}$) and slope ($\hat{\beta}$) are 0.36 and 0.12 respectively. Name the regression results returnreg, and it will now appear as a new object in the list. A large number of other statistics are also presented in the regression output - the purpose and interpretation of these will be discussed later in this and subsequent chapters.

Now estimate a regression for the levels of the series rather than the returns (i.e. run a regression of spot on a constant and futures) and examine the parameter estimates. The return regression slope parameter estimated above measures the optimal hedge ratio and also measures the short run relationship between the two series. By contrast, the slope parameter in a regression using the raw spot and futures indices (or the log of the spot series and the log of the futures series) can be interpreted as measuring the long run relationship between them. This issue of long and short runs will be discussed in detail in chapter 4. For now, click Quick/Estimate Equation and enter the variables spot c futures in the Equation Specification dialog box, click OK, then name the regression results 'levelreg'. The slope estimate ($\hat{\beta}$) in this regression is 0.98. The intercept can be considered to approximate the cost of carry, while as expected, the long-term relationship between spot and futures prices is almost 1:1 - see chapter 7 for further discussion of the estimation and interpretation of this long-term relationship. Finally, click the Save button to save the whole workfile.
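The intercept and slope that a package reports for a regression on a constant and one variable can be reproduced by hand. The following is a minimal sketch in Python rather than EViews; the function name `ols_bivariate` and the four made-up observations are assumptions for illustration only and are not the spot and futures data in the workfile.

```python
def ols_bivariate(x, y):
    """Return (alpha_hat, beta_hat) for the regression y = alpha + beta * x + u."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    # Slope: sum of cross-deviations divided by sum of squared deviations of x.
    s_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
    s_xx = sum((xt - x_bar) ** 2 for xt in x)
    beta_hat = s_xy / s_xx
    # Intercept: forces the fitted line through the point of sample means.
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


# Hypothetical percentage returns (NOT the series used in the text).
rfutures = [1.0, 2.0, 3.0, 4.0]
rspot = [1.0, 3.0, 2.0, 4.0]
alpha_hat, beta_hat = ols_bivariate(rfutures, rspot)
print(round(alpha_hat, 4), round(beta_hat, 4))
```

Typing `rspot c rfutures` in the dialog asks for the same two numbers, with the dependent variable listed first and `c` standing for the constant.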

2.6 The assumptions underlying the classical linear regression model

The model $y_t = \alpha + \beta x_t + u_t$ that has been derived above, together with the assumptions listed below, is known as the classical linear regression model (CLRM). Data for $x_t$ is observable, but since $y_t$ also depends on $u_t$, it is necessary to be specific about how the $u_t$ are generated. The set of assumptions shown in box 2.3 are usually made concerning the $u_t$s, the unobservable error or disturbance terms. Note that no assumptions are made concerning their observable counterparts, the estimated model's residuals.

Box 2.3
  Technical notation                  Interpretation
  (1) $E(u_t) = 0$                    The errors have zero mean
  (2) $var(u_t) = \sigma^2 < \infty$  The variance of the errors is constant and finite over all values of $x_t$
  (3) $cov(u_i, u_j) = 0$             The errors are linearly independent of one another
  (4) $cov(u_t, x_t) = 0$             There is no relationship between the error and corresponding $x$ variate

As long as assumption 1 holds, assumption 4 can be equivalently written $E(x_t u_t) = 0$. Both formulations imply that the regressor is orthogonal to (i.e. unrelated to) the error term. An alternative assumption to 4, which is slightly stronger, is that the $x_t$ are non-stochastic or fixed in repeated samples. This means that there is no sampling variation in $x_t$, and that its value is determined outside the model. A fifth assumption is required to make valid inferences about the population parameters (the actual $\alpha$ and $\beta$) from the sample parameters ($\hat{\alpha}$ and $\hat{\beta}$) estimated using a finite amount of data:

  (5) $u_t \sim N(0, \sigma^2)$ - i.e. that $u_t$ is normally distributed

2.7 Properties of the OLS estimator

If assumptions 1-4 hold, then the estimators $\hat{\alpha}$ and $\hat{\beta}$ determined by OLS will have a number of desirable properties, and are known as Best Linear Unbiased Estimators (BLUE). What does this acronym stand for?

* 'Estimator' - $\hat{\alpha}$ and $\hat{\beta}$ are estimators of the true value of $\alpha$ and $\beta$
* 'Linear' - $\hat{\alpha}$ and $\hat{\beta}$ are linear estimators - that means that the formulae for $\hat{\alpha}$ and $\hat{\beta}$ are linear combinations of the random variables (in this case, $y$)
* 'Unbiased' - on average, the actual values of $\hat{\alpha}$ and $\hat{\beta}$ will be equal to their true values
* 'Best' - means that the OLS estimator $\hat{\beta}$ has minimum variance among the class of linear unbiased estimators; the Gauss-Markov theorem proves that the OLS estimator is best by examining an arbitrary alternative linear unbiased estimator and showing in all cases that it must have a variance no smaller than the OLS estimator.

Under assumptions 1-4 listed above, the OLS estimator can be shown to have the desirable properties that it is consistent, unbiased and efficient. Unbiasedness and efficiency have already been discussed above, and consistency is an additional desirable property. These three characteristics will now be discussed in turn.

2.7.1 Consistency

The least squares estimators $\hat{\alpha}$ and $\hat{\beta}$ are consistent. One way to state this algebraically for $\hat{\beta}$ (with the obvious modifications made for $\hat{\alpha}$) is

$$\lim_{T \to \infty} \Pr\left[\,|\hat{\beta} - \beta| > \delta\,\right] = 0 \quad \forall\, \delta > 0$$

This is a technical way of stating that the probability (Pr) that $\hat{\beta}$ is more than some arbitrary fixed distance $\delta$ away from its true value tends to zero as the sample size tends to infinity, for all positive values of $\delta$. In the limit (i.e. for an infinite number of observations), the probability of the estimator being different from the true value is zero. That is, the estimates will converge to their true values as the sample size increases to infinity. Consistency is thus a large sample, or asymptotic property. The assumptions that $E(x_t u_t) = 0$ and $E(u_t) = 0$ are sufficient to derive the consistency of the OLS estimator.

2.7.2 Unbiasedness

The least squares estimates of $\alpha$ and $\beta$ are unbiased. That is

$$E(\hat{\alpha}) = \alpha$$

and

$$E(\hat{\beta}) = \beta$$

Thus, on average, the estimated values for the coefficients will be equal to their true values. That is, there is no systematic overestimation or underestimation of the true coefficients. To prove this also requires the assumption that $cov(u_t, x_t) = 0$. Clearly, unbiasedness is a stronger condition than consistency, since it holds for small as well as large samples (i.e. for all sample sizes).
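Unbiasedness and consistency can be illustrated, though not proved, by simulation. The sketch below is an illustrative assumption rather than part of the text: it fixes the regressor in repeated samples, draws normal errors, and uses hypothetical true values $\alpha = 1$, $\beta = 0.5$ and $\sigma = 2$. The average slope estimate should sit close to 0.5 for both sample sizes, while the spread of the estimates should shrink as $T$ grows.

```python
import random


def ols_slope(x, y):
    """OLS slope estimate for a regression of y on a constant and x."""
    T = len(x)
    x_bar, y_bar = sum(x) / T, sum(y) / T
    s_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
    s_xx = sum((xt - x_bar) ** 2 for xt in x)
    return s_xy / s_xx


def simulate_slopes(T, n_reps, alpha=1.0, beta=0.5, sigma=2.0, seed=0):
    """Draw n_reps samples of size T and return the slope estimate from each."""
    rng = random.Random(seed)
    x = [float(t) for t in range(1, T + 1)]  # regressor fixed in repeated samples
    slopes = []
    for _ in range(n_reps):
        y = [alpha + beta * xt + rng.gauss(0.0, sigma) for xt in x]
        slopes.append(ols_slope(x, y))
    return slopes


def mean(values):
    return sum(values) / len(values)


def spread(values):
    """Standard deviation of the estimates across replications."""
    m = mean(values)
    return (sum((v - m) ** 2 for v in values) / len(values)) ** 0.5


small = simulate_slopes(T=10, n_reps=2000)
large = simulate_slopes(T=100, n_reps=2000)
print("T=10 :", mean(small), spread(small))
print("T=100:", mean(large), spread(large))
```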

2.7.3 Efficiency

An estimator $\hat{\beta}$ of a parameter $\beta$ is said to be efficient if no other estimator has a smaller variance. Broadly speaking, if the estimator is efficient, it will be minimising the probability that it is a long way off from the true value of $\beta$. In other words, if the estimator is 'best', the uncertainty associated with estimation will be minimised for the class of linear unbiased estimators. A technical way to state this would be to say that an efficient estimator would have a probability distribution that is narrowly dispersed around the true value.

2.8 Precision and standard errors

Any set of regression estimates $\hat{\alpha}$ and $\hat{\beta}$ are specific to the sample used in their estimation. In other words, if a different sample of data was selected from within the population, the data points (the $x_t$ and $y_t$) will be different, leading to different values of the OLS estimates. Recall that the OLS estimators ($\hat{\alpha}$ and $\hat{\beta}$) are given by (2.4) and (2.5). It would be desirable to have an idea of how 'good' these estimates of $\alpha$ and $\beta$ are in the sense of having some measure of the reliability or precision of the estimators ($\hat{\alpha}$ and $\hat{\beta}$). It is thus useful to know whether one can have confidence in the estimates, and whether they are likely to vary much from one sample to another sample within the given population. An idea of the sampling variability and hence of the precision of the estimates can be calculated using only the sample of data available. This estimate is given by its standard error. Given assumptions 1-4 above, valid estimators of the standard errors can be shown to be given by

$$SE(\hat{\alpha}) = s\sqrt{\frac{\sum x_t^2}{T\sum (x_t - \bar{x})^2}} = s\sqrt{\frac{\sum x_t^2}{T\left(\sum x_t^2 - T\bar{x}^2\right)}} \qquad (2.20)$$

$$SE(\hat{\beta}) = s\sqrt{\frac{1}{\sum (x_t - \bar{x})^2}} = s\sqrt{\frac{1}{\sum x_t^2 - T\bar{x}^2}} \qquad (2.21)$$

where $s$ is the estimated standard deviation of the residuals (see below). These formulae are derived in the appendix to this chapter. It is worth noting that the standard errors give only a general indication of the likely accuracy of the regression parameters. They do not show how accurate a particular set of coefficient estimates is. If the standard errors are small, it shows that the coefficients are likely to be precise on average, not how precise they are for this particular sample. Thus standard errors give a measure of the degree of uncertainty in the estimated values for the coefficients. It can be seen that they are a function of the actual observations on the explanatory variable, $x$, the sample size, $T$, and another term, $s$. The last of these is an estimate of the variance of the disturbance term. The actual variance of the disturbance term is usually denoted by $\sigma^2$. How can an estimate of $\sigma^2$ be obtained?

2.8.1 Estimating the variance of the error term ($\sigma^2$)

From elementary statistics, the variance of a random variable $u_t$ is given by

$$var(u_t) = E[(u_t) - E(u_t)]^2 \qquad (2.22)$$

Assumption 1 of the CLRM was that the expected or average value of the errors is zero. Under this assumption, (2.22) above reduces to

$$var(u_t) = E[u_t^2] \qquad (2.23)$$

So what is required is an estimate of the average value of $u_t^2$, which could be calculated as

$$s^2 = \frac{1}{T}\sum u_t^2 \qquad (2.24)$$

Unfortunately (2.24) is not workable since $u_t$ is a series of population disturbances, which is not observable. Thus the sample counterpart to $u_t$, which is $\hat{u}_t$, is used

$$s^2 = \frac{1}{T}\sum \hat{u}_t^2 \qquad (2.25)$$

But this estimator is a biased estimator of $\sigma^2$. An unbiased estimator of $\sigma^2$, $s^2$, would be given by the following equation instead of the previous one

$$s^2 = \frac{\sum \hat{u}_t^2}{T - 2} \qquad (2.26)$$

where $\sum \hat{u}_t^2$ is the residual sum of squares, so that the quantity of relevance for the standard error formulae is the square root of (2.26)

$$s = \sqrt{\frac{\sum \hat{u}_t^2}{T - 2}}$$

$s$ is also known as the standard error of the regression or the standard error of the estimate. It is sometimes used as a broad measure of the fit of the regression equation. Everything else being equal, the smaller this quantity is, the closer is the fit of the line to the actual data.
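As a hedged illustration of how the standard error of the regression and the two coefficient standard errors fit together, the sketch below computes $s$ from the residuals with the $T - 2$ divisor and then applies the formulae for $SE(\hat{\alpha})$ and $SE(\hat{\beta})$. The function name `ols_with_standard_errors` and the four-observation data set are hypothetical and chosen only so that the arithmetic is easy to verify by hand.

```python
import math


def ols_with_standard_errors(x, y):
    """Bivariate OLS estimates with s, SE(alpha_hat) and SE(beta_hat)."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    s_xx = sum((xt - x_bar) ** 2 for xt in x)
    beta_hat = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y)) / s_xx
    alpha_hat = y_bar - beta_hat * x_bar

    # Residuals are the observable counterparts of the disturbances.
    residuals = [yt - alpha_hat - beta_hat * xt for xt, yt in zip(x, y)]
    rss = sum(u ** 2 for u in residuals)

    # Unbiased error-variance estimate divides by T - 2, not T.
    s = math.sqrt(rss / (T - 2))
    se_alpha = s * math.sqrt(sum(xt ** 2 for xt in x) / (T * s_xx))
    se_beta = s * math.sqrt(1.0 / s_xx)
    return {"alpha": alpha_hat, "beta": beta_hat, "s": s,
            "se_alpha": se_alpha, "se_beta": se_beta}


# Hypothetical data: RSS = 1.8, so s^2 = 1.8 / 2 = 0.9.
result = ols_with_standard_errors([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
print(result)
```

Dividing by $T$ instead of $T - 2$ in the line that computes `s` would give the biased estimator described above.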

2.8.2 Some comments on the standard error estimators

It is possible, of course, to derive the formulae for the standard errors of the coefficient estimates from first principles using some algebra, and this is left to the appendix to this chapter. Some general intuition is now given as to why the formulae for the standard errors given by (2.20) and (2.21) contain the terms that they do and in the form that they do. The presentation offered in box 2.4 loosely follows that of Hill, Griffiths and Judge (1997), which is the clearest that this author has seen.

Box 2.4

(1) The larger the sample size, $T$, the smaller will be the coefficient standard errors. $T$ appears explicitly in $SE(\hat{\alpha})$ and implicitly in $SE(\hat{\beta})$. $T$ appears implicitly since the sum $\sum (x_t - \bar{x})^2$ is from $t = 1$ to $T$. The reason for this is simply that, at least for now, it is assumed that every observation on a series represents a piece of useful information which can be used to help determine the coefficient estimates. So the larger the size of the sample, the more information will have been used in estimation of the parameters, and hence the more confidence will be placed in those estimates.

(2) Both $SE(\hat{\alpha})$ and $SE(\hat{\beta})$ depend on $s^2$ (or $s$). Recall from above that $s^2$ is the estimate of the error variance. The larger this quantity is, the more dispersed are the residuals, and so the greater is the uncertainty in the model. If $s^2$ is large, the data points are collectively a long way away from the line.

(3) The sum of the squares of the $x_t$ about their mean appears in both formulae - since $\sum (x_t - \bar{x})^2$ appears in the denominators. The larger the sum of squares, the smaller the coefficient variances. Consider what happens if $\sum (x_t - \bar{x})^2$ is small or large, as shown in figures 2.7 and 2.8, respectively. In figure 2.7, the data are close together so that $\sum (x_t - \bar{x})^2$ is small. In this first case, it is more difficult to determine with any degree of certainty exactly where the line should be. On the other hand, in figure 2.8, the points are widely dispersed across a long section of the line, so that one could hold more confidence in the estimates in this case.

(4) The term $\sum x_t^2$ affects only the intercept standard error and not the slope standard error. The reason is that $\sum x_t^2$ measures how far the points are away from the $y$-axis. Consider figures 2.9 and 2.10. In figure 2.9, all of the points are bunched a long way from the $y$-axis, which makes it more difficult to accurately estimate the point at which the estimated line crosses the $y$-axis (the intercept). In figure 2.10, the points collectively are closer to the $y$-axis and hence it will be easier to determine where the line actually crosses the axis. Note that this intuition will work only in the case where all of the $x_t$ are positive!

[Figure 2.7: Effect on the standard errors of the coefficient estimates when $(x_t - \bar{x})$ are narrowly dispersed]
[Figure 2.8: Effect on the standard errors of the coefficient estimates when $(x_t - \bar{x})$ are widely dispersed]
[Figure 2.9: Effect on the standard errors of $x_t^2$ large]
[Figure 2.10: Effect on the standard errors of $x_t^2$ small]

Example 2.2

Assume that the following data have been calculated from a regression of $y$ on a single variable $x$ and a constant over 22 observations

$$\sum x_t y_t = 830102, \quad T = 22, \quad \bar{x} = 416.5, \quad \bar{y} = 86.65, \quad \sum x_t^2 = 3919654, \quad RSS = 130.6$$

Determine the appropriate values of the coefficient estimates and their standard errors. This question can simply be answered by plugging the appropriate numbers into the formulae given above. The calculations are

$$\hat{\beta} = \frac{830102 - 22 \times 416.5 \times 86.65}{3919654 - 22 \times (416.5)^2} = 0.35$$

$$\hat{\alpha} = 86.65 - 0.35 \times 416.5 = -59.12$$

The sample regression function would be written as

$$\hat{y}_t = \hat{\alpha} + \hat{\beta} x_t$$

$$\hat{y}_t = -59.12 + 0.35 x_t$$

Now, turning to the standard error calculations, it is necessary to obtain an estimate, $s$, of the error variance

$$SE(\text{regression}),\ s = \sqrt{\frac{\sum \hat{u}_t^2}{T - 2}} = \sqrt{\frac{130.6}{20}} = 2.55$$

$$SE(\hat{\alpha}) = 2.55 \times \sqrt{\frac{3919654}{22 \times \left(3919654 - 22 \times 416.5^2\right)}} = 3.35$$

$$SE(\hat{\beta}) = 2.55 \times \sqrt{\frac{1}{3919654 - 22 \times 416.5^2}} = 0.0079$$

With the standard errors calculated, the results are written as

$$\hat{y}_t = \underset{(3.35)}{-59.12} + \underset{(0.0079)}{0.35}\, x_t$$

The standard error estimates are usually placed in parentheses under the relevant coefficient estimates.

2.9 An introduction to statistical inference

Often, financial theory will suggest that certain coefficients should take on particular values, or values within a given range. It is thus of interest to determine whether the relationships expected from financial theory are upheld by the data to hand or not. Estimates of $\alpha$ and $\beta$ have been obtained from the sample, but these values are not of any particular interest; the population values that describe the true relationship between the variables would be of more interest, but are never available. Instead, inferences are made concerning the likely population values from the regression parameters that have been estimated from the sample of data to hand. In doing this, the aim is to determine whether the differences between the coefficient estimates that are actually obtained, and expectations arising from financial theory, are a long way from one another in a statistical sense.

Example 2.3

Suppose the following regression results have been calculated:

$$\hat{y}_t = \underset{(14.38)}{20.3} + \underset{(0.2561)}{0.5091}\, x_t$$

$\hat{\beta} = 0.5091$ is a single (point) estimate of the unknown population parameter, $\beta$. As stated above, the reliability of the point estimate is measured by the coefficient's standard error. The information from one or more of the sample coefficients and their standard errors can be used to make inferences about the population parameters. So the estimate of the slope coefficient is $\hat{\beta} = 0.5091$, but it is obvious that this number is likely to vary to some degree from one sample to the next. It might be of interest to answer the question, 'Is it plausible, given this estimate, that the true population parameter, $\beta$, could be 0.5? Is it plausible that $\beta$ could be 1?', etc. Answers to these questions can be obtained through hypothesis testing.
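The arithmetic of Example 2.2 can be checked with a few lines of Python. This is only a sketch using the summary statistics quoted in that example; the variable names are assumptions. Because the text rounds intermediate results ($\hat{\beta}$ to 0.35 and $s$ to 2.55) before using them, the unrounded calculation differs slightly in the last digit: the intercept comes out near -59.07 rather than -59.12, and the standard errors near 3.36 and 0.0080 rather than 3.35 and 0.0079.

```python
import math

# Summary statistics quoted in Example 2.2.
sum_xy = 830102.0   # sum of x_t * y_t
T = 22              # number of observations
x_bar = 416.5
y_bar = 86.65
sum_x2 = 3919654.0  # sum of x_t squared
rss = 130.6         # residual sum of squares

# Sum of squared deviations of x about its mean.
s_xx = sum_x2 - T * x_bar ** 2

beta_hat = (sum_xy - T * x_bar * y_bar) / s_xx
alpha_hat = y_bar - beta_hat * x_bar
# Intercept as in the text, which uses the slope rounded to 0.35.
alpha_hat_rounded_slope = y_bar - round(beta_hat, 2) * x_bar

s = math.sqrt(rss / (T - 2))
se_alpha = s * math.sqrt(sum_x2 / (T * s_xx))
se_beta = s * math.sqrt(1.0 / s_xx)

print("beta_hat :", beta_hat)
print("alpha_hat:", alpha_hat, "(text, rounded slope:", alpha_hat_rounded_slope, ")")
print("s        :", s)
print("SE(alpha):", se_alpha)
print("SE(beta) :", se_beta)
```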

2.9.1 Hypothesis testing: some concepts

In the hypothesis testing framework, there are always two hypotheses that go together, known as the null hypothesis (denoted $H_0$ or occasionally $H_N$) and the alternative hypothesis (denoted $H_1$ or occasionally $H_A$). The null hypothesis is the statement or the statistical hypothesis that is actually being tested. The alternative hypothesis represents the remaining outcomes of interest. For example, suppose that given the regression results above, it is of interest to test the hypothesis that the true value of $\beta$ is in fact 0.5. The following notation would be used.

$$H_0: \beta = 0.5$$
$$H_1: \beta \neq 0.5$$

This states that the hypothesis that the true but unknown value of $\beta$ could be 0.5 is being tested against an alternative hypothesis where $\beta$ is not 0.5. This would be known as a two-sided test, since the outcomes of both $\beta < 0.5$ and $\beta > 0.5$ are subsumed under the alternative hypothesis.

Sometimes, some prior information may be available, suggesting for example that $\beta > 0.5$ would be expected rather than $\beta < 0.5$. In this case, $\beta < 0.5$ is no longer of interest to us, and hence a one-sided test would be conducted:

$$H_0: \beta = 0.5$$
$$H_1: \beta > 0.5$$

Here the null hypothesis that the true value of $\beta$ is 0.5 is being tested against a one-sided alternative that $\beta$ is more than 0.5. On the other hand, one could envisage a situation where there is prior information that $\beta < 0.5$ is expected. For example, suppose that an investment bank bought a piece of new risk management software that is intended to better track the riskiness inherent in its traders' books and that $\beta$ is some measure of the risk that previously took the value 0.5. Clearly, it would not make sense to expect the risk to have risen, and so $\beta > 0.5$, corresponding to an increase in risk, is not of interest. In this case, the null and alternative hypotheses would be specified as

$$H_0: \beta = 0.5$$
$$H_1: \beta < 0.5$$

This prior information should come from the financial theory of the problem under consideration, and not from an examination of the estimated value of the coefficient. Note that there is always an equality under the null hypothesis. So, for example, $\beta < 0.5$ would not be specified under the null hypothesis.

There are two ways to conduct a hypothesis test: via the test of significance approach or via the confidence interval approach. Both methods centre on a statistical comparison of the estimated value of the coefficient, and its value under the null hypothesis. In very general terms, if the estimated value is a long way away from the hypothesised value, the null hypothesis is likely to be rejected; if the value under the null hypothesis and the estimated value are close to one another, the null hypothesis is less likely to be rejected. For example, consider $\hat{\beta} = 0.5091$ as above. A hypothesis that the true value of $\beta$ is 5 is more likely to be rejected than a null hypothesis that the true value of $\beta$ is 0.5. What is required now is a statistical decision rule that will permit the formal testing of such hypotheses.

2.9.2 The probability distribution of the least squares estimators

In order to test hypotheses, assumption 5 of the CLRM must be used, namely that $u_t \sim N(0, \sigma^2)$ - i.e. that the error term is normally distributed. The normal distribution is a convenient one to use for it involves only two parameters (its mean and variance). This makes the algebra involved in statistical inference considerably simpler than it otherwise would have been. Since $y_t$ depends partially on $u_t$, it can be stated that if $u_t$ is normally distributed, $y_t$ will also be normally distributed.

Further, since the least squares estimators are linear combinations of the random variables, i.e. $\hat{\beta} = \sum w_t y_t$, where $w_t$ are effectively weights, and since the weighted sum of normal random variables is also normally distributed, it can be said that the coefficient estimates will also be normally distributed. Thus

$$\hat{\alpha} \sim N(\alpha, var(\hat{\alpha})) \quad \text{and} \quad \hat{\beta} \sim N(\beta, var(\hat{\beta}))$$

Will the coefficient estimates still follow a normal distribution if the errors do not follow a normal distribution? Well, briefly, the answer is usually 'yes', provided that the other assumptions of the CLRM hold, and the sample size is sufficiently large. The issue of non-normality, how to test for it, and its consequences, will be further discussed in chapter 4.
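The claim that the slope estimate is approximately normal even when the errors are not can be illustrated by simulation. The sketch below is an illustration under stated assumptions, not a proof: the errors are drawn from a uniform distribution on (-1, 1), the regressor is fixed at 1, 2, ..., 50, and the true slope of 0.5 is hypothetical. If $\hat{\beta}$ is close to normal, roughly 95% of its standardised values should fall within $\pm 1.96$.

```python
import math
import random


def share_within_normal_band(T=50, n_reps=4000, beta=0.5, seed=1):
    """Share of standardised slope estimates inside +/-1.96 with uniform errors."""
    rng = random.Random(seed)
    x = [float(t) for t in range(1, T + 1)]
    x_bar = sum(x) / T
    s_xx = sum((xt - x_bar) ** 2 for xt in x)

    half_width = 1.0
    sigma = half_width / math.sqrt(3.0)  # standard deviation of U(-1, 1)
    sd_beta = sigma / math.sqrt(s_xx)    # true standard deviation of beta_hat

    inside = 0
    for _ in range(n_reps):
        y = [beta * xt + rng.uniform(-half_width, half_width) for xt in x]
        y_bar = sum(y) / T
        beta_hat = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y)) / s_xx
        z = (beta_hat - beta) / sd_beta  # standardise with the true variance
        if abs(z) <= 1.96:
            inside += 1
    return inside / n_reps


share = share_within_normal_band()
print("share within +/-1.96:", share)
```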

Standard normal variables can be constructed from $\hat{\alpha}$ and $\hat{\beta}$ by subtracting the mean and dividing by the square root of the variance

$$\frac{\hat{\alpha} - \alpha}{\sqrt{var(\hat{\alpha})}} \sim N(0, 1) \quad \text{and} \quad \frac{\hat{\beta} - \beta}{\sqrt{var(\hat{\beta})}} \sim N(0, 1)$$

The square roots of the coefficient variances are the standard errors. Unfortunately, the standard errors of the true coefficient values under the PRF are never known - all that is available are their sample counterparts, the calculated standard errors of the coefficient estimates, $SE(\hat{\alpha})$ and $SE(\hat{\beta})$.⁴

Replacing the true values of the standard errors with the sample estimated versions induces another source of uncertainty, and also means that the standardised statistics follow a $t$-distribution with $T - 2$ degrees of freedom (defined below) rather than a normal distribution. So

$$\frac{\hat{\alpha} - \alpha}{SE(\hat{\alpha})} \sim t_{T-2} \quad \text{and} \quad \frac{\hat{\beta} - \beta}{SE(\hat{\beta})} \sim t_{T-2}$$

This result is not formally proved here. For a formal proof, see Hill, Griffiths and Judge (1997, pp. 88-90).

[Figure 2.11: The normal distribution]

Table 2.2 Critical values from the standard normal versus $t$-distribution

  Significance level (%)   N(0,1)   t40    t4
  50%                      0        0      0
  5%                       1.64     1.68   2.13
  2.5%                     1.96     2.02   2.78
  0.5%                     2.57     2.70   4.60

[Figure 2.12: The $t$-distribution versus the normal]

2.9.3 A note on the $t$ and the normal distributions

The normal distribution, shown in figure 2.11, should be familiar to readers. Note its characteristic 'bell' shape and its symmetry around the mean (of zero for a standard normal distribution).

⁴ Strictly, these are the estimated standard errors conditional on the parameter estimates, and so should be denoted $S\hat{E}(\hat{\alpha})$ and $S\hat{E}(\hat{\beta})$, but the additional layer of hats will be omitted here since the meaning should be obvious from the context.

A normal variate can be scaled to have zero mean and unit variance by subtracting its mean and dividing by its standard deviation. There is a specific relationship between the $t$- and the standard normal distribution, and the $t$-distribution has another parameter, its degrees of freedom.

What does the $t$-distribution look like? It looks similar to a normal distribution, but with fatter tails, and a smaller peak at the mean, as shown in figure 2.12. Some examples of the percentiles from the normal and $t$-distributions taken from the statistical tables are given in table 2.2. When used in the context of a hypothesis test, these percentiles become critical values. The values presented in table 2.2 would be those critical values appropriate for a one-sided test of the given significance level.

It can be seen that as the number of degrees of freedom for the $t$-distribution increases from 4 to 40, the critical values fall substantially. In figure 2.12, this is represented by a gradual increase in the height of the distribution at the centre and a reduction in the fatness of the tails as the number of degrees of freedom increases. In the limit, a $t$-distribution with an infinite number of degrees of freedom is a standard normal, i.e. $t_\infty = N(0, 1)$, so the normal distribution can be viewed as a special case of the $t$.

Putting the limit case, $t_\infty$, aside, the critical values for the $t$-distribution are larger in absolute value than those from the standard normal. This arises from the increased uncertainty associated with the situation where the error variance must be estimated. So now the $t$-distribution is used, and for a given statistic to constitute the same amount of reliable evidence against the null, it has to be bigger in absolute value than in circumstances where the normal is applicable.

There are broadly two approaches to testing hypotheses under regression analysis: the test of significance approach and the confidence interval approach. Each of these will now be considered in turn.

[Figure 2.13: Rejection regions for a two-sided 5% hypothesis test, with a 2.5% rejection region in each tail and a 95% non-rejection region between them]
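One-sided critical values of the kind tabulated for the standard normal and the $t$-distribution can be reproduced numerically. The sketch below is one possible approach using only the Python standard library, and it is an assumption of this illustration rather than the method behind the statistical tables: the $t$ density is integrated with Simpson's rule and the critical value is then located by bisection. The helper names `t_pdf`, `t_cdf` and `t_critical` are hypothetical.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, n=1000):
    """CDF for x > 0: 0.5 plus Simpson's rule on [0, x] (n must be even)."""
    h = x / n
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(df, tail_prob):
    """Value c such that P(t > c) = tail_prob, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < 1 - tail_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for tail in (0.05, 0.025, 0.005):
    z = NormalDist().inv_cdf(1 - tail)
    print(f"{tail:>6}: N(0,1) {z:.3f}   t40 {t_critical(40, tail):.3f}   "
          f"t4 {t_critical(4, tail):.3f}")
```

For every tail probability the ordering is the same: the critical value for 4 degrees of freedom exceeds that for 40, which in turn exceeds the standard normal value.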

2.9.4 The test of significance approach

Assume the regression equation is given by $y_t = \alpha + \beta x_t + u_t$, $t = 1, 2, \ldots, T$. The steps involved in doing a test of significance are shown in box 2.5.

Box 2.5

(1) Estimate $\hat{\alpha}$, $\hat{\beta}$ and $SE(\hat{\alpha})$, $SE(\hat{\beta})$ in the usual way.

(2) Calculate the test statistic. This is given by the formula

$$\text{test statistic} = \frac{\hat{\beta} - \beta^*}{SE(\hat{\beta})}$$

where $\beta^*$ is the value of $\beta$ under the null hypothesis. The null hypothesis is $H_0: \beta = \beta^*$ and the alternative hypothesis is $H_1: \beta \neq \beta^*$ (for a two-sided test).

(3) A tabulated distribution with which to compare the estimated test statistics is required. Test statistics derived in this way can be shown to follow a $t$-distribution with $T - 2$ degrees of freedom.

(4) Choose a 'significance level', often denoted $\alpha$ (not the same as the regression intercept coefficient). It is conventional to use a significance level of 5%.

(5) Given a significance level, a rejection region and non-rejection region can be determined. If a 5% significance level is employed, this means that 5% of the total distribution (5% of the area under the curve) will be in the rejection region. That rejection region can either be split in half (for a two-sided test) or it can all fall on one side of the $y$-axis, as is the case for a one-sided test. For a two-sided test, the 5% rejection region is split equally between the two tails, as shown in figure 2.13. For a one-sided test, the 5% rejection region is located solely in one tail of the distribution, as shown in figures 2.14 and 2.15, for a test where the alternative is of the 'less than' form, and where the alternative is of the 'greater than' form, respectively.

(6) Use the $t$-tables to obtain a critical value or values with which to compare the test statistic. The critical value will be that value of $x$ that puts 5% into the rejection region.

(7) Finally perform the test. If the test statistic lies in the rejection region then reject the null hypothesis ($H_0$), else do not reject $H_0$.

[Figure 2.14: Rejection region for a one-sided hypothesis test of the form $H_0: \beta = \beta^*$, $H_1: \beta < \beta^*$, with a 5% rejection region in the lower tail and a 95% non-rejection region]
[Figure 2.15: Rejection region for a one-sided hypothesis test of the form $H_0: \beta = \beta^*$, $H_1: \beta > \beta^*$, with a 5% rejection region in the upper tail and a 95% non-rejection region]
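The steps of a two-sided test of significance can be sketched in a few lines of Python. The function name `t_test_of_significance` is hypothetical, and the critical value is passed in as an argument, as if it had been read from the $t$-tables. The illustration uses the slope estimate 0.5091 and standard error 0.2561 from Example 2.3; the critical value 2.086 is the 5% two-sided value for 20 degrees of freedom, which assumes that the regression was estimated on 22 observations.

```python
def t_test_of_significance(beta_hat, se_beta, beta_null, t_crit):
    """Two-sided test of H0: beta = beta_null against H1: beta != beta_null.

    Returns the test statistic and True if H0 is rejected.
    """
    test_stat = (beta_hat - beta_null) / se_beta
    # The rejection region is both tails beyond +/- t_crit.
    reject = abs(test_stat) > t_crit
    return test_stat, reject


# Slope estimate and standard error from Example 2.3; 2.086 is the assumed
# 5% two-sided critical value for T - 2 = 20 degrees of freedom.
stat_one, reject_one = t_test_of_significance(0.5091, 0.2561, 1.0, 2.086)
stat_half, reject_half = t_test_of_significance(0.5091, 0.2561, 0.5, 2.086)

print("H0: beta = 1  ", round(stat_one, 3), "reject" if reject_one else "do not reject")
print("H0: beta = 0.5", round(stat_half, 3), "reject" if reject_half else "do not reject")
```

For a one-sided test, only the comparison changes: the statistic is checked against a single critical value in the relevant tail rather than against both tails.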

Steps 2-7 require further comment. In step 2, the estimated value of $\beta$ is compared with the value that is subject to test under the null hypothesis, but this difference is 'normalised' or scaled by the standard error of the coefficient estimate. The standard error is a measure of how confident one is in the coefficient estimate obtained in the first stage. If a standard error is small, the value of the test statistic will be large relative to the case where the standard error is large. For a small standard error, it would not require the estimated and hypothesised values to be far away from one another for the null hypothesis to be rejected. Dividing by the standard error also ensures that, under the five CLRM assumptions, the test statistic follows a tabulated distribution.

In this context, the number of degrees of freedom can be interpreted as the number of pieces of additional information beyond the minimum requirement. If two parameters are estimated ($\alpha$ and $\beta$ - the intercept and the slope of the line, respectively), a minimum of two observations is required to fit this line to the data. As the number of degrees of freedom increases, the critical values in the tables decrease in absolute terms, since less caution is required and one can be more confident that the results are appropriate.

The significance level is also sometimes called the size of the test (note that this is completely different from the size of the sample) and it determines the region where the null hypothesis under test will be rejected or not rejected. Remember that the distributions in figures 2.13-2.15 are for a random variable. Purely by chance, a random variable will take on extreme values (either large and positive values or large and negative values) occasionally. More specifically, a significance level of 5% means that a result as extreme as this or more extreme would be expected only 5% of the time as a consequence of chance alone. To give one illustration, if the 5% critical value for a one-sided test is 1.68, this implies that the test statistic would be expected to be greater than this only 5% of the time by chance alone. There is nothing magical about the test - all that is done is to specify an arbitrary cutoff value for the test statistic that determines whether the null hypothesis would be rejected or not. It is conventional to use a 5% size of test, but 10% and 1% are also commonly used.

However, one potential problem with the use of a fixed (e.g. 5%) size of test is that if the sample size is sufficiently large, any null hypothesis can be rejected. This is particularly worrisome in finance, where tens of thousands of observations or more are often available. What happens is that the standard errors reduce as the sample size increases, thus leading to an increase in the value of all $t$-test statistics. This problem is frequently overlooked in empirical work, but some econometricians have suggested that a lower size of test (e.g. 1%) should be used for large samples (see, for example, Leamer, 1978, for a discussion of these issues).

Note also the use of terminology in connection with hypothesis tests: it is said that the null hypothesis is either rejected or not rejected. It is incorrect to state that if the null hypothesis is not rejected, it is 'accepted' (although this error is frequently made in practice), and it is never said that the alternative hypothesis is accepted or rejected. One reason why it is not sensible to say that the null hypothesis is 'accepted' is that it is impossible to know whether the null is actually true or not! In any given situation, many null hypotheses will not be rejected. For example, suppose that $H_0: \beta = 0.5$ and $H_0: \beta = 1$ are separately tested against the relevant two-sided alternatives and neither null is rejected. Clearly then it would not make sense to say that '$H_0: \beta = 0.5$ is accepted' and '$H_0: \beta = 1$ is accepted', since the true (but unknown) value of $\beta$ cannot be both 0.5 and 1. So, to summarise, the null hypothesis is either rejected or not rejected on the basis of the available evidence.

2.9.5 The confidence interval approach to hypothesis testing (box 2.6)

To give an example of its usage, one might estimate a parameter, say $\hat{\beta}$, to be 0.93, and a '95% confidence interval' to be (0.77, 1.09). This means that in many repeated samples, 95% of the time, the true value of $\beta$ will be contained within this interval. Confidence intervals are almost invariably estimated in a two-sided form, although in theory a one-sided interval can be constructed. Constructing a 95% confidence interval is equivalent to using the 5% level in a test of significance.
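A two-sided confidence interval and its link to the test of significance can be sketched as follows. The function names are hypothetical, and the inputs are assumptions carried over for illustration: the slope estimate 0.5091 and standard error 0.2561 from Example 2.3, with 2.086 taken as the 5% two-sided critical value for 20 degrees of freedom. The loop checks, for a handful of hypothesised values, that rejecting when the value lies outside the interval gives the same decision as rejecting when the test statistic lies beyond the critical value.

```python
def confidence_interval(beta_hat, se_beta, t_crit):
    """Two-sided confidence interval beta_hat +/- t_crit * SE(beta_hat)."""
    half_width = t_crit * se_beta
    return beta_hat - half_width, beta_hat + half_width


def reject_by_interval(beta_null, interval):
    """Reject H0 if the hypothesised value lies outside the interval."""
    lower, upper = interval
    return beta_null < lower or beta_null > upper


def reject_by_test(beta_hat, se_beta, beta_null, t_crit):
    """Reject H0 if the test statistic lies beyond +/- t_crit."""
    return abs((beta_hat - beta_null) / se_beta) > t_crit


beta_hat, se_beta, t_crit = 0.5091, 0.2561, 2.086  # assumed inputs, see above
interval = confidence_interval(beta_hat, se_beta, t_crit)
print("95% interval:", tuple(round(b, 4) for b in interval))

# The two approaches should agree for every hypothesised value tried.
candidates = [-1.0, 0.0, 0.5, 1.0, 1.5, 5.0]
decisions = [(b, reject_by_interval(b, interval),
              reject_by_test(beta_hat, se_beta, b, t_crit)) for b in candidates]
for b, by_interval, by_test in decisions:
    print(b, "reject" if by_interval else "do not reject", by_interval == by_test)
```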
2.9.6 The test of significance and confidence interval approaches always give the same conclusion

Under the test of significance approach, the null hypothesis that $\beta = \beta^*$ will not be rejected if the test statistic lies within the non-rejection region, i.e. if the following condition holds

$$-t_{crit} \le \frac{\hat\beta - \beta^*}{SE(\hat\beta)} \le +t_{crit}$$


Box 2.6

(1) Calculate $\hat\alpha$, $\hat\beta$ and $SE(\hat\alpha)$, $SE(\hat\beta)$ as before.
(2) Choose a significance level, $\alpha$ (again the convention is 5%). This is equivalent to choosing a $(1 - \alpha) \times 100\%$ confidence interval, i.e. 5% significance level = 95% confidence interval.
(3) Use the t-tables to find the appropriate critical value, which will again have $T - 2$ degrees of freedom.
(4) The confidence interval for $\beta$ is given by

$$\left(\hat\beta - t_{crit} \cdot SE(\hat\beta),\ \hat\beta + t_{crit} \cdot SE(\hat\beta)\right)$$

Note that a centre dot (·) is sometimes used instead of a cross (×) to denote when two quantities are multiplied together.
(5) Perform the test: if the hypothesised value of $\beta$ (i.e. $\beta^*$) lies outside the confidence interval, then reject the null hypothesis that $\beta = \beta^*$, otherwise do not reject the null.

Box 2.7

Test of significance approach

$$test\ stat = \frac{\hat\beta - \beta^*}{SE(\hat\beta)} = \frac{0.5091 - 1}{0.2561} = -1.917$$

Find $t_{crit} = t_{20;5\%} = \pm 2.086$
Do not reject $H_0$ since test statistic lies within non-rejection region

Confidence interval approach

Find $t_{crit} = t_{20;5\%} = \pm 2.086$

$$\hat\beta \pm t_{crit} \cdot SE(\hat\beta) = 0.5091 \pm 2.086 \times 0.2561 = (-0.0251,\ 1.0433)$$

Do not reject $H_0$ since 1 lies within the confidence interval
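The two procedures can be placed side by side in a few lines of code. The following is a minimal sketch, not part of the original text: the function and class names are hypothetical, and the inputs are the slope estimate (0.5091), its standard error (0.2561) and the 5% two-sided critical value for 20 degrees of freedom (2.086) used in the boxes.

```python
from dataclasses import dataclass


@dataclass
class TestResult:
    t_stat: float
    ci: tuple
    reject_by_test_of_significance: bool
    reject_by_confidence_interval: bool


def compare_approaches(beta_hat, se, beta_star, t_crit):
    """Apply both testing approaches to H0: beta = beta_star (two-sided)."""
    # Test of significance: compare the test statistic with the critical value.
    t_stat = (beta_hat - beta_star) / se
    reject_by_test = abs(t_stat) > t_crit

    # Confidence interval: check whether beta_star falls outside the interval.
    lower = beta_hat - t_crit * se
    upper = beta_hat + t_crit * se
    reject_by_ci = not (lower <= beta_star <= upper)

    return TestResult(t_stat, (lower, upper), reject_by_test, reject_by_ci)


# Values taken from the boxes above; hypothesised value beta* = 1.
result = compare_approaches(beta_hat=0.5091, se=0.2561, beta_star=1.0, t_crit=2.086)
print(round(result.t_stat, 3), [round(v, 4) for v in result.ci])

# The same interval answers any other hypothesised value without recomputation.
agreement = {
    b: compare_approaches(0.5091, 0.2561, b, 2.086) for b in (0.0, 1.0, 2.0)
}
```

For every hypothesised value the two boolean fields coincide, which is what the algebra implies: one rule is a rearrangement of the other.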

Rearranging, the null hypothesis would not be rejected if

$$-t_{crit} \cdot SE(\hat\beta) \le \hat\beta - \beta^* \le +t_{crit} \cdot SE(\hat\beta)$$

i.e. one would not reject if

$$\hat\beta - t_{crit} \cdot SE(\hat\beta) \le \beta^* \le \hat\beta + t_{crit} \cdot SE(\hat\beta)$$

But this is just the rule for non-rejection under the confidence interval approach. So it will always be the case that, for a given significance level, the test of significance and confidence interval approaches will provide the same conclusion by construction. One testing approach is simply an algebraic rearrangement of the other.

Example 2.4

Given the regression results above

$$\hat{y}_t = \underset{(14.38)}{20.3} + \underset{(0.2561)}{0.5091}\,x_t, \qquad T = 22$$

using both the test of significance and confidence interval approaches, test the hypothesis that $\beta = 1$ against a two-sided alternative. This hypothesis might be of interest, for a unit coefficient on the explanatory variable implies a 1:1 relationship between movements in $x$ and movements in $y$. The null and alternative hypotheses are respectively:

$$H_0: \beta = 1$$
$$H_1: \beta \neq 1$$

The results of the test according to each approach are shown in box 2.7. A couple of comments are in order. First, the critical value from the t-distribution that is required is for 20 degrees of freedom and at the 5% level. This means that 5% of the total distribution will be in the rejection region, and since this is a two-sided test, 2.5% of the distribution is required to be contained in each tail. From the symmetry of the t-distribution around zero, the critical values in the upper and lower tail will be equal in magnitude, but opposite in sign, as shown in figure 2.16.

[Figure 2.16: Critical values and rejection regions for a $t_{20;5\%}$. The density $f(x)$ is shown with a 2.5% rejection region below -2.086, a 95% non-rejection region, and a 2.5% rejection region above +2.086.]

What if instead the researcher wanted to test $H_0: \beta = 0$ or $H_0: \beta = 2$? In order to test these hypotheses using the test of significance approach, the test statistic would have to be reconstructed in each case, although the critical value would be the same. On the other hand, no additional work would be required if the confidence interval approach had been adopted, since it effectively permits the testing of an infinite number of hypotheses. So for example, suppose that the researcher wanted to test
$$H_0: \beta = 0 \quad \text{versus} \quad H_1: \beta \neq 0$$

and

$$H_0: \beta = 2 \quad \text{versus} \quad H_1: \beta \neq 2$$

In the first case, the null hypothesis (that $\beta = 0$) would not be rejected since 0 lies within the 95% confidence interval. By the same argument, the second null hypothesis (that $\beta = 2$) would be rejected since 2 lies outside the estimated confidence interval.

On the other hand, note that this book has so far considered only the results under a 5% size of test. In marginal cases (e.g. $H_0: \beta = 1$, where the test statistic and critical value are close together), a completely different answer may arise if a different size of test was used. This is where the test of significance approach is preferable to the construction of a confidence interval. For example, suppose that now a 10% size of test is used for the null hypothesis given in example 2.4. Using the test of significance approach,

$$test\ statistic = \frac{\hat\beta - \beta^*}{SE(\hat\beta)} = \frac{0.5091 - 1}{0.2561} = -1.917$$

as above. The only thing that changes is the critical t-value. At the 10% level (so that 5% of the total distribution is placed in each of the tails for this two-sided test), the required critical value is $t_{20;10\%} = \pm 1.725$. So now, as the test statistic lies in the rejection region, $H_0$ would be rejected. In order to use a 10% test under the confidence interval approach, the interval itself would have to have been re-estimated since the critical value is embedded in the calculation of the confidence interval.

So the test of significance and confidence interval approaches both have their relative merits. The testing of a number of different hypotheses is easier under the confidence interval approach, while a consideration of the effect of the size of the test on the conclusion is easier to address under the test of significance approach.

Caution should therefore be used when placing emphasis on or making decisions in the context of marginal cases (i.e. in cases where the null is only just rejected or not rejected). In this situation, the appropriate conclusion to draw is that the results are marginal and that no strong inference can be made one way or the other. A thorough empirical analysis should involve conducting a sensitivity analysis on the results to determine whether using a different size of test alters the conclusions. It is worth stating again that it is conventional to consider sizes of test of 10%, 5% and 1%. If the conclusion (i.e. 'reject' or 'do not reject') is robust to changes in the size of the test, then one can be more confident that the conclusions are appropriate. If the outcome of the test is qualitatively altered when the size of the test is modified, the conclusion must be that there is no conclusion one way or the other!

It is also worth noting that if a given null hypothesis is rejected using a 1% significance level, it will also automatically be rejected at the 5% level, so that there is no need to actually state the latter. Dougherty (1992, p. 100) gives the analogy of a high jumper. If the high jumper can clear 2 metres, it is obvious that the jumper could also clear 1.5 metres. The 1% significance level is a higher hurdle than the 5% significance level. Similarly, if the null is not rejected at the 5% level of significance, it will automatically not be rejected at any stronger level of significance (e.g. 1%). In this case, if the jumper cannot clear 1.5 metres, there is no way s/he will be able to clear 2 metres.

2.9.7 Some more terminology

If the null hypothesis is rejected at the 5% level, it would be said that the result of the test is 'statistically significant'. If the null hypothesis is not rejected, it would be said that the result of the test is 'not significant', or that it is 'insignificant'. Finally, if the null hypothesis is rejected at the 1% level, the result is termed 'highly statistically significant'.

Note that a statistically significant result may be of no practical significance. For example, if the estimated beta for a stock under a CAPM regression is 1.05, and a null hypothesis that $\beta = 1$ is rejected, the result will be statistically significant. But it may be the case that a slightly higher beta will make no difference to an investor's choice as to whether to buy the stock or not. In that case, one would say that the result of the test was statistically significant but financially or practically insignificant.
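The terminology above, together with the earlier advice to check whether a conclusion survives a change in the size of the test, can be sketched in code. This is an illustrative sketch only: the function names are hypothetical, the 10% and 5% critical values for 20 degrees of freedom are those quoted in the text, and the 1% value (2.845) is an assumption taken from standard t-tables rather than from this chapter.

```python
# Two-sided critical values for a t-distribution with 20 degrees of freedom.
# 1.725 and 2.086 are quoted in the text; 2.845 is assumed from standard tables.
T20_CRITICAL = {0.10: 1.725, 0.05: 2.086, 0.01: 2.845}


def decisions_by_size(t_stat, critical_values):
    """Map each size of test to True (reject H0) or False (do not reject)."""
    return {size: abs(t_stat) > crit for size, crit in critical_values.items()}


def describe(decisions):
    """Label the result using the 5% and 1% conventions described above."""
    if decisions[0.01]:
        return "highly statistically significant"
    if decisions[0.05]:
        return "statistically significant"
    return "not significant"


def is_robust(decisions):
    """A conclusion is robust if it is the same at every size considered."""
    return len(set(decisions.values())) == 1


# Test statistic for H0: beta = 1 from example 2.4.
t_stat = (0.5091 - 1) / 0.2561
decisions = decisions_by_size(t_stat, T20_CRITICAL)
label = describe(decisions)
robust = is_robust(decisions)
print(decisions, label, robust)
```

Here the null is rejected at the 10% size but not at 5% or 1%, so the outcome is not robust and the case would be treated as marginal.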


2.9.8 Classifying the errors that can be made using hypothesis tests

$H_0$ is usually rejected if the test statistic is statistically significant at a chosen significance level. There are two possible errors that could be made:

(1) Rejecting $H_0$ when it was really true; this is called a type I error.
(2) Not rejecting $H_0$ when it was in fact false; this is called a type II error.

The possible scenarios can be summarised in table 2.3.

Table 2.3 Classifying hypothesis testing errors and correct conclusions

                                                   Reality
Result of test                         $H_0$ is true              $H_0$ is false
Significant (reject $H_0$)             Type I error = $\alpha$    ✓
Insignificant (do not reject $H_0$)    ✓                          Type II error = $\beta$

The probability of a type I error is just $\alpha$, the significance level or size of test chosen. To see this, recall what is meant by 'significance' at the 5% level: it is only 5% likely that a result as or more extreme as this could have occurred purely by chance. Or, to put this another way, it is only 5% likely that this null would be rejected when it was in fact true.

Note that there is no chance for a free lunch (i.e. a costless gain) here! What happens if the size of the test is reduced (e.g. from a 5% test to a 1% test)? The chances of making a type I error would be reduced... but so would the probability that the null hypothesis would be rejected at all, so increasing the probability of a type II error. The two competing effects of reducing the size of the test can be shown in box 2.8.

So there always exists, therefore, a direct trade-off between type I and type II errors when choosing a significance level. The only way to reduce the chances of both is to increase the sample size or to select a sample with more variation, thus increasing the amount of information upon which the results of the hypothesis test are based. In practice, up to a certain level, type I errors are usually considered more serious and hence a small size of test is usually chosen (5% or 1% are the most common).

The probability of a type I error is the probability of incorrectly rejecting a correct null hypothesis, which is also the size of the test. Another important piece of terminology in this area is the power of a test. The power of a test is defined as the probability of (appropriately) rejecting an incorrect null hypothesis. The power of the test is also equal to one minus the probability of a type II error.

An optimal test would be one with an actual test size that matched the nominal size and which had as high a power as possible. Such a test would imply, for example, that using a 5% significance level would result in the null being rejected exactly 5% of the time by chance alone, and that an incorrect null hypothesis would be rejected close to 100% of the time.
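The trade-off between size and power can be illustrated with a small simulation. The sketch below is an assumption-laden illustration rather than anything from the original text: it uses a two-sided z-test on the mean of normally distributed data with known unit variance (not the regression t-test of this chapter), the function name and all parameter values are hypothetical, and 1.96 and 2.576 are the standard normal 5% and 1% two-sided critical values.

```python
import random


def rejection_rate(true_mean, z_crit, n_obs=50, n_trials=4000, seed=0):
    """Share of simulated samples in which H0: mean = 0 is rejected.

    Each sample has n_obs draws from N(true_mean, 1); the variance is
    treated as known, so the test statistic is a z-statistic.
    """
    rng = random.Random(seed)
    standard_error = 1.0 / n_obs ** 0.5
    rejections = 0
    for _ in range(n_trials):
        sample_mean = sum(rng.gauss(true_mean, 1.0) for _ in range(n_obs)) / n_obs
        if abs(sample_mean / standard_error) > z_crit:
            rejections += 1
    return rejections / n_trials


# When H0 is true, the rejection rate estimates the size (type I error rate).
size_5pct = rejection_rate(true_mean=0.0, z_crit=1.96)
size_1pct = rejection_rate(true_mean=0.0, z_crit=2.576)

# When H0 is false, the rejection rate estimates the power (1 - P(type II error)).
power_5pct = rejection_rate(true_mean=0.3, z_crit=1.96)
power_1pct = rejection_rate(true_mean=0.3, z_crit=2.576)

print(size_5pct, size_1pct, power_5pct, power_1pct)
```

Moving from the 5% to the 1% critical value lowers the rejection rate under a true null, but it also lowers the rejection rate under a false null, which is the trade-off described above.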

Box 2.8

Reduce size of test (e.g. 5% to 1%) → more strict criterion for rejection → reject null hypothesis less often, and then:
less likely to falsely reject → lower chance of type I error
more likely to incorrectly not reject → higher chance of type II error

2.10 A special type of hypothesis test: the t-ratio

Recall that the formula under a test of significance approach to hypothesis testing using a t-test for the slope parameter was

$$test\ statistic = \frac{\hat\beta - \beta^*}{SE(\hat\beta)}$$

with the obvious adjustments to test a hypothesis about the intercept. If the test is

$$H_0: \beta = 0$$
$$H_1: \beta \neq 0$$

i.e. a test that the population parameter is zero against a two-sided alternative, this is known as a t-ratio test. Since $\beta^* = 0$, the expression above collapses to

$$test\ statistic = \frac{\hat\beta}{SE(\hat\beta)}$$

Thus the ratio of the coefficient to its standard error, given by this expression, is known as the t-ratio or t-statistic.
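As a brief illustration, the t-ratio is a one-line calculation. The sketch below uses hypothetical function names and, as inputs, the slope estimate and standard error from example 2.4 (0.5091 and 0.2561) together with the 5% two-sided critical value for 20 degrees of freedom (2.086) quoted earlier.

```python
def t_ratio(coefficient, standard_error):
    """t-ratio for H0: parameter = 0, i.e. the coefficient over its standard error."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return coefficient / standard_error


def is_significant(t_stat, t_crit):
    """Two-sided rule: reject H0 if the t-ratio exceeds the critical value in absolute value."""
    return abs(t_stat) > t_crit


slope_t = t_ratio(0.5091, 0.2561)
slope_significant = is_significant(slope_t, 2.086)
print(round(slope_t, 4), slope_significant)
```

The slope's t-ratio is just below the critical value, so the null of a zero slope is not rejected at the 5% level, in line with zero lying inside the 95% confidence interval computed earlier.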


Example 2.5

Suppose that we have calculated the estimates for the intercept and the slope (1.10 and -19.88 respectively) and their corresponding standard errors (1.35 and 1.98 respectively). The t-ratios associated with each of the intercept and slope coefficients would be given by

               $\hat\alpha$    $\hat\beta$
Coefficient    1.10            -19.88
SE             1.35            1.98
t-ratio        0.81            -10.04

Note that if a coefficient is negative, its t-ratio will also be negative. In order to test (separately) the null hypotheses that $\alpha = 0$ and $\beta = 0$, the test statistics would be compared with the appropriate critical value from a t-distribution. In this case, the number of degrees of freedom, given by $T - k$, is equal to 15 - 3 = 12. The 5% critical value for this two-sided test (remember, 2.5% in each tail for a 5% test) is 2.179, while the 1% two-sided critical value (0.5% in each tail) is 3.055. Given these t-ratios and critical values, would the following null hypotheses be rejected?

$H_0: \alpha = 0$? (No)
$H_0: \beta = 0$? (Yes)

If $H_0$ is rejected, it would be said that the test statistic is significant. If the variable is not 'significant', it means that while the estimated value of the coefficient is not exactly zero (e.g. 1.10 in the example above), the coefficient is indistinguishable statistically from zero. If a zero were placed in the fitted equation instead of the estimated value, this would mean that whatever happened to the value of that explanatory variable, the dependent variable would be unaffected. This would then be taken to mean that the variable is not helping to explain variations in $y$, and that it could therefore be removed from the regression equation. For example, if the t-ratio associated with $x$ had been -1.04 rather than -10.04 (assuming that the standard error stayed the same), the variable would be classed as insignificant (i.e. not statistically different from zero). The only insignificant term in the above regression is the intercept. There are good statistical reasons for always retaining the constant, even if it is not significant: see chapter 4.

It is worth noting that, for degrees of freedom greater than around 25, the 5% two-sided critical value is approximately ±2. So, as a rule of thumb (i.e. a rough guide), the null hypothesis would be rejected if the t-statistic exceeds 2 in absolute value.

Some authors place the t-ratios in parentheses below the corresponding coefficient estimates rather than the standard errors. One thus needs to check which convention is being used in each particular application, and also to state this clearly when presenting estimation results.

There will now follow two finance case studies that involve only the estimation of bivariate linear regression models and the construction and interpretation of t-ratios.

2.11 An example of the use of a simple t-test to test a theory in finance: can US mutual funds beat the market?

Jensen (1968) was the first to systematically test the performance of mutual funds, and in particular examine whether any 'beat the market'. He used a sample of annual returns on the portfolios of 115 mutual funds from 1945-64. Each of the 115 funds was subjected to a separate OLS time series regression of the form

$$R_{jt} - R_{ft} = \alpha_j + \beta_j (R_{mt} - R_{ft}) + u_{jt} \qquad (2.52)$$

where $R_{jt}$ is the return on portfolio $j$ at time $t$, $R_{ft}$ is the return on a risk-free proxy (a 1-year government bond), $R_{mt}$ is the return on a market portfolio proxy, $u_{jt}$ is an error term, and $\alpha_j$, $\beta_j$ are parameters to be estimated. The quantity of interest is the significance of $\alpha_j$, since this parameter defines whether the fund outperforms or underperforms the market index. Thus the null hypothesis is given by: $H_0: \alpha_j = 0$. A positive and significant $\alpha_j$ for a given fund would suggest that the fund is able to earn significant abnormal returns in excess of the market-required return for a fund of this given riskiness. This coefficient has become known as 'Jensen's alpha'. Some summary statistics across the 115 funds for the estimated regression results for (2.52) are given in table 2.4.

Table 2.4 Summary statistics for the estimated regression results for (2.52)

                                                 Extremal values
Item            Mean value    Median value    Minimum    Maximum
$\hat\alpha$    -0.011        -0.009          -0.080     0.058
$\hat\beta$     0.840         0.848           0.219      1.405
Sample size     17            19              10         20

Source: Jensen (1968). Reprinted with the permission of Blackwell Publishers.
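A regression of the form (2.52) and the t-ratio on its intercept can be sketched with the standard bivariate OLS formulae. The code below is a minimal sketch under stated assumptions: the function name is hypothetical, the data are made up purely for illustration (they are not Jensen's), and the residual variance is estimated with $T - 2$ degrees of freedom.

```python
import math


def ols_bivariate(x, y):
    """OLS of y on a constant and x: estimates, standard errors and t-ratios."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    beta = s_xy / s_xx
    alpha = y_bar - beta * x_bar

    # Residual variance with T - 2 degrees of freedom.
    residuals = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    s2 = sum(u * u for u in residuals) / (n - 2)

    se_beta = math.sqrt(s2 / s_xx)
    se_alpha = math.sqrt(s2 * sum(xi * xi for xi in x) / (n * s_xx))
    return {
        "alpha": alpha, "beta": beta,
        "se_alpha": se_alpha, "se_beta": se_beta,
        "t_alpha": alpha / se_alpha, "t_beta": beta / se_beta,
    }


# Hypothetical excess returns: fund (R_j - R_f) against market (R_m - R_f).
market_excess = [-0.10, -0.05, 0.0, 0.05, 0.10]
noise = [0.002, -0.004, 0.004, -0.004, 0.002]   # constructed to be orthogonal to the regressors
fund_excess = [0.01 + 0.85 * m + e for m, e in zip(market_excess, noise)]

fit = ols_bivariate(market_excess, fund_excess)
print({k: round(v, 4) for k, v in fit.items()})
```

In this made-up sample the estimated alpha is 0.01 and the estimated beta is 0.85 by construction; the quantity that would be compared with a critical value to judge Jensen's alpha is `t_alpha`.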

As table 2.4 shows, the average (defined as either the mean or the median) fund was unable to 'beat the market', recording a negative alpha in both cases. There were, however, some funds that did manage to perform significantly better than expected given their level of risk, with the best fund of all yielding an alpha of 0.058. Interestingly, the average fund had a beta estimate of around 0.85, indicating that, in the CAPM context, most funds were less risky than the market index. This result may be attributable to the funds investing predominantly in (mature) blue chip stocks rather than small caps.

The most visual method of presenting the results was obtained by plotting the number of mutual funds in each t-ratio category for the alpha coefficient, first gross and then net of transactions costs, as in figure 2.17 and figure 2.18 respectively.

[Figure 2.17: Frequency distribution of t-ratios of mutual fund alphas (gross of transactions costs). Axes: t-ratio, frequency. Source: Jensen (1968). Reprinted with the permission of Blackwell Publishers.]

[Figure 2.18: Frequency distribution of t-ratios of mutual fund alphas (net of transactions costs). Axes: t-ratio, frequency. Source: Jensen (1968). Reprinted with the permission of Blackwell Publishers.]

The appropriate critical value for a two-sided test of $\alpha_j = 0$ is approximately 2.10 (assuming 20 years of annual data leading to 18 degrees of freedom). As can be seen, only five funds have estimated t-ratios greater than 2 and are therefore implied to have been able to outperform the market before transactions costs are taken into account. Interestingly, five firms have also significantly underperformed the market, with t-ratios of -2 or less.

When transactions costs are taken into account (figure 2.18), only one fund out of 115 is able to significantly outperform the market, while 14 significantly underperform it. Given that a nominal 5% two-sided size of test is being used, one would expect two or three funds to 'significantly beat the market' by chance alone. It would thus be concluded that, during the sample period studied, US fund managers appeared unable to systematically generate positive abnormal returns.

2.12 Can UK unit trust managers beat the market?

Jensen's study has proved pivotal in suggesting a method for conducting empirical tests of the performance of fund managers. However, it has been criticised on several grounds. One of the most important of these in the context of this book is that only between 10 and 20 annual observations were used for each regression. Such a small number of observations is really insufficient for the asymptotic theory underlying the testing procedure to be validly invoked.

A variant on Jensen's test is now estimated in the context of the UK market, by considering monthly returns on 76 equity unit trusts. The data cover the period January 1979-May 2000 (257 observations for each fund). Some summary statistics for the funds are presented in table 2.5.

Table 2.5 Summary statistics for unit trust returns, January 1979-May 2000

                                               Mean (%)    Minimum (%)    Maximum (%)    Median (%)
Average monthly return, 1979-2000              1.0         0.6            1.4            [illegible]
Standard deviation of returns over time        5.1         [illegible]    6.9            [illegible]

From these summary statistics, the average continuously compounded return is 1.0% per month, although the most interesting feature is the


wide variation in the performances of the funds. The worst-performing fund yields an average return of 0.6% per month over the 20-year period, while the best would give 1.4% per month. This variability is further demonstrated in figure 2.19, which plots over time the value of £100 invested in each of the funds in January 1979.

[Figure 2.19: Performance of UK unit trusts, 1979-2000.]

A regression of the form (2.52) is applied to the UK data, and the summary results presented in table 2.6. A number of features of the regression results are worthy of further comment. First, most of the funds have estimated betas less than one again, perhaps suggesting that the fund managers have historically been risk-averse or investing disproportionately in blue chip companies in mature sectors. Second, gross of transactions costs, nine funds of the sample of 76 were able to significantly outperform the market by providing a significant positive alpha, while seven funds yielded significant negative alphas. The average fund (where 'average' is measured using either the mean or the median) is not able to earn any excess return over the required rate given its level of risk.

Table 2.6 CAPM regression results for unit trust returns, January 1979-May 2000

Estimates of               Mean     Minimum        Maximum    Median
$\alpha$ (%)               -0.02    -0.54          0.33       -0.03
$\beta$                    0.91     0.56           1.09       0.91
t-ratio on $\alpha$        -0.07    [illegible]    3.11       -0.25

2.13 The overreaction hypothesis and the UK stock market

2.13.1 Motivation

Two studies by DeBondt and Thaler (1985, 1987) showed that stocks experiencing a poor performance over a 3-5-year period subsequently tend to outperform stocks that had previously performed relatively well. This implies that, on average, stocks which are 'losers' in terms of their returns subsequently become 'winners', and vice versa. This chapter now examines a paper by Clare and Thomas (1995) that conducts a similar study using monthly UK stock returns from January 1955 to 1990 (36 years) on all firms traded on the London Stock Exchange.

This phenomenon seems at first blush to be inconsistent with the efficient markets hypothesis, and Clare and Thomas propose two explanations (box 2.9).

Box 2.9

(1) That the 'overreaction effect' is just another manifestation of the 'size effect'. The size effect is the tendency of small firms to generate, on average, superior returns to large firms. The argument would follow that the losers were small firms and that these small firms would subsequently outperform the large firms. DeBondt and Thaler did not believe this a sufficient explanation, but Zarowin (1990) found that allowing for firm size did reduce the subsequent return on the losers.
(2) That the reversals of fortune reflect changes in equilibrium required returns. The losers are argued to be likely to have considerably higher CAPM betas, reflecting investors' perceptions that they are more risky. Of course, betas can change over time, and a substantial fall in the firms' share prices (for the losers) would lead to a rise in their leverage ratios, leading in all likelihood to an increase in their perceived riskiness. Therefore, the required rate of return on the losers will be larger, and their ex post performance better. Ball and Kothari (1989) find the CAPM betas of losers to be considerably higher than those of winners.

Zarowin (1990) also finds that 80% of the extra return available from holding the losers accrues to investors in January, so that almost all of the 'overreaction effect' seems to occur at the start of the calendar year.

2.13.2 Methodology

Clare and Thomas take a random sample of 1,000 firms and, for each, they calculate the monthly excess return of the stock over the market over a 12-, 24- or 36-month period for each stock $i$

$$U_{it} = R_{it} - R_{mt} \qquad t = 1, \ldots, n;\quad i = 1, \ldots, 1000;\quad n = 12, 24 \text{ or } 36$$

Then the average monthly return over each stock $i$ for the first 12-, 24-, or 36-month period is calculated:

$$\bar{R}_i = \frac{1}{n} \sum_{t=1}^{n} U_{it} \qquad (2.54)$$

The stocks are then ranked from highest average return to lowest and from these 5 portfolios are formed and returns are calculated assuming an equal weighting of stocks in each portfolio (box 2.10).

Box 2.10

Portfolio      Ranking
Portfolio 1    Best performing 20% of firms
Portfolio 2    Next 20%
Portfolio 3    Next 20%
Portfolio 4    Next 20%
Portfolio 5    Worst performing 20% of firms

The same sample length $n$ is used to monitor the performance of each portfolio. Thus, for example, if the portfolio formation period is one, two or three years, the subsequent portfolio tracking period will also be one, two or three years, respectively. Then another portfolio formation period follows and so on until the sample period has been exhausted. How many samples of length $n$ will there be? $n$ = 1, 2, or 3 years. First, suppose $n$ = 1 year. The procedure adopted would be as shown in box 2.11.

Box 2.11

Estimate $\bar{R}_i$ for year 1
Monitor portfolios for year 2
Estimate $\bar{R}_i$ for year 3
...
Monitor portfolios for year 36

So if $n = 1$, there are 18 independent (non-overlapping) observation periods and 18 independent tracking periods. By similar arguments, $n = 2$ gives 9 independent periods and $n = 3$ gives 6 independent periods. The return for each month over the 18, 9, or 6 periods for the winner and loser portfolios (the top 20% and bottom 20% of firms in the portfolio formation period) are denoted by $\bar{R}^W_{pt}$ and $\bar{R}^L_{pt}$, respectively. Define the difference between these as $\bar{R}_{Dt} = \bar{R}^L_{pt} - \bar{R}^W_{pt}$.

The first regression to be performed is of the excess return of the losers over the winners on a constant only

$$\bar{R}_{Dt} = \alpha_1 + \eta_t \qquad (2.55)$$

where $\eta_t$ is an error term. The test is of whether $\alpha_1$ is significant and positive. However, a significant and positive $\alpha_1$ is not a sufficient condition for the overreaction effect to be confirmed because it could be owing to higher returns being required on loser stocks owing to loser stocks being more risky. The solution, Clare and Thomas (1995) argue, is to allow for risk differences by regressing against the market risk premium

$$\bar{R}_{Dt} = \alpha_2 + \beta (R_{mt} - R_{ft}) + \eta_t \qquad (2.56)$$

where $R_{mt}$ is the return on the FTA All-share, and $R_{ft}$ is the return on a UK government three-month Treasury Bill. The results for each of these two regressions are presented in table 2.7.

Table 2.7 Is there an overreaction effect in the UK stock market?

Panel A: all months
                                            n = 12      n = 24       n = 36
Return on loser                             0.0033      0.0011       0.0129
Return on winner                            0.0036      -0.0003      0.0115
Implied annualised return difference        -0.37%      1.68%        1.56%
Coefficient for (2.55): $\hat\alpha_1$      -0.00031    0.0014**     0.0013
                                            (0.29)      (2.01)       (1.55)
Coefficients for (2.56): $\hat\alpha_2$     -0.00034    0.00147**    0.0013
                                            (-0.30)     (2.01)       (1.41)
Coefficients for (2.56): $\hat\beta$        -0.022      0.010        -0.0025
                                            (-0.25)     (0.21)       (-0.06)

Panel B: all months except January
Coefficient for (2.55): $\hat\alpha_1$      -0.0007     0.0012*      0.0009
                                            (-0.72)     (1.63)       (1.05)

Notes: t-ratios in parentheses; * and ** denote significance at the 10% and 5% levels, respectively.
Source: Clare and Thomas (1995). Reprinted with the permission of Blackwell Publishers.

As can be seen by comparing the returns on the winners and losers in the first two rows of table 2.7, 12 months is not a sufficiently long time for losers to become winners. By the two-year tracking horizon, however, the losers have become winners, and similarly for the three-year samples. This translates into an average 1.68% higher return on the losers than the


winners at the t\,vo'veai l-rorizon, and 1.55% higher return at the three-yeli hcrizon. Recall that the estimated value of tl-re coefficient in a regr.ession ofa variable on a constant only is equal to the average value ofthat var.!able. It can also be seeu that the estimated coefficients on the coltstant tern'rs for each horizon are exactly equal to thc differ.ences between tire lcturns of the losers and the wiurtcrs. This coefficient is statistically siglificant at the two-year horizon, and marginally significant at the three-year horizon. Iu the second test regression, I represents the differ.ence behveen the market betas of the winuer and loser portfolios. None of t5e beta coeffi-

Table

2.8

Part of the EViews regression output revisited

Coefficient

Std.

Error t-Statistic
0.817569
0.L)257

Prob. 0,4167
0.3 581

c
RFiJTURIiS

0.363302 O, 123860

0.444369 0. r 33790

81.

In flact, the uLrll would hlrve becn lc'jccted at the 12% Ievel or hight To see this, consider conducting a selies of tests with size 0.1%,0.2'
0.3y", 0.4%, .

cient estimates a1'e even close to being significant, ancl the ilclusiol of the risk term tnakes virtttally no difference to the coefficielt values 6r significances of the intercept terns. Removal of the January returns from the samples reduces the subscquent degree of overperformance of the loser portfolios, and the signiF icances of the ri1 tel'lrrs is somewhat reducecl. It is concluded, thelefore, that only a Part of the overreaction phenomenon occuls in Ja11aly. Clare and Tiromas then proceed to examine whetlter the overreaction effect is related to firm size, although the results are not presented here.

..

1%, .

..,

5%,

..

10%,

...

Eventually, the critical value and te:

2.13.3

Conclusions The main conclusions from Clare and Thomas'study are: (1) There appears to be evidence of overreactions in ul( stock returns. as found in previous US studies. (2) These over-reactions are unrelated to the CAPM beta. (3) Losers that subsequently become winners tend to be srnar, so that most of the overreaction in the ul( can be attributed to the size effect.

statistic will n-reet and this will be the p"value. p-values ale alnrost ahval provided aLrtonlatically by softrva|e packages. Note horv usefttl they ar, They provicle all of the information recluired to conduct a hypothesis te: withor.rt requiring ofthe IesealcheI the need to calculate a test statistic c' to find a critical value from a t;rble - both ofthese steps have already bee t;rken by rhe p.tcl<rge in prodtrcing thc 7,-v3lus. The 7'-value is also usetl since it avoids the requir-enrent of specifliing an arbitrary significanc levei (a). Sensitiviry analysis of the effect of the significance level on tir conclusion occurs automatically. Informally, the 2-value is also often referred to as the probabilfty c being wrong when the null lrypothesis is lejected. Thr:s, for example, if p-value of0.05 or less leads the researcher to reject the null (equivalent t a 5% significance level), this is eqr.rivalent to saying that if the probabilit
n innnrrartlrr roia

p-value has also been terrrred the 'plausibiliry' of the r.ru1l hypothesis; st the smaller is the p-value, the less plausible is the null hSpothesis.

^-rrcting the null is more than

5%,

do not leject it.

T1t

2.I4

The exact significance level


The exact signiflcance level is also commonly known as the p-value. It gives the ntarginal significance lelel where one would be indiffer.ent befween rejecting and not rejecting the null hypothesis. Ifthe test statistic is ,large' in absolute value, the p-value will be small, and vice versa. Fo' example, consider a test statistic tl'rat is distributed as a /(,2 ar.rd takes a valne of 1..47. wo'ld the null hypothesis be rejected? It would depend on the size of the lest. Now, suppose that the p-vaiue for this test is calc'rated to be 0.12:

2.15 Hypothesis testing in EViews - example 1: hedging revisited

Reload the 'hedge.wf1' EViews workfile that was created above. If we re-examine the results table from the returns regression (screenshot 2. on p.43), it can be seen that as well as the parameter estimates, EViews automatically calculates the standard errors, the t-ratios, and the p-values associated with a two-sided test of the null hypothesis that the true value of a parameter is zero. Part of the results table is replicated again

● Is the null rejected at the 5% level? No
● Is the null rejected at the 10% level? No
● Is the null rejected at the 20% level? Yes
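The decision rule behind these three answers can be sketched in a few lines of Python. This is only an illustration: the function name `reject_null` is hypothetical, and the rule assumed is the one stated in the text, namely that the null is rejected when the p-value is less than or equal to the chosen significance level.

```python
def reject_null(p_value, significance_level):
    """Return True if the null is rejected at the given significance level."""
    return p_value <= significance_level

# p-value from the example in the text
p_value = 0.12

for alpha in (0.05, 0.10, 0.20):
    decision = "reject" if reject_null(p_value, alpha) else "do not reject"
    print(f"{alpha:.0%} level: {decision} the null")
```

Because the p-value is compared directly with each candidate level, no critical value needs to be looked up, which is the sense in which the sensitivity analysis "occurs automatically".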

(table 2.8) for ease of interpretation. The third column presents the t-ratios, which are the test statistics for testing the null hypothesis that the true values of these parameters are zero against a two-sided alternative - i.e. these statistics test $H_0: \alpha = 0$ versus $H_1: \alpha \neq 0$ in the first row of numbers and $H_0: \beta = 0$ versus $H_1: \beta \neq 0$ in


the second. The fact that these test statistics are both very small is indicative that neither of these null hypotheses is likely to be rejected. This conclusion is confirmed by the p-values given in the final column. Both p-values are considerably larger than 0.1, indicating that the corresponding test statistics are not even significant at the 10% level.

Suppose now that we wanted to test the null hypothesis that $H_0: \beta = 1$ rather than $H_0: \beta = 0$. We could test this, or any other hypothesis about the coefficients, by hand, using the information we already have. But it is easier to let EViews do the work by typing View and then Coefficient Tests/Wald - Coefficient Restrictions.... EViews defines all of the parameters in a vector C, so that C(1) will be the intercept and C(2) will be the slope. Type C(2)=1 and click OK. Note that using this software, it is possible to test multiple hypotheses, which will be discussed in chapter 3, and also nonlinear restrictions, which cannot be tested using the standard procedure for inference described above.

Wald Test:
Equation: RETURNREG

Test Statistic    Value       df         Probability
F-statistic       42.88455    (1, 63)    0.0000
Chi-square        42.88455    1          0.0000

Null Hypothesis Summary:

Normalised Restriction (= 0)    Value        Std. Err.
-1 + C(2)                       -0.876140    0.133790

Restrictions are linear in coefficients.

The test is performed in two different ways, but results suggest that the null hypothesis should clearly be rejected as the p-value for the test is zero to four decimal places. Since we are testing a hypothesis about only one parameter, the two test statistics ('F-statistic' and 'χ-square') will always be identical. These are equivalent to conducting a t-test, and these alternative formulations will be discussed in detail in chapter 4. EViews also reports the 'normalised restriction', although this can be ignored for the time being since it merely reports the regression slope parameter (in a different form) and its standard error.

Now go back to the regression in levels (i.e. with the raw prices rather than the returns) and test the null hypothesis that β = 1 in this regression. You should find in this case that the null hypothesis is not rejected (table below).

Wald Test:
Equation: LEVELREG

Test Statistic    Value       df         Probability
F-statistic       0.565298    (1, 64)    0.4549
Chi-square        0.565298    1          0.4521

Null Hypothesis Summary:

Normalised Restriction (= 0)    Value        Std. Err.
-1 + C(2)                       -0.017777    0.023644

Restrictions are linear in coefficients.

2.16 Estimation and hypothesis testing in EViews - example 2: the CAPM

This exercise will estimate and test some hypotheses about the CAPM beta for several US stocks. First, open a new workfile to accommodate monthly data commencing in January 2002 and ending in April 2007. Then import the Excel file 'capm.xls'. The file is organised by observation and contains six columns of numbers plus the dates in the first column, so in the 'Names for series or Number if named in file' box, type 6. As before, do not import the dates so the data start in cell B2. The monthly stock prices of four companies (Ford, General Motors, Microsoft and Sun) will appear as objects, along with index values for the S&P500 ('sandp') and three-month US-Treasury bills ('ustb3m'). Save the EViews workfile as 'capm.wk1'.

In order to estimate a CAPM equation for the Ford stock, for example, we need to first transform the price series into returns and then the excess returns over the risk free rate. To transform the series, click on the Generate button (Genr) in the workfile window. In the new window, type

RSANDP=100*LOG(SANDP/SANDP(-1))

This will create a new series named RSANDP that will contain the returns of the S&P500. The operator (-1) is used to instruct EViews to use the one-period lagged observation of the series. To estimate percentage returns on the Ford stock, press the Genr button again and type

RFORD=100*LOG(FORD/FORD(-1))

This will yield a new series named RFORD that will contain the returns of the Ford stock. EViews allows various kinds of transformations to the series. For example

X2=X/2          creates a new variable called X2 that is half of X
XSQ=X^2         creates a new variable XSQ that is X squared
LX=LOG(X)       creates a new variable LX that is the log of X
LAGX=X(-1)      creates a new variable LAGX containing X lagged by one period
LAGX2=X(-2)     creates a new variable LAGX2 containing X lagged by two periods

Other functions include:

d(X)            first difference of X
d(X,n)          nth order difference of X
dlog(X)         first difference of the logarithm of X
dlog(X,n)       nth order difference of the logarithm of X
abs(X)          absolute value of X

If, in the transformation, the new series is given the same name as the old series, then the old series will be overwritten. Note that the returns for the S&P index could have been constructed using a simpler command in the 'Genr' window such as

RSANDP=100*DLOG(SANDP)

as we used in chapter 1. Before we can transform the returns into excess returns, we need to be slightly careful because the stock returns are monthly, but the Treasury bill yields are annualised. We could run the whole analysis using monthly data or using annualised data and it should not matter which we use, but the two series must be measured consistently. So, to turn the T-bill yields into monthly figures and to write over the original series, press the Genr button again and type

USTB3M=USTB3M/12

Now to compute the excess returns, click Genr again and type

ERSANDP=RSANDP-USTB3M

where 'ERSANDP' will be used to denote the excess returns, so that the original raw returns series will remain in the workfile. The Ford returns can similarly be transformed into a set of excess returns.

Now that the excess returns have been obtained for the two series, before running the regression, plot the data to examine visually whether the series appear to move together. To do this, create a new object by clicking on the Object/New Object menu on the menu bar. Select Graph, provide a name (call the graph Graph1) and then in the new window provide the names of the series to plot. In this new window type

ERSANDP ERFORD

Then press OK and screenshot 2.4 will appear.

[Screenshot 2.4: time-series plot of ERSANDP and ERFORD]

This is a time-series plot of the two variables, but a scatter plot may be more informative. To examine a scatter plot, click Options, choose the Type tab, then select Scatter from the list and click OK. There appears to be a weak association between ERSANDP and ERFORD. Close the window of the graph and return to the workfile window.

To estimate the CAPM equation, click on Object/New Objects. In the new window, select Equation and name the object CAPM. Click on OK. In the window, specify the regression equation. The regression equation takes the form

$$(R_{Ford} - r_f)_t = \alpha + \beta (R_M - r_f)_t + u_t$$
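For readers who want to check the variables on each side of this regression outside EViews, the transformations can be sketched in plain Python. This is a rough illustration only: the function names and the numbers are made up for the example (they are not taken from 'capm.xls'), and the sketch assumes the same conventions as the text, i.e. percentage log returns and an annualised yield divided by 12 to give a monthly figure.

```python
import math

def log_returns(prices):
    """Percentage log returns: 100 * ln(P_t / P_{t-1})."""
    return [100 * math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def excess_returns(returns, annual_yields):
    """Subtract the monthly risk-free rate (annualised yield / 12).

    The first yield is dropped so that each return is matched with the
    yield observed in the same month.
    """
    monthly_rf = [y / 12 for y in annual_yields[1:]]
    return [r - rf for r, rf in zip(returns, monthly_rf)]

# Hypothetical illustrative data (not from the book's data file)
ford = [10.0, 10.5, 9.9, 10.2]      # monthly prices
ustb3m = [1.8, 1.8, 1.7, 1.7]       # annualised yields, in per cent

rford = log_returns(ford)
erford = excess_returns(rford, ustb3m)
print([round(value, 3) for value in erford])
```

One observation is lost when the returns are formed, which is why a regression on returns has one fewer usable observation than the price series.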


Since the data have already been transformed to obtain the excess returns, in order to specify this regression equation, type in the equation window

ERFORD C ERSANDP

To use all the observations in the sample and to estimate the regression using LS - Least Squares (NLS and ARMA), click on OK. The results screen appears as in the following table. Make sure that you save the Workfile again to include the transformed series and regression results!

Dependent Variable: ERFORD
Method: Least Squares
Date: 08/21/07 Time: 15:02
Sample (adjusted): 2002M02 2007M04
Included observations: 63 after adjustments

              Coefficient   Std. Error   t-Statistic   Prob.
C             2.020219      2.801382     0.721157      0.4736
ERSANDP       0.359726      0.794443     0.452803      0.6523

R-squared            0.003350     Mean dependent var      2.097445
Adjusted R-squared   -0.012989    S.D. dependent var      22.05129
S.E. of regression   22.19404     Akaike info criterion   9.068756
Sum squared resid    30047.09     Schwarz criterion       9.136792
Log likelihood       -283.6658    Hannan-Quinn criter.    9.095514
F-statistic          0.205031     Durbin-Watson stat      1.785599
Prob(F-statistic)    0.652297

Take a couple of minutes to examine the results of the regression. What is the slope coefficient estimate and what does it signify? Is this coefficient statistically significant? The beta coefficient (the slope coefficient) estimate is 0.3597. The p-value of the t-ratio is 0.6523, signifying that the excess return on the market proxy has no significant explanatory power for the variability of the excess returns of Ford stock. What is the interpretation of the intercept estimate? Is it statistically significant?

In fact, there is a considerably quicker method for using transformed variables in regression equations, and that is to write the transformation directly into the equation window. In the CAPM example above, this could be done by typing

DLOG(FORD)-USTB3M C DLOG(SANDP)-USTB3M

into the equation window. As well as being quicker, an advantage of this approach is that the output will show more clearly the regression that has actually been conducted, so that any errors in making the transformations can be seen more clearly.

How could the hypothesis that the value of the population coefficient is equal to 1 be tested? The answer is to click on View/Coefficient Tests/Wald - Coefficient Restrictions... and then in the box that appears, type C(2)=1. The conclusion here is that the null hypothesis that the CAPM beta of Ford stock is 1 cannot be rejected and hence the estimated beta of 0.359 is not significantly different from 1 (footnote 5).

Footnote 5: Although the value 0.359 may seem a long way from 1, considered purely from an econometric perspective, the sample size is quite small and this has led to a large parameter standard error, which explains the failure to reject both $H_0: \beta = 0$ and $H_0: \beta = 1$.

Key concepts

The key terms to be able to define and explain from this chapter are
● regression model
● disturbance term
● population
● linear model
● unbiasedness
● null hypothesis
● t-distribution
● test statistic
● type I error
● size of a test
● p-value
● asymptotic
● consistency
● efficiency
● statistical inference
● alternative hypothesis
● confidence interval
● rejection region
● type II error
● power of a test
● data mining

Appendix: Mathematical derivations of CLRM results

2A.1 Derivation of the OLS coefficient estimator in the bivariate case

$$L = \sum_{t=1}^{T} (y_t - \hat{y}_t)^2 = \sum_{t=1}^{T} (y_t - \hat{\alpha} - \hat{\beta} x_t)^2 \tag{2A.1}$$

It is necessary to minimise $L$ w.r.t. $\hat{\alpha}$ and $\hat{\beta}$, to find the values of $\alpha$ and $\beta$ that give the line that is closest to the data. So $L$ is differentiated w.r.t. $\hat{\alpha}$ and $\hat{\beta}$, and the first derivatives are set to zero. The first derivatives are given by

$$\frac{\partial L}{\partial \hat{\alpha}} = -2 \sum_{t} (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0 \tag{2A.2}$$

$$\frac{\partial L}{\partial \hat{\beta}} = -2 \sum_{t} x_t (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0 \tag{2A.3}$$

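The next step is to rearrange (2A.2) and (2A.3) in order to obtain expressions for $\hat{\alpha}$ and $\hat{\beta}$. From (2A.2)

$$\sum_{t} (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0 \tag{2A.4}$$

Expanding the parentheses and recalling that the sum runs from 1 to $T$ so that there will be $T$ terms in $\hat{\alpha}$

$$\sum y_t - T\hat{\alpha} - \hat{\beta} \sum x_t = 0 \tag{2A.5}$$

But $\sum y_t = T\bar{y}$ and $\sum x_t = T\bar{x}$, so it is possible to write (2A.5) as

$$T\bar{y} - T\hat{\alpha} - T\hat{\beta}\bar{x} = 0 \tag{2A.6}$$

or

$$\bar{y} - \hat{\alpha} - \hat{\beta}\bar{x} = 0 \tag{2A.7}$$

From (2A.3)

$$\sum_{t} x_t (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0 \tag{2A.8}$$

From (2A.7)

$$\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} \tag{2A.9}$$

Substituting into (2A.8) for $\hat{\alpha}$ from (2A.9)

$$\sum_{t} x_t (y_t - \bar{y} + \hat{\beta}\bar{x} - \hat{\beta} x_t) = 0 \tag{2A.10}$$

$$\sum_{t} x_t y_t - \bar{y} \sum x_t + \hat{\beta}\bar{x} \sum x_t - \hat{\beta} \sum x_t^2 = 0 \tag{2A.11}$$

$$\sum_{t} x_t y_t - T\bar{x}\bar{y} + \hat{\beta} T \bar{x}^2 - \hat{\beta} \sum x_t^2 = 0 \tag{2A.12}$$

Rearranging for $\hat{\beta}$,

$$\hat{\beta} \left( T\bar{x}^2 - \sum x_t^2 \right) = T\bar{x}\bar{y} - \sum x_t y_t \tag{2A.13}$$

Dividing both sides of (2A.13) by $\left( T\bar{x}^2 - \sum x_t^2 \right)$ gives

$$\hat{\beta} = \frac{\sum x_t y_t - T\bar{x}\bar{y}}{\sum x_t^2 - T\bar{x}^2} \quad \text{and} \quad \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} \tag{2A.14}$$

2A.2 Derivation of the OLS standard error estimators for the intercept and slope in the bivariate case

Recall that the variance of the random variable $\hat{\alpha}$ can be written as

$$\mathrm{var}(\hat{\alpha}) = E(\hat{\alpha} - E(\hat{\alpha}))^2 \tag{2A.15}$$

and since the OLS estimator is unbiased

$$\mathrm{var}(\hat{\alpha}) = E(\hat{\alpha} - \alpha)^2 \tag{2A.16}$$

By similar arguments, the variance of the slope estimator can be written as

$$\mathrm{var}(\hat{\beta}) = E(\hat{\beta} - \beta)^2 \tag{2A.17}$$

Working first with (2A.17), replacing $\hat{\beta}$ with the formula for it given by the OLS estimator

$$\mathrm{var}(\hat{\beta}) = E\left( \frac{\sum (x_t - \bar{x})(y_t - \bar{y})}{\sum (x_t - \bar{x})^2} - \beta \right)^2 \tag{2A.18}$$

Replacing $y_t$ with $\alpha + \beta x_t + u_t$, and replacing $\bar{y}$ with $\alpha + \beta\bar{x}$ in (2A.18)

$$\mathrm{var}(\hat{\beta}) = E\left( \frac{\sum (x_t - \bar{x})(\alpha + \beta x_t + u_t - \alpha - \beta\bar{x})}{\sum (x_t - \bar{x})^2} - \beta \right)^2 \tag{2A.19}$$

Cancelling $\alpha$ and multiplying the last $\beta$ term in (2A.19) by $\dfrac{\sum (x_t - \bar{x})^2}{\sum (x_t - \bar{x})^2}$

$$\mathrm{var}(\hat{\beta}) = E\left( \frac{\sum (x_t - \bar{x})(\beta x_t + u_t - \beta\bar{x}) - \beta \sum (x_t - \bar{x})^2}{\sum (x_t - \bar{x})^2} \right)^2 \tag{2A.20}$$

Rearranging

$$\mathrm{var}(\hat{\beta}) = E\left( \frac{\sum (x_t - \bar{x})\beta(x_t - \bar{x}) + \sum u_t (x_t - \bar{x}) - \beta \sum (x_t - \bar{x})^2}{\sum (x_t - \bar{x})^2} \right)^2 \tag{2A.21}$$

$$\mathrm{var}(\hat{\beta}) = E\left( \frac{\beta \sum (x_t - \bar{x})^2 + \sum u_t (x_t - \bar{x}) - \beta \sum (x_t - \bar{x})^2}{\sum (x_t - \bar{x})^2} \right)^2 \tag{2A.22}$$

Now the $\beta$ terms in (2A.22) will cancel to give

$$\mathrm{var}(\hat{\beta}) = E\left( \frac{\sum u_t (x_t - \bar{x})}{\sum (x_t - \bar{x})^2} \right)^2 \tag{2A.23}$$

Now let $x_t^*$ denote the mean-adjusted observation for $x_t$, i.e. $(x_t - \bar{x})$. Equation (2A.23) can be written

$$\mathrm{var}(\hat{\beta}) = E\left( \frac{\sum u_t x_t^*}{\sum x_t^{*2}} \right)^2 \tag{2A.24}$$

The denominator of (2A.24) can be taken through the expectations operator under the assumption that $x$ is fixed or non-stochastic

$$\mathrm{var}(\hat{\beta}) = \frac{1}{\left( \sum x_t^{*2} \right)^2} E\left( \sum u_t x_t^* \right)^2 \tag{2A.25}$$

Writing the terms out in the last summation of (2A.25)

$$\mathrm{var}(\hat{\beta}) = \frac{1}{\left( \sum x_t^{*2} \right)^2} E\left( u_1 x_1^* + u_2 x_2^* + \cdots + u_T x_T^* \right)^2 \tag{2A.26}$$

Now expanding the brackets of the squared term in the expectations operator of (2A.26)

$$\mathrm{var}(\hat{\beta}) = \frac{1}{\left( \sum x_t^{*2} \right)^2} E\left( u_1^2 x_1^{*2} + u_2^2 x_2^{*2} + \cdots + u_T^2 x_T^{*2} + \text{cross-products} \right) \tag{2A.27}$$

where 'cross-products' in (2A.27) denotes all of the terms $u_i x_i^* u_j x_j^*$ $(i \neq j)$. These cross-products can be written as $u_i u_j x_i^* x_j^*$ $(i \neq j)$ and their expectation will be zero under the assumption that the error terms are uncorrelated with one another. Thus, the 'cross-products' term in (2A.27) will drop out. Recall also from the chapter text that $E(u_t^2)$ is the error variance, which is estimated using $s^2$

$$\mathrm{var}(\hat{\beta}) = \frac{1}{\left( \sum x_t^{*2} \right)^2} \left( s^2 x_1^{*2} + s^2 x_2^{*2} + \cdots + s^2 x_T^{*2} \right) \tag{2A.28}$$

which can also be written

$$\mathrm{var}(\hat{\beta}) = \frac{s^2}{\left( \sum x_t^{*2} \right)^2} \left( x_1^{*2} + x_2^{*2} + \cdots + x_T^{*2} \right) = \frac{s^2 \sum x_t^{*2}}{\left( \sum x_t^{*2} \right)^2} \tag{2A.29}$$

A term in $\sum x_t^{*2}$ can be cancelled from the numerator and denominator of (2A.29), and recalling that $x_t^* = (x_t - \bar{x})$, this gives the variance of the slope coefficient as

$$\mathrm{var}(\hat{\beta}) = \frac{s^2}{\sum (x_t - \bar{x})^2} \tag{2A.30}$$

so that the standard error can be obtained by taking the square root of (2A.30)

$$SE(\hat{\beta}) = s \sqrt{\frac{1}{\sum (x_t - \bar{x})^2}} \tag{2A.31}$$

Turning now to the derivation of the intercept standard error, this is in fact much more difficult than that of the slope standard error. In fact, both are very much easier using matrix algebra as shown below. Therefore, this derivation will be offered in summary form. It is possible to express $\hat{\alpha}$ as a function of the true $\alpha$ and of the disturbances, $u_t$

$$\hat{\alpha} = \alpha + \sum u_t \left[ \frac{\sum x_t^2 - x_t \sum x_t}{T \sum x_t^2 - \left( \sum x_t \right)^2} \right] \tag{2A.32}$$

Denoting all of the elements in square brackets as $g_t$, (2A.32) can be written

$$\hat{\alpha} - \alpha = \sum u_t g_t \tag{2A.33}$$

From (2A.15), the intercept variance would be written

$$\mathrm{var}(\hat{\alpha}) = E\left( \sum u_t g_t \right)^2 = \sum g_t^2 E(u_t^2) = s^2 \sum g_t^2 \tag{2A.34}$$

Writing (2A.34) out in full for $g_t^2$ and expanding the brackets

$$\mathrm{var}(\hat{\alpha}) = \frac{s^2 \left[ T \left( \sum x_t^2 \right)^2 - 2 \left( \sum x_t \right)^2 \sum x_t^2 + \left( \sum x_t \right)^2 \sum x_t^2 \right]}{\left[ T \sum x_t^2 - \left( \sum x_t \right)^2 \right]^2} \tag{2A.35}$$

This looks rather complex, but fortunately, if we take $\sum x_t^2$ outside the square brackets in the numerator, the remaining numerator cancels with a term in the denominator to leave the required result

$$SE(\hat{\alpha}) = s \sqrt{\frac{\sum x_t^2}{T \sum (x_t - \bar{x})^2}} \tag{2A.36}$$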
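The closed-form results of this appendix can be checked numerically with a short Python sketch. Everything in it is illustrative: the function names and the five data points are made up, and the error variance is assumed to be estimated by the residual sum of squares divided by T - 2, the usual estimator for the bivariate model. The slope and intercept follow (2A.14), and the two standard errors are the expressions the appendix arrives at for the slope and the intercept.

```python
import math

def ols_bivariate(x, y):
    """Closed-form OLS for y_t = alpha + beta * x_t + u_t."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)

    # Slope and intercept, as in (2A.14)
    beta = (sum_xy - T * x_bar * y_bar) / (sum_x2 - T * x_bar ** 2)
    alpha = y_bar - beta * x_bar

    # Assumption: s^2 = RSS / (T - 2) estimates the error variance
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (T - 2))

    # Standard errors of the slope and intercept from the appendix derivation
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    se_beta = s * math.sqrt(1 / sxx)
    se_alpha = s * math.sqrt(sum_x2 / (T * sxx))
    return alpha, beta, se_alpha, se_beta

def t_ratio(estimate, std_error, hypothesised=0.0):
    """Test statistic for H0: coefficient = hypothesised value."""
    return (estimate - hypothesised) / std_error

# Hypothetical data, chosen only to make the arithmetic easy to follow
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

alpha, beta, se_alpha, se_beta = ols_bivariate(x, y)
t_beta_equals_2 = t_ratio(beta, se_beta, hypothesised=2.0)
print(alpha, beta, se_alpha, se_beta, t_beta_equals_2)
```

With a single restriction, the square of a t-ratio computed this way equals the F-statistic that a Wald test of the same restriction would report.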

Review questions

1. (a) Why does OLS estimation involve taking vertical deviations of the points to the line rather than horizontal distances?
   (b) Why are the vertical distances squared before being added together?
   (c) Why are the squares of the vertical distances taken rather than the absolute values?
2. Explain, with the use of equations, the difference between the sample regression function and the population regression function.
3. What is an estimator? Is the OLS estimator superior to all other estimators? Why or why not?
4. What five assumptions are usually made about the unobservable error terms in the classical linear regression model (CLRM)? Briefly explain the meaning of each. Why are these assumptions made?
5. Which of the following models can be estimated (following a suitable rearrangement if necessary) using ordinary least squares (OLS), where X, y, Z are variables and α, β, γ are parameters to be estimated? (Hint: the models need to be linear in the parameters.)

$$y_t = \alpha + \beta x_t + u_t \tag{2.57}$$

$$y_t = e^{\alpha} x_t^{\beta} e^{u_t} \tag{2.58}$$

$$y_t = \alpha + \beta\gamma x_t + u_t \tag{2.59}$$

$$\ln(y_t) = \alpha + \beta \ln(x_t) + u_t \tag{2.60}$$

$$y_t = \alpha + \beta x_t z_t + u_t \tag{2.61}$$

6. The capital asset pricing model (CAPM) can be written as

$$E(R_i) = R_f + \beta_i [E(R_m) - R_f] \tag{2.62}$$

using the standard notation. The first step in using the CAPM is to estimate the stock's beta using the market model. The market model can be written as

$$R_{it} = \alpha_i + \beta_i R_{mt} + u_{it} \tag{2.63}$$

where $R_{it}$ is the excess return for security i at time t, $R_{mt}$ is the excess return on a proxy for the market portfolio at time t, and $u_{it}$ is an iid random disturbance term. The coefficient beta in this case is also the CAPM beta for security i. Suppose that you had estimated (2.63) and found that the estimated value of beta for a stock, $\hat{\beta}$, was 1.147. The standard error associated with this coefficient, $SE(\hat{\beta})$, is estimated to be 0.0548. A city analyst has told you that this security closely follows the market, but that it is no more risky, on average, than the market. This can be tested by the null hypothesis that the value of beta is one. The model is estimated over 62 daily observations. Test this hypothesis against a one-sided alternative that the security is more risky than the market, at the 5% level. Write down the null and alternative hypothesis. What do you conclude? Are the analyst's claims empirically verified?
7. The analyst also tells you that shares in Chris Mining PLC have no systematic risk, in other words that the returns on its shares are completely unrelated to movements in the market. The value of beta and its standard error are calculated to be 0.214 and 0.186, respectively. The model is estimated over 38 quarterly observations. Write down the null and alternative hypotheses. Test this null hypothesis against a two-sided alternative.
8. Form and interpret a 95% and a 99% confidence interval for beta using the figures given in question 7.
9. Are hypotheses tested concerning the actual values of the coefficients (i.e. $\beta$) or their estimated values (i.e. $\hat{\beta}$) and why?
10. Using EViews, select one of the other stock series from the 'capm.wk1' file and estimate a CAPM beta for that stock. Test the null hypothesis that the true beta is one and also test the null hypothesis that the true alpha (intercept) is zero. What are your conclusions?
