
Last revision: June 2000

Time Series Analysis using MATLAB,
Including a complete MATLAB Toolbox

By Jaime Terceiro, José Manuel Casals, Miguel Jerez, Gregorio R. Serrano and Sonia Sotoca

Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1
How the toolbox works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
Installing the toolbox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
Contents of the book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4
Description of the models supported . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
State-space models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
The state-space model with fixed coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
The steady-state innovations model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
Simple models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-4
The structural econometric model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-4
The VARMAX model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5
The transfer function model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5
Composite models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
Nested models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
Nesting in inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
Nesting in noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-7
Component models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-7
Models with GARCH errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-8
Modeling options supported . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-9

Defining models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1
General ideas about model definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1
The THD format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1
General rules about parameter matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
Defining state-space and structural time series models . . . . . . . . . . . . . . . . . . . . . 3-3

Defining simple models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6


Defining nested models in inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-8
Defining component models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12
Model estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-1
Modification of toolbox options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-1
Evaluation of the likelihood function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-3
Models with homoscedastic errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-3
Models with GARCH errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-4
Initial conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-4
Computation of the gradient and the information matrix . . . . . . . . . . . . . . . . . . . . . . . . 4-5
Evaluation of the analytical gradient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5

Evaluation of the exact information matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6


Quasi-maximum likelihood estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6
Numerical optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7
General use of e4min . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7
Scaling problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-8
Preliminary estimates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-8
Displaying the estimation results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-9
Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-10
Specification, forecasting, simulation and smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1
Tools for time series analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1
General purpose functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1
Data transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-2
Tools for model specification and validation . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3
Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-4
Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5
Smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6
User models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1
Defining user models in the general case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1
Defining user models in reparametrized formulations . . . . . . . . . . . . . . . . . . . . . . . . . . 6-4
Case studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1
Univariate ARIMA examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
VARMA modeling: interaction between minks and muskrats . . . . . . . . . . . . . . . . . . . . 7-6
Transfer function analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11
Unconstrained transfer function modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-12
Period estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-14
Estimation of a constrained transfer function . . . . . . . . . . . . . . . . . . . . . . . . . 7-15
Composite model forecasts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-16
Structural econometric models: supply and demand of food . . . . . . . . . . . . . . . . . . . . 7-18
Maximum-likelihood estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-19
An ARCH model for the U.S. GNP deflator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-22
Estimation under homoscedasticity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-22
Estimation of an ARCH(8) process for the error . . . . . . . . . . . . . . . . . . . . . . 7-24
Estimation of a GARCH(1,1) process for the error . . . . . . . . . . . . . . . . . . . . 7-25
Forecasting and monitoring of objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-28
Disaggregation of value added in industry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-30
Estimation of the high frequency data model . . . . . . . . . . . . . . . . . . . . . . . . . 7-30
Disaggregation from nonstationary models . . . . . . . . . . . . . . . . . . . . . . . . . . 7-31
Disaggregation from a stationary model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-32
Models with observation errors: Wölfer's sunspots data . . . . . . . . . . . . . . . . . . . . 7-33

Univariate modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-33


Model with observation errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-35
Structural time series models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-38
Computation and modeling of unobservable components . . . . . . . . . . . . . . . . 7-39
Reference guide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1
aggrmod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-4
arma2thd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-6
augdft . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-8
comp2thd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-11
descser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-12
e4init . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-13
e4min . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-14
e4preest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-17
e4trend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-19
fismiss, fismod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-24
foregarc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-26
foremiss, foremod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-27
garc2thd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-28
ggarch, gmiss, gmod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-30
histsers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-32
igarch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-33
imiss, imod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-35
imodg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-37
lagser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-39
lffast, lfmiss, lfmod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-40
lfgarch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-43
midents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-45

nest2thd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-46
plotqqs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-47
plotsers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-48
prtest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-49
prtmod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-51
residual . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-52
rmedser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-54
sete4opt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-56
simgarch, simmod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-58
ss_dv, garch_dv . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-60
ss_dvp, garc_dvp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-61
ss2thd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-62
stackthd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-64
str2thd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-65

tf2thd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-67
thd2arma, thd2str, thd2tf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-69
thd2ss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-71
tomod, touser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-72
transdif . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-73
uidents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-74
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-1
Error messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1
Warning messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2

1 Introduction
This book describes a MATLAB Toolbox for econometric modeling of time series. Its name, E4,
refers to the Spanish Estimación de modelos Econométricos en Espacio de los Estados, meaning
State-Space Estimation of Econometric Models.
E4 makes up for the lack of PC software for estimating econometric models by exact maximum
likelihood. The main model supported is a general state-space (SS) form with fixed parameters. On
this basis, the Toolbox also supports many standard formulations such as VARMAX (Vector
AutoRegressive Moving Average with eXogenous variables), structural econometric models and
single-output transfer functions. All these models can be estimated: a) by themselves or in composite
formulations, b) unconstrained or subject to linear and/or nonlinear constraints on the parameters,
and c) under standard conditions (i.e., with full information and homoscedastic errors) or in an
extended framework that allows for observation errors, missing data and vector GARCH
(Generalized AutoRegressive Conditional Heteroscedastic) errors.
This flexibility is obtained by treating each model internally in its equivalent state-space
formulation, which allows certain computations and analyses that would not otherwise be possible.
In most cases, however, the user does not need to understand or handle the SS formulation, as the
library includes many interface functions that manage the necessary conversions from/to the
conventional representation of the model.
From a theoretical point of view, the library's most relevant characteristics are the following:
• The main estimation criterion is exact maximum likelihood. The likelihood can be evaluated
using either the Kalman or the Chandrasekhar filter.

• There are several algorithms to compute initial conditions for the state of a stationary or
nonstationary system.

• The toolbox includes a subspace-based consistent estimation criterion. This function is very fast,
so its estimates can be used as final values when analyzing large samples or as starting
values for maximum likelihood estimation.

• The Toolbox includes functions to compute the analytical gradient of the likelihood function and
the exact information matrix. These functions enhance the likelihood optimization process and
the ex-post validation of the model through hypothesis testing.

• Whereas the emphasis is on model estimation, the Toolbox includes many functions for model
specification and validation, simulation and forecasting.

• The functions related to forecasting and fixed-interval smoothing are based on efficient,
state-of-the-art methods.

• The use of MATLAB as development platform also guarantees the reliability of computations, as
it is based on the results of the well-known and reputable LINPACK and EISPACK projects.
It also allows the user to easily extend the formulations and methods supported.
The focus of E4 on SS econometrics has some pros and cons. Specifically, the routines have been
optimized for numerical accuracy and robustness, not for speed. This is not a serious inconvenience,
as econometric modeling seldom requires real-time performance. Speedier computations can be
obtained easily by tuning some system parameters or, with some effort, by translating critical
functions to a lower-level language such as C, or by generating MEX files from the source code; see
MATLAB (1992) and MATLAB (1996). On the other hand, this emphasis on accuracy and
robustness has many advantages, see McCullough and Vinod (1999).

How the toolbox works


To unify the treatment of a wide range of econometric models, the toolbox uses an internal format,
known as THD (THeta-Din) format, which stores the information about the dynamic and stochastic
structure of the models. This format is managed by user-friendly interface functions.
For example, if one wants to estimate a VARMA model, the first step consists of writing its
conventional formulation and calling the function that returns the corresponding THD format. The
rest of the toolbox functions assume the model to be codified in this format. A detailed description of
the THD format can be found in Appendix C.
Many functions in the Toolbox require, as input arguments, a THD format definition and a data
matrix. These functions transform the model to SS representation and start the required computation
process. Therefore, most users do not need to know the SS formulation of the model.
The steps in the normal operation of E4 are the following:

&KDS  3DJ 

&

Read and transform the data (Box-Cox, differencing or any other). In this stage, graphics and
functions for model specification can also be used. If simulations are required, there are functions
(e.g., simmod) that can generate the data.

&

Define the model to be estimated by writing its parameter matrices and obtain the corresponding
representation in THD format. This is done with a specific function for each type of model.

&

Call the toolbox optimization algorithm (e4min) to estimate the model by maximizing the exact
likelihood function. Before that, the user may have to define some parameters that control the
behaviour of the optimization algorithm.

&

Once the estimates are obtained, use the appropriate function for computing the exact
information matrix. An alternative estimate of the information matrix can be obtained directly
from e4min, through the hessian inverse. Finally, display the estimation results in a legible
format.

&

A wide range of possibilities are opened up with the results from the previous operations, like
validation, forecasting, interpolation of objectives, re-estimation of the model under linear or nonlinear constraints or outlier elimination.

Of course, it is not necessary to go through all the previous steps nor to always follow the same
sequence, as this will depend on the users objectives.

Installing the toolbox


The distribution diskette contains the source code of the toolbox in the directory \E4. Assuming that
MATLAB is installed in the directory C:\MATLAB, the contents of A:\E4 should be copied to
C:\MATLAB\TOOLBOX\E4. If the distribution diskette is inserted into the A: drive, the sequence
of MS-DOS commands would be as follows:
C:\>cd c:\matlab\toolbox
C:\MATLAB\TOOLBOX>md e4
C:\MATLAB\TOOLBOX>copy a:\e4\*.* c:\matlab\toolbox\e4

Once the files are copied, the MATLABRC.M file must be modified to include the directory
c:\matlab\toolbox\e4 in the search path of MATLAB functions. To do this with a text editor,
such as Windows NOTEPAD, open C:\MATLAB\MATLABRC.M and find the command
matlabpath, which will be similar to:
matlabpath([...
'C:\MATLAB;',...
'C:\MATLAB\toolbox\matlab\general;',...
...
'C:\MATLAB\toolbox\matlab\plotxy;',...
]);

The user should add the path to the directory containing the library, obtaining something similar to:
matlabpath([...
'C:\MATLAB;',...
'C:\MATLAB\toolbox\matlab\general;',...
...
'C:\MATLAB\toolbox\matlab\plotxy;',...
'C:\MATLAB\toolbox\e4;',...
]);

Lastly, before saving this file, add the following line to the end of MATLABRC.M:
e4init;

This command initializes the toolbox options. Although the call to e4init can be done at the
beginning of each toolbox session, it is easier to initialize when starting up MATLAB. Bear in mind
that the toolbox does not work properly if this function is not run.
Once these steps have been completed, MATLAB is able to use the E4 library.
The .m files corresponding to the examples in Chapter 7 are in the directory \EXAMPLES of the
distribution diskette. If the distribution diskette is inserted into the A: drive, they can be copied to the
directory C:\EXAMPLES (or any other) of the hard disk with the following commands:
C:\>md examples
C:\>xcopy a:\examples c:\examples /s

Contents of the book


This book is organized as follows. Chapter 2 presents the econometric models supported by E4.
Chapter 3 describes the functions managing the different representations of a model. It begins by
describing the functions that transform conventional models to THD format. Building on this format,
the models can be translated to the SS or conventional formulations.
Chapter 4 is about model estimation. It describes several functions related to likelihood
evaluation, optimization and calculation of the exact information matrix, as well as the functions
that manage the toolbox options and tolerances.


Chapter 5 deals with the toolbox functions for model specification, validation, forecasting and
smoothing.
Chapter 6 is about the extension of the formulations supported by means of user models. This option
allows the user to define new formulations or to work with nonstandard parametrizations of the
models supported.
To illustrate how the toolbox works, Chapter 7 includes several case studies that cover both
introductory and advanced time series analysis with E4.
Finally, Chapter 8 is the Reference Guide to the toolbox functions and Chapter 9 contains the
bibliographic references. The Appendices A to C describe the error and warning messages, the
structure of the internal vector E4OPTION and the THD format, respectively.


2 Description of the models supported


E4 addresses a wide set of econometric models in all stages of analysis: specification, estimation,
forecasting, smoothing and simulation. This Chapter presents the mathematical formulation of these
models and is divided into five Sections.
The first Section describes the structure of the fixed-parameter SS models, which are used
internally by E4 for most computational purposes.
The second Section is devoted to the basic econometric models supported in the toolbox: the
structural econometric model, the VARMAX model and the single-output transfer function. The
third Section defines a very flexible option, called composite models, which can be used to
combine several SS formulations in a single model.
The fourth Section describes the formulation of models with conditionally heteroscedastic
disturbances, and the last Section presents the combination of modeling options supported and
introduces the concept of user models, which is discussed in Chapter 6.

State-space models
The state-space model with fixed coefficients

The most general SS formulation supported by E4 is:

    $x_{t+1} = \Phi x_t + \Gamma u_t + E w_t$        (2.1)

    $z_t = H x_t + D u_t + C v_t$        (2.2)

where:
$x_t$ is an (n × 1) vector of state variables,
$u_t$ is an (r × 1) vector of exogenous variables,
$z_t$ is an (m × 1) vector of observable variables,
$w_t$ and $v_t$ are white noise processes such that:

    $E[w_t] = 0, \quad E[v_t] = 0$        (2.3)

    $E\left[ \begin{bmatrix} w_{t_1} \\ v_{t_1} \end{bmatrix}
             \begin{bmatrix} w_{t_2}^T & v_{t_2}^T \end{bmatrix} \right]
     = \begin{bmatrix} Q & S \\ S^T & R \end{bmatrix} \delta_{t_1 t_2}$        (2.4)

$Q$ and $R$ being positive semi-definite matrices. See Terceiro (1990).
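The recursion (2.1)-(2.2) can be sketched numerically. The following fragment is written in
Python/NumPy rather than MATLAB, to keep it independent of the toolbox; all matrix values and
dimensions are arbitrary illustrations, not part of E4.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary system matrices for a model with n = 2 states,
# r = 1 input and m = 1 output (illustrative values only).
Phi = np.array([[0.7, 0.2], [0.0, 0.5]])   # state transition
Gam = np.array([[1.0], [0.0]])             # input-to-state
E   = np.eye(2)                            # state-noise loading
H   = np.array([[1.0, 0.0]])               # observation matrix
D   = np.array([[0.5]])                    # input-to-output
C   = np.array([[1.0]])                    # observation-noise loading

T = 200
u = rng.standard_normal((T, 1))            # exogenous input u_t
w = rng.standard_normal((T, 2)) * 0.1      # state noise w_t
v = rng.standard_normal((T, 1)) * 0.1      # observation noise v_t

x = np.zeros(2)
z = np.empty(T)
for t in range(T):
    z[t] = (H @ x + D @ u[t] + C @ v[t]).item()   # observation equation (2.2)
    x = Phi @ x + Gam @ u[t] + E @ w[t]           # state equation (2.1)

print(z.shape)  # (200,)
```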


Example 2.1 (SS representation of standard econometric models). Any linear econometric model
with fixed parameters can be represented in the SS form (2.1)-(2.2). Therefore, any numerical or
statistical procedure (e.g., forecasting) developed for the SS model can be applied to all the
particular cases. This is the basic idea implemented in E4.
For example, consider the ARMA(1,1) model: $z_{t+1} = \phi z_t + a_{t+1} - \theta a_t$. An
equivalent representation in SS form is:

    $x_{t+1} = \phi x_t + (\phi - \theta) a_t$
    $z_t = x_t + a_t$

The SS representation of a given model is not unique; e.g., the previous ARMA(1,1) model can also
be written in SS form as:

    $\begin{bmatrix} x^1_{t+1} \\ x^2_{t+1} \end{bmatrix}
     = \begin{bmatrix} \phi & 1 \\ 0 & 0 \end{bmatrix}
       \begin{bmatrix} x^1_t \\ x^2_t \end{bmatrix}
     + \begin{bmatrix} 1 \\ -\theta \end{bmatrix} a_{t+1}$

    $z_t = \begin{bmatrix} 1 & 0 \end{bmatrix}
           \begin{bmatrix} x^1_t \\ x^2_t \end{bmatrix}$
For further details about the SS representation of econometric models, see Terceiro (1990).
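The equivalence of the first SS representation and the ARMA(1,1) recursion can be checked
numerically. This sketch uses Python/NumPy as a stand-in for MATLAB, with the sign convention
$z_t = \phi z_{t-1} + a_t - \theta a_{t-1}$ and arbitrary parameter values:

```python
import numpy as np

rng = np.random.default_rng(1)
phi, theta = 0.8, 0.4          # illustrative ARMA(1,1) parameters
T = 300
a = rng.standard_normal(T)     # white-noise innovations

# Direct ARMA(1,1) recursion: z_t = phi*z_{t-1} + a_t - theta*a_{t-1}
z_arma = np.empty(T)
z_arma[0] = a[0]
for t in range(1, T):
    z_arma[t] = phi * z_arma[t - 1] + a[t] - theta * a[t - 1]

# Innovations state-space form: x_{t+1} = phi*x_t + (phi - theta)*a_t,
#                               z_t     = x_t + a_t
z_ss = np.empty(T)
x = 0.0
for t in range(T):
    z_ss[t] = x + a[t]
    x = phi * x + (phi - theta) * a[t]

print(np.allclose(z_arma, z_ss))  # True: both recursions generate the same series
```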
Example 2.2 (SS and structural time series models). The additive structural decomposition of a
time series, $z_t$, is defined by:

    $z_t = t_t + c_t + s_t + \varepsilon_t$

where:
$t_t$ is the trend component, representing the long-term behavior of the series,
$c_t$ is the transitory component, or cycle, describing short-term fluctuations,
$s_t$ is the seasonal component, associated to persistent variability patterns repeated along a
season, and
$\varepsilon_t$ is an irregular component.
A structural time series model is directly set up in terms of these components, which are
represented by SS models specified according to the properties of the time series. For example, the
following formulation describes a series in terms of a stochastic trend and quarterly dummy
seasonality:

    $\begin{bmatrix} t_{t+1} \\ \beta_{t+1} \\ s_{t+1} \\ s_t \\ s_{t-1} \end{bmatrix}
     = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 \\
                       0 & 1 & 0 & 0 & 0 \\
                       0 & 0 & -1 & -1 & -1 \\
                       0 & 0 & 1 & 0 & 0 \\
                       0 & 0 & 0 & 1 & 0 \end{bmatrix}
       \begin{bmatrix} t_t \\ \beta_t \\ s_t \\ s_{t-1} \\ s_{t-2} \end{bmatrix}
     + \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}
       \begin{bmatrix} \eta_t \\ \zeta_t \end{bmatrix}$

    $z_t = \begin{bmatrix} 1 & 0 & 1 & 0 & 0 \end{bmatrix}
           \begin{bmatrix} t_t \\ \beta_t \\ s_t \\ s_{t-1} \\ s_{t-2} \end{bmatrix}
           + \varepsilon_t$

where the error terms $\eta_t$, $\zeta_t$ and $\varepsilon_t$ are assumed to be gaussian white
noise processes, with instantaneous covariance matrix:

    $\mathrm{Cov}\begin{bmatrix} \eta_t \\ \zeta_t \\ \varepsilon_t \end{bmatrix}
     = \begin{bmatrix} \sigma^2_\eta & 0 & 0 \\
                       0 & \sigma^2_\zeta & 0 \\
                       0 & 0 & \sigma^2_\varepsilon \end{bmatrix}$

This class of models is supported by E4 through the use of the formulation (2.1)-(2.2). For more
details about these formulations, see Harvey (1989).
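A structural model of this kind can be simulated directly from its SS matrices. The following
Python/NumPy sketch (a stand-in for MATLAB, independent of E4) uses the trend-plus-quarterly-dummy
formulation above; the noise standard deviations and initial state are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(2)

# Transition for the state [t_t, beta_t, s_t, s_{t-1}, s_{t-2}]:
Phi = np.array([[1, 1,  0,  0,  0],   # trend level: t_{t+1} = t_t + beta_t (+ noise)
                [0, 1,  0,  0,  0],   # slope: beta_{t+1} = beta_t
                [0, 0, -1, -1, -1],   # quarterly dummy seasonal (+ noise)
                [0, 0,  1,  0,  0],
                [0, 0,  0,  1,  0]])
E = np.array([[1, 0],                 # eta_t enters the level
              [0, 0],
              [0, 1],                 # zeta_t enters the seasonal
              [0, 0],
              [0, 0]])
H = np.array([1.0, 0.0, 1.0, 0.0, 0.0])  # z_t = t_t + s_t + eps_t

T = 120
sig_eta, sig_zeta, sig_eps = 0.1, 0.05, 0.3   # illustrative std. deviations
x = np.array([10.0, 0.1, 1.0, -0.5, -0.5])    # arbitrary initial state
z = np.empty(T)
for t in range(T):
    z[t] = H @ x + sig_eps * rng.standard_normal()
    noise = np.array([sig_eta, sig_zeta]) * rng.standard_normal(2)
    x = Phi @ x + E @ noise

print(z.shape)  # (120,)
```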

The steady-state innovations model

A particular case of (2.1)-(2.2) is the steady-state innovations SS model, see Anderson and Moore
(1979), defined by:

    $x_{t+1} = \Phi x_t + \Gamma u_t + E \varepsilon_t$        (2.5)

    $z_t = H x_t + D u_t + \varepsilon_t$        (2.6)

Comparing (2.5)-(2.6) with (2.1)-(2.2), it is immediate to see that, in this formulation, the errors in
the state and observation equations are the same and $C = I$. The relevance of this special case lies
in two facts: a) many econometric models in SS form have the steady-state innovations structure, see
e.g. the first SS representation in Example 2.1, and b) when applied to an SS model with this
structure, the forecasting, filtering and smoothing algorithms have special convergence properties,
which allow the implementation of very efficient and stable computational procedures, see Casals,
Sotoca and Jerez (1999) and Casals, Jerez and Sotoca (2000).
Whenever suitable, E4 takes advantage of these properties.
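The convergence property underlying these procedures can be illustrated with the Riccati recursion
of the Kalman filter: for a time-invariant system the predicted-state covariance, and with it the
filter gain, settles to a steady state. This Python/NumPy sketch (not E4 code) uses arbitrary
matrices and the standard predictor form with uncorrelated state and observation noises:

```python
import numpy as np

# Illustrative time-invariant system (values arbitrary)
Phi = np.array([[0.9, 0.1], [0.0, 0.7]])
H   = np.array([[1.0, 0.0]])
Q   = 0.1 * np.eye(2)       # state-noise covariance
R   = np.array([[0.5]])     # observation-noise covariance

# Iterate the Riccati recursion of the Kalman filter; for a detectable,
# stabilizable system the predicted-state covariance P converges, and with
# it the Kalman gain -- the convergence the innovations form exploits.
P = np.eye(2)
for _ in range(500):
    K = Phi @ P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain
    P = Phi @ P @ Phi.T - K @ H @ P @ Phi.T + Q

# One further iteration leaves P (numerically) unchanged:
K = Phi @ P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
P_more = Phi @ P @ Phi.T - K @ H @ P @ Phi.T + Q
print(np.allclose(P, P_more))  # True: P has reached its steady state
```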

Simple models
Besides state-space models, E4 manages three basic formulations, which are known as simple
models. These are structural econometric models, VARMAX models and single-output transfer
functions. The last two formulations can also be combined with a multivariate GARCH model for
the conditional variances of the errors.

The structural econometric model


A structural econometric model can be formulated as:

    $FR(B)\, FS(B^S)\, y_t = G(B)\, u_t + AR(B)\, AS(B^S)\, \varepsilon_t$        (2.7)

where $S$ denotes the length of the seasonal period, $B$ is the backshift operator, such that for
any sequence $x_t$, $B^k x_t = x_{t-k}$, $y_t$ is an (m × 1) vector of endogenous variables,
$u_t$ is an (r × 1) vector of exogenous variables, $\varepsilon_t$ is an (m × 1) vector of white
noise errors, and:

    $FR(B) = FR_0 + FR_1 B + \ldots + FR_p B^p$
    $FS(B^S) = FS_0 + FS_1 B^S + \ldots + FS_P B^{PS}$
    $G(B) = G_0 + G_1 B + \ldots + G_g B^g$
    $AR(B) = AR_0 + AR_1 B + \ldots + AR_q B^q$
    $AS(B^S) = AS_0 + AS_1 B^S + \ldots + AS_Q B^{QS}$

The characteristic feature of this formulation is that it allows for a contemporaneous relationship
between the endogenous variables, given by the matrices $FR_0$ and $FS_0$. To normalize the model,
the elements in their main diagonals should be equal to one. This formulation includes, as particular
cases, the linear regression model and the simultaneous equations model.


The VARMAX model


The VARMAX model is defined by:

    $FR(B)\, FS(B^S)\, y_t = G(B)\, u_t + AR(B)\, AS(B^S)\, \varepsilon_t$        (2.8)

where $y_t$, $u_t$ and $\varepsilon_t$ are defined as in (2.7) and:

    $FR(B) = I + FR_1 B + \ldots + FR_p B^p$
    $FS(B^S) = I + FS_1 B^S + \ldots + FS_P B^{PS}$
    $G(B) = G_0 + G_1 B + \ldots + G_n B^n$
    $AR(B) = I + AR_1 B + \ldots + AR_q B^q$
    $AS(B^S) = I + AS_1 B^S + \ldots + AS_Q B^{QS}$
Important and frequent particular cases of this formulation are the univariate ARMA and ARMAX
models and the VARMA model.

The transfer function model


The third basic specification is the single-output transfer function model, which can be formulated
as:

    $y_t = \frac{\omega_1(B)}{\delta_1(B)} u_{1t} + \ldots
           + \frac{\omega_r(B)}{\delta_r(B)} u_{rt}
           + \frac{\theta(B)\, \Theta(B^S)}{\phi(B)\, \Phi(B^S)}\, \varepsilon_t$        (2.9)

where:
$y_t$ is the value of the endogenous variable at time t,
$u_t = [\, u_{1t}, \ldots, u_{rt} \,]^T$ is an (r × 1) vector of exogenous variables,
$\varepsilon_t$ is a white noise error, and:

    $\omega_i(B) = \omega_{i0} + \omega_{i1} B + \omega_{i2} B^2 + \ldots
                   + \omega_{i n_i} B^{n_i}; \quad i = 1, 2, \ldots, r$
    $\delta_i(B) = 1 + \delta_{i1} B + \ldots + \delta_{i d_i} B^{d_i}; \quad i = 1, 2, \ldots, r$
    $\phi(B) = 1 + \phi_1 B + \ldots + \phi_p B^p$
    $\Phi(B^S) = 1 + \Phi_1 B^S + \ldots + \Phi_P B^{PS}$
    $\theta(B) = 1 + \theta_1 B + \ldots + \theta_q B^q$
    $\Theta(B^S) = 1 + \Theta_1 B^S + \ldots + \Theta_Q B^{QS}$
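A one-input case of (2.9) can be simulated by writing the rational lag $\omega(B)/\delta(B)$ as a
recursion. The sketch below is Python/NumPy (not E4 code); the polynomial orders and coefficients
are arbitrary, and a plain white-noise error stands in for the full ARMA noise factor:

```python
import numpy as np

rng = np.random.default_rng(3)
T = 250
u = rng.standard_normal(T)          # single exogenous input u_t
eps = 0.2 * rng.standard_normal(T)  # white-noise error (no ARMA structure here)

# One input with omega(B) = w0 + w1*B and delta(B) = 1 + d1*B
# (illustrative low-order polynomials):
w0, w1, d1 = 1.0, 0.5, -0.6

# f_t = [omega(B)/delta(B)] u_t, i.e. f_t = -d1*f_{t-1} + w0*u_t + w1*u_{t-1}
f = np.empty(T)
f[0] = w0 * u[0]
for t in range(1, T):
    f[t] = -d1 * f[t - 1] + w0 * u[t] + w1 * u[t - 1]

y = f + eps                         # add the (here white) noise term
print(y.shape)  # (250,)
```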


Composite models
In the context of E4, a model that combines the formulation of several models is called a composite
model. E4 allows for two types of model composition: nested models and component models.
A model is said to be nested if it is constructed by a combination of multiplicative factors. For
example, the famous airline model can be viewed as the result of nesting a regular IMA(1,1) process
with a seasonal IMA(1,1) process.
On the other hand, a component model is obtained by the addition of several dynamic structures. For
example, one could define the model for a trend-cycle decomposition by adding an ARIMA model
for the trend and an ARIMA model for the cycle.
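The airline-model example of multiplicative nesting amounts to multiplying polynomial factors,
which can be sketched with `np.convolve` (Python/NumPy as a stand-in for MATLAB; the MA
parameter values are arbitrary illustrations):

```python
import numpy as np

# Regular IMA(1,1):            (1 - B) z_t = (1 - theta*B) a_t
# Seasonal IMA(1,1), period 4: (1 - B^4) z_t = (1 - Theta*B^4) a_t
theta, Theta = 0.4, 0.6   # illustrative MA parameters

ar_reg = np.array([1.0, -1.0])                 # (1 - B)
ar_sea = np.array([1.0, 0.0, 0.0, 0.0, -1.0])  # (1 - B^4)
ma_reg = np.array([1.0, -theta])               # (1 - theta*B)
ma_sea = np.array([1.0, 0.0, 0.0, 0.0, -Theta])

# Nesting multiplies the factors; polynomial multiplication is convolution:
ar_nested = np.convolve(ar_reg, ar_sea)   # (1 - B)(1 - B^4)
ma_nested = np.convolve(ma_reg, ma_sea)   # (1 - theta*B)(1 - Theta*B^4)

print(ar_nested.tolist())  # [1.0, -1.0, 0.0, 0.0, -1.0, 1.0]
print(np.allclose(ma_nested, [1, -theta, 0, 0, -Theta, theta * Theta]))  # True
```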

Nested models
E4 allows for two types of model nesting: nesting in inputs and nesting in noise.

Nesting in inputs

A model is said to be nested in inputs if some exogenous variable is substituted by a model
describing its dynamic and stochastic structure. After doing so, the exogenous variables become
endogenous and, therefore, we can also describe this operation as endogeneization. The following
example illustrates this operation in a general SS framework.
Example 2.3 (Endogenization). Assume that the model for a vector of endogenous variables, $y_t$, is:

$$x^a_{t+1} = \Phi_a x^a_t + \Gamma_a u_t + E_a w^a_t \qquad (2.10)$$
$$y_t = H_a x^a_t + D_a u_t + C_a v^a_t \qquad (2.11)$$

whereas the exogenous variables, $u_t$, follow the model:

$$x^b_{t+1} = \Phi_b x^b_t + E_b w^b_t \qquad (2.12)$$
$$u_t = H_b x^b_t + C_b v^b_t \qquad (2.13)$$

where the errors in (2.10)-(2.11) and (2.12)-(2.13) are mutually independent. Substituting (2.13) into (2.10)-(2.11) yields:

$$x^a_{t+1} = \Phi_a x^a_t + \Gamma_a (H_b x^b_t + C_b v^b_t) + E_a w^a_t \qquad (2.14)$$
$$y_t = H_a x^a_t + D_a (H_b x^b_t + C_b v^b_t) + C_a v^a_t \qquad (2.15)$$

The Eqs. (2.14) and (2.12) can easily be combined into a single state equation:

$$\begin{bmatrix} x^a_{t+1} \\ x^b_{t+1} \end{bmatrix} = \begin{bmatrix} \Phi_a & \Gamma_a H_b \\ 0 & \Phi_b \end{bmatrix}\begin{bmatrix} x^a_t \\ x^b_t \end{bmatrix} + \begin{bmatrix} E_a & \Gamma_a C_b & 0 \\ 0 & 0 & E_b \end{bmatrix}\begin{bmatrix} w^a_t \\ v^b_t \\ w^b_t \end{bmatrix} \qquad (2.16)$$

and Eqs. (2.15) and (2.13) can be combined into a single observation equation:

$$\begin{bmatrix} y_t \\ u_t \end{bmatrix} = \begin{bmatrix} H_a & D_a H_b \\ 0 & H_b \end{bmatrix}\begin{bmatrix} x^a_t \\ x^b_t \end{bmatrix} + \begin{bmatrix} C_a & D_a C_b \\ 0 & C_b \end{bmatrix}\begin{bmatrix} v^a_t \\ v^b_t \end{bmatrix} \qquad (2.17)$$

Nesting in noise
Nesting in noise consists of defining the noise structure of an econometric model as a multiplicative combination of several dynamic factors. The only requirement for these factors is that their equivalent SS representation should be an innovations model, see (2.5)-(2.6). This is not a severe restriction, as most simple models supported by E4 satisfy it. Two relevant applications of this type of nesting are: a) separating unit roots from stationary or invertible factors, and b) representing a time series with multiple seasonal cycles.

Component models
A component model is defined as the sum of several components, each of which has a (simple or composite) model describing its particular dynamic and stochastic features. Two important cases of component models are Structural Time Series Models (STSM), see Harvey (1989), and models with observation errors, see Terceiro (1990).
Example 2.4 (Observation errors). Assume that the vector $y_t$ in model (2.10)-(2.11) is such that:

$$\bar{y}_t = y_t + v^y_t \qquad (2.18)$$

where $\bar{y}_t$ is observable and $v^y_t$ is a white noise observation error, independent from the disturbances in (2.10)-(2.11). Combining (2.18) with (2.11) yields the new observation equation:

$$\bar{y}_t = H_a x^a_t + D_a u_t + C_a v^a_t + v^y_t \qquad (2.19)$$

which relates the states in (2.10) with the observable variables.

Models with GARCH errors


Most econometric models assume that errors have constant conditional variances. To generalize this
assumption, Engle (1982) introduced a class of stochastic processes with time-varying conditional
variances. For a comprehensive survey, see Bollerslev et al. (1994).
E4 allows one to combine any VARMAX model, transfer function or SS model in the steady-state innovations form (2.5)-(2.6) with a vector GARCH process for the error $\varepsilon_t$. This process is defined by the unconditional moments $E[\varepsilon_t] = 0$, $V[\varepsilon_t] = \Sigma_\varepsilon$, and the conditional moments $E_{t-1}[\varepsilon_t] = 0$, $E_{t-1}[\varepsilon_t\varepsilon_t^T] = \Sigma_t$, where $E_{t-1}[\,\cdot\,]$ denotes the expectation of the argument conditional on the information available at $t-1$. In a GARCH model, the conditional variances, $\Sigma_t$, are such that:

$$[\,I - \beta(B)\,]\,\mathrm{vech}(\Sigma_t) = \mathrm{vech}(\omega) + \alpha(B)\,\mathrm{vech}(\varepsilon_t\varepsilon_t^T) \qquad (2.20)$$

where $\mathrm{vech}(\,\cdot\,)$ stands for the vector-half operator, which stacks the lower triangle of an $N \times N$ matrix as an $[\,N(N+1)/2\,] \times 1$ vector; and the polynomial matrices are given by:

$$\alpha(B) = \sum_{i=1}^{s_\varepsilon} \alpha_i B^i\,, \qquad \beta(B) = \sum_{i=1}^{p_\varepsilon} \beta_i B^i$$
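For concreteness, with $N = 2$ the vech operator defined above stacks the lower triangle columnwise:

$$\mathrm{vech}\begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{bmatrix} = \begin{bmatrix} \sigma_{11} \\ \sigma_{21} \\ \sigma_{22} \end{bmatrix}$$

a $3 \times 1$ vector, in agreement with $N(N+1)/2 = 3$.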

E4 manages model (2.20) in its alternative VARMAX representation. To derive it, consider the process $v_t = \mathrm{vech}(\varepsilon_t\varepsilon_t^T) - \mathrm{vech}(\Sigma_t)$, such that $E_{t-1}[v_t] = 0$. Then $\mathrm{vech}(\Sigma_t) = \mathrm{vech}(\varepsilon_t\varepsilon_t^T) - v_t$. Substituting this expression in (2.20) and rearranging some terms yields:

$$[\,I - \alpha(B) - \beta(B)\,]\,\mathrm{vech}(\varepsilon_t\varepsilon_t^T) = \mathrm{vech}(\omega) + [\,I - \beta(B)\,]\,v_t \qquad (2.21)$$

Eq. (2.21) defines a VARMAX model for $\mathrm{vech}(\varepsilon_t\varepsilon_t^T)$, which can be written in compact form as:

$$\mathrm{vech}(\varepsilon_t\varepsilon_t^T) = \mathrm{vech}(\Sigma_\varepsilon) + N_t \qquad (2.22)$$

$$[\,I + \bar{F}(B)\,]\,N_t = [\,I + \bar{A}(B)\,]\,v_t \qquad (2.23)$$

where $\bar{F}(B) = -[\,\alpha(B) + \beta(B)\,]$ and $\bar{A}(B) = -\beta(B)$. This is the formulation supported in E4.
Note that:


1) If the VAR polynomial in (2.23) has roots on the unit circle, then the process has some IGARCH (Integrated GARCH) components.
2) The formulation (2.22)-(2.23) does not assure that the eigenvalues of $\Sigma_t$ are non-negative for all $t$.

Example 2.5 (Formulation of a GARCH(1,1) process as an ARMA(1,1)). Consider the process $\varepsilon_t \sim \mathrm{IID}\ N(0, \sigma^2_\varepsilon)$, $\varepsilon_t \mid \Omega_{t-1} \sim \mathrm{IID}\ N(0, h^2_t)$, such that the conditional variance, $h^2_t$, follows a GARCH(1,1) equation:

$$h^2_t = \omega + \alpha_1\,\varepsilon^2_{t-1} + \beta_1\,h^2_{t-1}$$

Defining $v_t \equiv \varepsilon^2_t - h^2_t$, it follows that the previous equation can be rewritten as:

$$(1 - \alpha_1 B - \beta_1 B)\,\varepsilon^2_t = \omega + (1 - \beta_1 B)\,v_t$$

which is analogous to (2.21), or:

$$\varepsilon^2_t = \sigma^2_\varepsilon + N_t\,; \qquad [\,1 - (\alpha_1 + \beta_1)B\,]\,N_t = (1 - \beta_1 B)\,v_t$$

which is analogous to (2.22)-(2.23). An IGARCH model can be formulated by imposing $\alpha_1 + \beta_1 = 1$.
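Expanding the ARMA form above confirms the equivalence with the original GARCH(1,1) equation:

$$\varepsilon^2_t = \omega + (\alpha_1 + \beta_1)\,\varepsilon^2_{t-1} + v_t - \beta_1\,v_{t-1} = \omega + \alpha_1\,\varepsilon^2_{t-1} + \beta_1\,(\varepsilon^2_{t-1} - v_{t-1}) + v_t$$

and, since $\varepsilon^2_{t-1} - v_{t-1} = h^2_{t-1}$ and $\varepsilon^2_t - v_t = h^2_t$, this is precisely $h^2_t = \omega + \alpha_1\,\varepsilon^2_{t-1} + \beta_1\,h^2_{t-1}$.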

Modeling options supported


The following table summarizes the different options supported by E4, for all the simple
formulations defined in this section:
Support for:

  Simple models                             Missing    Stationarity/       GARCH
                                            data       Nonstationarity     errors
  -------------------------------------------------------------------------------
  VARMAX                                    YES        YES                 YES
  Structural econometric                    YES        YES                 NO
  Single-output transfer functions          YES        YES                 YES
  State-Space (including structural
  time series models)                       YES        YES                 YES (*)

  (*) Only for models in steady-state innovations form.

Besides these formulations, E4 supports the definition of user models. This feature allows the user to:
1) define any econometric model not supported directly by E4, provided that it has an equivalent SS representation,
2) impose general nonlinear equality constraints on the parameters of the models supported, and
3) formulate a nonstandard parametrization of the models supported.
Chapter 6 describes this option in detail.


3 Defining models
E4 internally uses the SS representation (2.1)-(2.2) for most computations. However, its basic representation is the THD format. This Chapter describes the functions that translate any of the models described in Chapter 2 into THD format and, conversely, those that translate a THD representation into the standard notation.
The first Section introduces the THD format and the general rules about the definition of parameter matrices. The second, third and fourth Sections describe the functions that translate the matrices of a SS model, a simple model or a composite model, respectively, into THD format. The definition of models with conditional heteroscedastic errors is discussed in the fifth Section. Finally, the sixth and seventh Sections explain how to translate a model in THD format into a SS or a conventional representation.

General ideas about model definition

The THD format


The basic format for model representation in E4 is called the THD format. Any THD specification is composed of two matrices, theta and din, which contain, respectively, the values of the model parameters and a description of its dynamic structure. Besides theta and din, a model can be documented by an optional character matrix lab, which contains names for the parameters in theta.
The first column of the matrix theta contains the values of all the parameters in a model. Optionally, it may have a second column whose values are either zero, to indicate that the corresponding parameter in the first column is free, or one, when the parameter is constrained to remain at its present value.
While some analyses require the modification of theta and lab, most users will never need to handle the matrix din. In any case, Appendix C contains a detailed description of the THD format.
In E4, defining a model consists of creating its parameter matrices by means of MATLAB commands and feeding these matrices to an E4 interface function, which generates theta, din and lab. The use of these interface functions is reviewed in the rest of this Chapter and discussed in detail in Chapter 8.
After defining a model in THD format, its structure can be displayed by the function prtmod, which
has the syntax:
prtmod(theta, din, lab);

General rules about parameter matrices


When creating the parameter matrices, one should take into account the following general rules:
1) The value NaN marks those positions where the corresponding parameter is constrained to zero.
2) When all the parameters in a matrix are null, it should be defined as empty ([]).
3) A matrix of covariances can be defined as a vector. In this case, it is assumed to be diagonal and
the elements in the main diagonal are automatically set to the values in the vector. In order not to
impose this constraint, it is necessary to define at least its lower triangle.
4) A covariance matrix cannot contain the value NaN. To impose independence between two
specific errors, the user should specify a zero value in the first column of theta and, afterwards,
add a fixed-parameter constraint in the second column.
Example 3.1 (Defining matrices of parameters).
The following MATLAB commands initialize five matrices, which could be used to define a model
in THD format:
A=[-.8 NaN;.1 0];
Sigma1=[1 .1; .1 2];
Sigma2=[1 2; .1 2];
Sigma3=[1;2];
Sigma4=[1 NaN;NaN 2];

The first command generates the matrix:

$$A = \begin{bmatrix} -.8 & 0 \\ .1 & 0 \end{bmatrix}$$

where the parameter in position (1,2) is constrained to remain at its present null value because it has been defined using NaN. The zero element in position (2,2) defines a parameter with a null starting value, which will be allowed to change during the estimation process.


The second and third commands define different matrices but, when interpreted as covariance matrices by an E4 function, they are equivalent:

$$\Sigma_1 = \Sigma_2 = \begin{bmatrix} 1 & .1 \\ .1 & 2 \end{bmatrix}$$

because the upper triangle is in fact ignored. The fourth command defines the covariance matrix:

$$\Sigma_3 = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}$$

where the off-diagonal elements are constrained to remain at their present null values. Finally, the last command is a valid MATLAB statement but, when interpreted as a covariance matrix by an E4 function, it will generate an error message due to the presence of NaN.

Defining state-space and structural time series models


The function ss2thd obtains the representation of the SS model (2.1)-(2.2) in THD format. Its
syntax is:
[theta, din, lab] = ss2thd(Phi, Gam, E, H, D, C, Q, S, R);

where the input arguments correspond to the parameter matrices in the standard representation (2.1)-(2.2). See Chapter 8.
Example 3.2 (Defining an error correction model). Consider the following model in error correction form:

$$y_t = -.5\,w_t + .7\,y_{t-1} + .4\,u_t + a_t - .8\,a_{t-1}\,; \qquad \sigma^2_a = .1$$

$$w_t = y_t - .3\,u_t$$

which does not correspond to any of the formulations defined in Chapter 2. Its SS formulation is:

$$\begin{bmatrix} x_{t+1} \\ w_{t+1} \end{bmatrix} = \begin{bmatrix} .7 & -.35 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} x_t \\ w_t \end{bmatrix} + \begin{bmatrix} .28 & 0 & 0 \\ 0 & -.3 & 1 \end{bmatrix}\begin{bmatrix} u_t \\ u_{t+1} \\ y_{t+1} \end{bmatrix} + \begin{bmatrix} -.1 \\ 0 \end{bmatrix} a_t$$

$$y_t = \begin{bmatrix} 1 & -.5 \end{bmatrix}\begin{bmatrix} x_t \\ w_t \end{bmatrix} + \begin{bmatrix} .4 & 0 & 0 \end{bmatrix}\begin{bmatrix} u_t \\ u_{t+1} \\ y_{t+1} \end{bmatrix} + a_t$$

and the corresponding THD representation can be obtained with the following code:
Phi=[.7, -.35; NaN, NaN];
Gamma=[.28, NaN, NaN; NaN, -.3, 1];
E=[-.1; NaN];
H=[1, -.5];
D=[.4, NaN, NaN];
[theta, din, lab] = ss2thd(Phi, Gamma, E, H, D, [1], [.1], [.1], [.1]);
prtmod(theta, din, lab);

and the output of prtmod is:


*************************** Model ***************************
Native SS model
1 endogenous v., 3 exogenous v.
Seasonality: 1
SS vector dimension: 2
Parameters (* denotes constrained parameter):
PHI(1,1)          0.7000
PHI(1,2)         -0.3500
GAMMA(1,1)        0.2800
GAMMA(2,2)       -0.3000
GAMMA(2,3)        1.0000
E(1,1)           -0.1000
H(1,1)            1.0000
H(1,2)           -0.5000
D(1,1)            0.4000
C(1,1)            1.0000
Q(1,1)            0.1000
S(1,1)            0.1000
R(1,1)            0.1000
*************************************************************

Note that the elements of the matrices in the SS representation are nonlinear functions of the
parameters in the original formulation. Obtaining estimates for the original parameters would
require a user model, see Chapter 6.
Example 3.3 (Defining structural time series models). Consider the decomposition of a time series, $y_t$, into a trend, $T_t$, and an irregular component, $\varepsilon_t$, such that:

$$y_t = T_t + \varepsilon_t\,; \qquad T_t = T_{t-1} + \beta_{t-1}\,; \qquad \beta_t = \beta_{t-1} + \zeta_{t-1}$$

where $\beta_t$ is the change of the trend at time $t$, and the errors $\varepsilon_t$ and $\zeta_t$ are independent white noise processes with variances $\sigma^2_\varepsilon = 100$ and $\sigma^2_\zeta = .001$, respectively. To define this model with E4 it is first necessary to obtain its SS representation:

$$\begin{bmatrix} T_{t+1} \\ \beta_{t+1} \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} T_t \\ \beta_t \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix}\zeta_t$$

$$y_t = \begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} T_t \\ \beta_t \end{bmatrix} + \varepsilon_t$$
and then it can be defined and listed with the commands:


[theta,din,lab] = ss2thd([1 1; 0 1],[],[0;1], ...
[1 0],[],[1],[.001],[0],[100]);
prtmod(theta, din, lab);

which generate the following output:


*************************** Model ***************************
Native SS model
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 2
Parameters (* denotes constrained parameter):
PHI(1,1)          1.0000
PHI(2,1)          0.0000
PHI(1,2)          1.0000
PHI(2,2)          1.0000
E(1,1)            0.0000
E(2,1)            1.0000
H(1,1)            1.0000
H(1,2)            0.0000
C(1,1)            1.0000
Q(1,1)            0.0010
S(1,1)            0.0000
R(1,1)          100.0000
*************************************************************

If the model is to be estimated, all the parameters except the variances should keep their present values. These fixed-value constraints can be imposed with the additional commands:
theta=[theta ones(12,1)]; theta(10,2)=0; theta(12,2)=0;
prtmod(theta,din,lab);

and the resulting output is:


*************************** Model ***************************
Native SS model
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 2
Parameters (* denotes constrained parameter):
PHI(1,1)    *     1.0000
PHI(2,1)    *     0.0000
PHI(1,2)    *     1.0000
PHI(2,2)    *     1.0000
E(1,1)      *     0.0000
E(2,1)      *     1.0000
H(1,1)      *     1.0000
H(1,2)      *     0.0000
C(1,1)      *     1.0000
Q(1,1)            0.0010
S(1,1)      *     0.0000
R(1,1)          100.0000
*************************************************************

where the parameters constrained to their initial values are marked with an asterisk.

Defining simple models


The functions str2thd, arma2thd and tf2thd obtain the THD specification for structural
econometric models, VARMAX models and transfer functions, respectively. Their synopses are:
[theta,din,lab] = str2thd([FR0 ... FRp],[FS0 ... FSps], ...
[AR0 ... ARq], [AS0 ... ASqs],v,s,[G0 ... Gg],r)
[theta,din,lab] = arma2thd([FR1 ... FRp],[FS1 ... FSps], ...
[AR1 ... ARq],[AS1 ... ASqs],v,s,[G0 ... Gn],r)
[theta,din,lab] = tf2thd([fr1 ... frp], [fs1 ... fsps], ...
[ar1 ... arq],[as1 ... asqs],v,s,[w1; ...; wr],[d1; ...; dr])

See Chapter 8 for a detailed description of the input arguments.


Example 3.4 (Defining structural econometric models). Consider the model:

$$\left(\begin{bmatrix} 1 & -.3 \\ -.7 & 1 \end{bmatrix} + \begin{bmatrix} -.4 & 0 \\ .5 & 0 \end{bmatrix} B\right)\begin{bmatrix} y_{1t} \\ y_{2t} \end{bmatrix} = \begin{bmatrix} .3 & 0 \\ 0 & .5 \end{bmatrix}\begin{bmatrix} u_{1t} \\ u_{2t} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix}\,, \qquad V\begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & .9 \end{bmatrix}$$

The first step to obtain its definition in THD format consists of defining the input arguments to str2thd. This can be done with the following MATLAB commands:
FR0 = [ 1 -.3; -.7 1 ];
FR1 = [-.4 NaN; .5 NaN];
G0  = [ .3 NaN; NaN .5];
v   = [1.0 .9];

and afterwards, the THD form is obtained and displayed with:


[theta, din, lab] = str2thd([FR0 FR1],[],[],[],v,1,[G0],2);
prtmod(theta, din, lab);

These commands generate the following output:


*************************** Model ***************************
Structural model (innovations model)
2 endogenous v., 2 exogenous v.
Seasonality: 1
SS vector dimension: 2
Parameters (* denotes constrained parameter):
FR0(1,1)          1.0000
FR0(2,1)         -0.7000
FR0(1,2)         -0.3000
FR0(2,2)          1.0000
FR1(1,1)         -0.4000
FR1(2,1)          0.5000
G0(1,1)           0.3000
G0(2,2)           0.5000
V(1,1)            1.0000
V(2,2)            0.9000
*************************************************************

Example 3.5 (Defining VARMAX models). The model:

$$\left(\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} -.8 & 0 \\ -.5 & -.7 \end{bmatrix}B + \begin{bmatrix} -.45 & 0 \\ 0 & 0 \end{bmatrix}B^2\right)\left(\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} -.2 & 0 \\ 0 & -.3 \end{bmatrix}B^4\right)y_t = \left(\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} -.4 & 0 \\ 0 & -.3 \end{bmatrix}B\right)a_t$$

$$V(a_t) = \begin{bmatrix} .9 & 0 \\ 0 & .9 \end{bmatrix}$$

can be defined in THD format and displayed with the following MATLAB commands:
FR1 = [-.8 NaN; -.5 -.7];
FR2 = [-.45 NaN; NaN NaN];
FS1 = [-.2 NaN; NaN -.3];
AR1 = [-.4 NaN; NaN -.3];
v   = [.9 .9];
[theta, din, lab] = arma2thd([FR1 FR2],[FS1],[AR1],[],v,4);
prtmod(theta, din, lab);

Note that an empty matrix, [], should appear wherever the model does not include the corresponding structure; in this case, there is no seasonal moving average factor. Finally, the output displayed by prtmod is the following:
*************************** Model ***************************
VARMAX model (innovations model)
2 endogenous v., 0 exogenous v.
Seasonality: 4
SS vector dimension: 12
Parameters (* denotes constrained parameter):
FR1(1,1)         -0.8000
FR1(2,1)         -0.5000
FR1(2,2)         -0.7000
FR2(1,1)         -0.4500
FS1(1,1)         -0.2000
FS1(2,2)         -0.3000
AR1(1,1)         -0.4000
AR1(2,2)         -0.3000
V(1,1)            0.9000
V(2,2)            0.9000
*************************************************************


Example 3.6 (Defining transfer functions). Given the transfer function:

$$y_t = \frac{.3 + .6B}{1 - .5B}\,u_{1t} + (.3B + .4B^2)\,u_{2t} + \frac{.3B}{1 - .1B - .2B^2}\,u_{3t} + \frac{1 - .8B}{1 - .6B}\,\varepsilon_t\,; \qquad \sigma^2_\varepsilon = 1$$

its definition in THD format is obtained as follows:


w1 = [ .3 .6 NaN]; d1 = [-.5 NaN];
w2 = [NaN .3 .4 ]; d2 = [NaN NaN];
w3 = [NaN .3 NaN]; d3 = [-.1 -.2];
fr = [-.6]; ar = [-.8];
v = [1.0];
[theta,din,lab]=tf2thd(fr,[],ar,[],v,1,[w1;w2;w3],[d1;d2;d3]);
prtmod(theta,din,lab);

and the corresponding prtmod output is:


*************************** Model ***************************
Transfer function model (innovations model)
1 endogenous v., 3 exogenous v.
Seasonality: 1
SS vector dimension: 5
Parameters (* denotes constrained parameter):
FR(1,1)          -0.6000
AR(1,1)          -0.8000
W1(1,1)           0.3000
W1(2,1)           0.6000
W2(2,1)           0.3000
W2(3,1)           0.4000
W3(2,1)           0.3000
D1(1,1)          -0.5000
D3(1,1)          -0.1000
D3(2,1)          -0.2000
V(1,1)            1.0000
*************************************************************

Defining composite models

Defining nested models in inputs


To obtain the THD formulation of a model nested in inputs in E4, one should follow a three-step procedure:

Step 1) Obtain the THD formulation of the models for the endogenous and exogenous variables, using functions such as arma2thd and tf2thd.

Step 2) Feed the THD forms obtained to the function stackthd, which arranges the individual THD formats into a single (stacked) THD description. The syntax is:

[theta, din, lab] = stackthd(t1, d1, t2, d2, l1, l2);

where the input arguments t1, d1, l1 and t2, d2, l2 are the THD forms obtained in Step 1). If necessary, the output arguments can be fed again to stackthd, to continue the stacking process recursively.

Step 3) Translate the stacked model to the final nested formulation using the function nest2thd, whose syntax is:

[theta, din, lab] = nest2thd(theta, din, nestwhat, lab);

where the input arguments theta, din, lab are the final results of Step 2), and nestwhat is a binary flag that chooses between nesting in inputs (if nestwhat is equal to one) or in errors (if nestwhat is equal to zero).
Example 3.7 (Endogenization of the exogenous variable in a transfer function). Given the transfer function:

$$y_t = \frac{.3 + .6B}{1 - .5B}\,u_{1t} + \frac{1 - .8B}{1 - .6B}\,\varepsilon_t\,; \qquad \sigma^2_\varepsilon = 1$$

where $u_{1t}$ is such that $(1 - .7B)\,u_{1t} = a_t\,;\ \sigma^2_a = .3$. In a standard analytic framework, obtaining forecasts for $y_t$ requires first forecasting the exogenous variable and afterwards feeding these forecasts to the model. An endogenized model is an effective way to: a) perform both steps as a single operation, and b) take into account the uncertainty affecting the forecasts for the input, which is often ignored. The code required to follow Steps 1) to 3) is:
Step 1) Obtain the THD representation for both models:
w1 = [ .3 .6]; d1 = [-.5];
fr = [-.6];
ar = [-.8];
v = [1.0];
[t1,d1,l1]=tf2thd(fr,[],ar,[],v,1,[w1],[d1]);
[t2,d2,l2]=arma2thd(-.7,[],[],[],.3,1);

Step 2) Combine the model for the endogenous variable with the model for the input into a single
stacked THD representation:
[theta, din, lab] = stackthd(t1, d1, t2, d2, l1, l2);

Step 3) Translate the stacked model to the final nested formulation and display a description of its
structure:


[theta, din, lab] = nest2thd(theta, din, 1, lab);


prtmod(theta,din,lab);

The output from prtmod is:


*************************** Model ***************************
Nested model in inputs (innovations model)
2 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 3
Submodels:
{
Transfer function model (innovations model)
1 endogenous v., 1 exogenous v.
Seasonality: 1
SS vector dimension: 2
Parameters (* denotes constrained parameter):
FR(1,1)          -0.6000
AR(1,1)          -0.8000
W1(1,1)           0.3000
W1(2,1)           0.6000
D1(1,1)          -0.5000
V(1,1)            1.0000
--------------
VARMAX model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 1
Parameters (* denotes constrained parameter):
FR1(1,1)         -0.7000
V(1,1)            0.3000
--------------
}
*************************************************************

Note that the resulting model has two endogenous variables and no exogenous variable.
Example 3.8 (Unit roots). Assume the following ARIMA(1,1,0) model:

$$(1 - .7B)\,\nabla z_t = a_t\,; \qquad \sigma^2_a = .2$$

where $\nabla \equiv 1 - B$. The standard procedure to deal with nonstationary processes like this one consists of eliminating the unit root by differencing the time series. For some applications, e.g., forecasting the level of the time series, $z_t$, or interpolating missing values, it is more convenient to work directly with the factored AR(2) model:

$$(1 - .7B)(1 - B)\,z_t = a_t$$
The THD format of this model is obtained and displayed with the following code:
[t1, d1, l1] = arma2thd([-1], [], [], [], 1, 1);
[t2, d2, l2] = arma2thd([-.7], [], [], [], .2, 1);
[ts, ds, ls] = stackthd(t1, d1, t2, d2, l1, l2);
[tn, dn, ln] = nest2thd(ts, ds, 0, ls);
prtmod(tn, dn, ln);


and the corresponding prtmod output is:


*************************** Model ***************************
Nested model in errors (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 2
Submodels:
{
VARMAX model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 1
Parameters (* denotes constrained parameter):
FR1(1,1)         -1.0000
--------------
VARMAX model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 1
Parameters (* denotes constrained parameter):
FR1(1,1)         -0.7000
V(1,1)            0.2000
--------------
}
*************************************************************

where the only error variance relevant for nest2thd is that of the nested model.
Example 3.9 (Multiple seasonal factors). Consider a time series observed once every ten days, $z_t$, with the following structure:

$$\nabla_{36}\,\nabla_9\,\nabla_3\,\nabla\,z_t = (1 - .6B^{36})(1 - .7B^9)(1 - .8B^3)(1 - .9B)\,a_t\,, \qquad \sigma^2_a = .2$$

where $\nabla_{36} \equiv (1 - B^{36})$, $\nabla_9 \equiv (1 - B^9)$ and $\nabla_3 \equiv (1 - B^3)$.
In this example, the THD representation can be obtained with:


[t1, d1, l1] = arma2thd([], [-1], [], [-.6], 1, 36);
[t2, d2, l2] = arma2thd([], [-1], [], [-.7], 1, 9);
[t3, d3, l3] = arma2thd([-1], [-1], [-.9], [-.8], .2, 3);
[ts1, ds1, ls1] = stackthd(t1, d1, t2, d2, l1, l2);
[ts2, ds2, ls2] = stackthd(ts1, ds1, t3, d3, ls1, l3);
[tn, dn, ln] = nest2thd(ts2, ds2, 0, ls2);
prtmod(tn, dn, ln);

and the corresponding output is:


*************************** Model ***************************
Nested model in errors (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 36
SS vector dimension: 49
Submodels:
{
VARMAX model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 36
SS vector dimension: 36
Parameters (* denotes constrained parameter):
FS1(1,1)         -1.0000
AS1(1,1)         -0.6000
--------------
VARMAX model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 9
SS vector dimension: 9
Parameters (* denotes constrained parameter):
FS1(1,1)         -1.0000
AS1(1,1)         -0.7000
--------------
VARMAX model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 3
SS vector dimension: 4
Parameters (* denotes constrained parameter):
FR1(1,1)         -1.0000
FS1(1,1)         -1.0000
AR1(1,1)         -0.9000
AS1(1,1)         -0.8000
V(1,1)            0.2000
--------------
}
*************************************************************

Defining component models


A component model is defined in E4 following a three-step procedure, very similar to that described for nested models. In fact, Steps 1) and 2) are identical. Step 3) is similar, replacing the call to nest2thd with a similar call to comp2thd. The syntax of this function is:
[theta, din, lab] = comp2thd(ts, ds, ls);

where the input arguments ts, ds, ls are the stacked THD format of the models to be composed.
Example 3.10 (Definition of an AR(2) model with observation errors). Assume that $y_t$ evolves according to the following model:

$$(1 - .5B - .7B^2)\,y_t = a_t\,; \qquad \sigma^2_a = 1.0$$

$$\bar{y}_t = y_t + v^y_t\,; \qquad \sigma^2_{v^y} = 1.0$$

where $\bar{y}_t$ is the observed series and the errors $a_t$ and $v^y_t$ are mutually independent white noise processes. The corresponding THD format and the resulting output are:
[t1, d1, l1] = arma2thd([-.5 -.7], [], [], [], 1, 1);
[t2, d2, l2] = arma2thd([], [], [], [], 1, 1);
[ts, ds, ls] = stackthd(t1, d1, t2, d2, l1, l2);
[theta, din, lab] = comp2thd(ts, ds, ls);
prtmod(theta, din, lab);


*************************** Model ***************************
Components model
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 2
Submodels:
{
VARMAX model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 2
Parameters (* denotes constrained parameter):
FR1(1,1)         -0.5000
FR2(1,1)         -0.7000
V(1,1)            1.0000
--------------
White noise model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 0
Parameters (* denotes constrained parameter):
V(1,1)            1.0000
--------------
}
*************************************************************

Defining models with conditional heteroscedastic errors


The formulation of models with conditional heteroscedastic errors in THD format is similar to that of composite models. First, it is necessary to obtain the THD formulation of: a) a VARMAX or transfer function model for the mean, and b) a VARMAX model equivalent to the desired ARCH, GARCH or IGARCH structure. The full model can then be defined using the garc2thd function, which has the following syntax:
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);

where t1-d1 is the THD format associated with the model for the mean, t2-d2 is the THD format
associated with the VARMAX model for the variance, and lab1 and lab2 are optional parameters
with labels for the parameters in t1 and t2, respectively.
Example 3.11 (Defining models with GARCH errors). Consider the following ARMA(2,1) model with GARCH(1,1) errors, in conventional notation:

$$y_t = \frac{1 - .8B}{1 - .7B + .3B^2}\,\varepsilon_t\,; \quad \varepsilon_t \sim \mathrm{iid}(0, .01)\,; \quad \varepsilon_t \mid \Omega_{t-1} \sim \mathrm{iid}(0, h^2_t)\,; \quad h^2_t = .002 + .1\,\varepsilon^2_{t-1} + .7\,h^2_{t-1}$$

which, in the ARMA representation supported by E4, becomes:

$$y_t = \frac{1 - .8B}{1 - .7B + .3B^2}\,\varepsilon_t\,, \text{ such that: } \varepsilon^2_t = .01 + N_t\,, \quad (1 - .8B)\,N_t = (1 - .7B)\,v_t$$

see Example 2.5. The following code defines and displays the model structure:

% Model for the mean


[t1, d1, lab1] = arma2thd([-.7 .3], [], [-.8], [], [.01], 1);
% Model for the conditional variance
[t2, d2, lab2] = arma2thd([-.8], [], [-.7], [], [.01], 1);
% Full model
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
prtmod(theta, din, lab);

generating the output:


*************************** Model ***************************
GARCH model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 2
Endogenous variables model:
VARMAX model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 2
Parameters (* denotes constrained parameter):
FR1(1,1)         -0.7000
FR2(1,1)          0.3000
AR1(1,1)         -0.8000
V(1,1)            0.0100
--------------
GARCH model of noise:
VARMAX model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 1
Parameters (* denotes constrained parameter):
FR1(1,1)         -0.8000
AR1(1,1)         -0.7000
--------------
*************************************************************

Assume now the same model for the mean and an IGARCH(1,1) for the conditional variance:

$$h^2_t = .002 + .3\,\varepsilon^2_{t-1} + .7\,h^2_{t-1}$$

which in ARMA form can be written as:

$$\varepsilon^2_t = .01 + N_t\,, \quad \text{with:} \quad (1 - B)\,N_t = (1 - .7B)\,v_t$$
The following commands define the IGARCH structure by constraining the autoregressive
parameter to unity:
% Model for the mean
[t1, d1, lab1] = arma2thd([-.7 .3], [], [-.8], [], [.01], 1);
% Model for the conditional variance
[t2, d2, lab2] = arma2thd([-1], [], [-.7], [], [.01], 1);
% Full model
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
theta1 = [theta zeros(size(theta))]; theta1(5,2)=1;
prtmod(theta1, din, lab);


and the output from prtmod is:


*************************** Model ***************************
GARCH model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 2
Endogenous variables model:
VARMAX model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 2
Parameters (* denotes constrained parameter):
FR1(1,1)         -0.7000
FR2(1,1)          0.3000
AR1(1,1)         -0.8000
V(1,1)            0.0100
--------------
GARCH model of noise:
VARMAX model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 1
Parameters (* denotes constrained parameter):
FR1(1,1)    *    -1.0000
AR1(1,1)         -0.7000
--------------
*************************************************************

Converting THD models to state-space representation


The conversion of THD models to the SS representation is done by the function thd2ss, which takes a THD formulation as argument and returns the SS formulation matrices $\Phi$, $\Gamma$, E, H, D, C, Q, S and R of (2.1)-(2.4). Due to their particular nature, models with GARCH errors are supported by a specific function, garch2ss. The general calls to thd2ss and garch2ss are:
[Phi, Gam, E, H, D, C, Q, S, R] = thd2ss(theta, din)
[Phi,Gam,E,H,D,C,Q,Phig,Gamg,Eg,Hg,Dg] = garch2ss(theta, din)

The function garch2ss returns seven matrices (Phi, Gam, E, H, D, C and Q) corresponding to the model for the mean, and five matrices (Phig, Gamg, Eg, Hg and Dg) corresponding to the model for the conditional variance. Note that the matrices R and S are not returned, since they are the same as Q in this class of models. For further details on these functions, see Chapter 8.

Converting THD models to the standard representation


Once a model is estimated, it is sometimes convenient to obtain again the matrices characterizing its
standard representation. To do this, E4 includes three functions, thd2str, thd2arma and thd2tf,

that translate a THD definition into the equivalent reduced form of the model. The general syntax of
these functions is:
[F, A, V, G] = thd2str(theta, din)
[F, A, V, G] = thd2arma(theta, din)
[F, A, V, W, D] = thd2tf(theta, din)

The main differences between the output arguments of these functions and the input arguments for
their reciprocals (arma2thd, str2thd and tf2thd) are:
1) The elements with fixed values in the formulation are also returned (e.g., identity matrix of a
VARMAX model).
2) If the model includes seasonal factors, the matrices returned are the product of the regular and
seasonal factors.
These functions do not work with SS models or composite models.
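The round trip between representations can be sketched as follows; the function syntaxes follow those given in this Chapter, while the model and its parameter values are merely illustrative:

```matlab
% Define a univariate ARMA(1,1) model, (1-.8B) y_t = (1-.4B) a_t, in THD form.
[theta, din, lab] = arma2thd([-.8], [], [-.4], [], 1, 1);
% Translate it to the equivalent SS representation.
[Phi, Gam, E, H, D, C, Q, S, R] = thd2ss(theta, din);
% Recover the standard ARMA polynomials; per rule 1) above, fixed elements
% (the leading 1) are returned as well, so F and A hold complete polynomials.
[F, A, V, G] = thd2arma(theta, din);
```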


4 Model estimation
After defining a model structure in THD format, many analyses require obtaining estimates of its unknown parameters and of the standard deviations of these estimates. This chapter describes the E4 functions that deal with these issues.
The first Section describes how to set or modify the default options and tolerances that affect
likelihood evaluation and optimization. The second Section is concerned with the functions required
to evaluate the log-likelihood function for all the models supported. The third Section deals with
computing the analytical gradient and the information matrix. The fourth Section discusses the numerical optimization algorithm employed, and the fifth Section describes how the estimation results can be combined in a summary report. A final Section illustrates the use of all these functions with several examples.

Modification of toolbox options


The values of the general toolbox options are stored in an internal vector, E4OPTION, created by the
e4init command, see Chapter 1 and Appendix B. These options can be modified using the
function sete4opt, which allows three different calls:
1) sete4opt, without any argument, restores the default options and lists the E4OPTION vector.
2) sete4opt('show') shows current options. If the function is called with this argument, no
other argument should be included.
3) sete4opt(option, value, ...), where the argument option stands for the name of the
option to be modified, and value stands for the new choice. In this case, the following rules
apply:
- option must be a character string, enclosed by quotes. It is enough to indicate the first three
letters.
- value may be a character string, enclosed by quotes, or a numeric value. If it is a character
string, it is enough to indicate its first three letters.
- A single call may contain several option-value pairs, up to a maximum of ten.
The different options and their admissible values are summarized in the following table.

Option        Description                                        Possible values

Functions that control the estimation process

'filter'      Selects the filter used in the evaluation of       'kalman' (*),
              the likelihood function                            'chandrasekhar'

'scale'       Scales matrices when computing their Cholesky      'no' (*), 'yes'
              decomposition during filtering

'econd'       Selects the criterion to compute the initial       'iu', 'au', 'ml',
              value of the state vector                          'zero', 'auto' (*)

'vcond'       Selects the criterion to compute the covariance    'lyapunov', 'zero',
              of the initial state vector                        'idejong' (*)

'var'         Selects between estimation of the covariance       'variance' (*),
              matrix or estimation of its Cholesky factor        'factor'

Functions that control the behaviour of e4min

'algorithm'   Chooses the optimization algorithm                 'bfgs' (*), 'newton'

'step'        Maximum step length during optimization            0.1 (**)

'tolerance'   Stop criteria tolerance                            1.0e-5 (**)

'maxiter'     Maximum number of iterations                       75 (**)

'verbose'     Displays output at each iteration                  'yes' (*), 'no'

(*) Default option.
(**) This is the default value. Other reasonable values are admissible.
Example 4.1. When issued after e4init, the command:
sete4opt('show');

displays the default options:


*********************** Options set by user ***********************
Filter. . . . . . . . . . . . . : KALMAN
Scaled B and M matrices . . . . : NO
Initial state vector. . . . . . : AUTOMATIC SELECTION
Initial covariance of state v. : IDEJONG
Variance or Cholesky factor? . : VARIANCE
Optimization algorithm. . . . . : BFGS
Maximum step length . . . . . . : 0.100000
Stop tolerance. . . . . . . . . : 0.000010
Max. number of iterations . . . : 75
Verbose iterations. . . . . . . : YES
****************************************************************

and the code:


sete4opt('filt','chandra','vco','lyapunov','eco','ml','step',0.5);

selects the Chandrasekhar filter, sets the initial conditions for the covariance matrix of the Kalman
filter at the solution of the corresponding Lyapunov equation and adjusts the maximum step length
to .5. The corresponding output is:
************** The following options are modified **************
Filter. . . . . . . . . . . . . : CHANDRA
Initial covariance of state v. : LYAPUNOV
Initial state vector. . . . . . : ML
Maximum step length . . . . . . : 0.500000
****************************************************************

Evaluation of the likelihood function

Models with homoscedastic errors


Starting from a model in THD format and a sample, the functions lfmod, lffast and lfmiss
compute the value of the gaussian log-likelihood function for any model with homoscedastic errors.
Their syntax is:
[l, innov, ssvect] = lfmod(theta, din, z)
[l, innov, ssvect] = lfmiss(theta, din, z)
[l, innov, ssvect] = lffast(theta, din, z)

The functions lfmod and lfmiss compute the log-likelihood of a standard sample and a sample
with missing values, respectively. Technical details about how they work are given in Terceiro
(1990, Chapter 4). The function lffast is a faster version of lfmod because it takes advantage of
the innovations structure of many econometric models; see Casals, Sotoca and Jerez (1999).
The input arguments of these functions are a THD format specification, theta, din, and the data
matrix z. Internally, each of these functions formulates the model in SS form with thd2ss and then
compute the value of the following output arguments:
1) l, a scalar that contains the value of the log-likelihood function in theta,
2) innov, a matrix of one-step-ahead forecast errors, defined as:
z~(t|t-1) = z(t) - H x(t|t-1) - D u(t)

where x(t|t-1) is an estimate of the state vector at time t conditional on the information available up to
t-1.


3) and ssvect, a matrix of estimates of the state variables. Its t-th row contains the filtered
estimate of the state vector at time t, conditional on the information available up to t-1:
x(t+1|t) = Φ x(t|t-1) + Γ u(t) + K(t) z~(t|t-1)

where z~(t|t-1) = z(t) - H x(t|t-1) - D u(t) is the corresponding innovation.

For a detailed reference on these functions, see Chapter 8.
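As a brief illustration, the following sketch (assuming an AR(1) specification defined through arma2thd, in the style of the examples at the end of this chapter) simulates a sample and evaluates its log-likelihood:

```matlab
% Define an AR(1) model, simulate 100 observations and evaluate the
% log-likelihood at the true parameter values
[theta, din, lab] = arma2thd([-.5], [], [], [], .1, 1);
z = simmod(theta, din, 100);
[l, innov, ssvect] = lffast(theta, din, z);
```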

Models with GARCH errors


The gaussian log-likelihood of models with GARCH errors is computed by the function lfgarch,
whose syntax is:
[l, innov, hominnov, ssvect] = lfgarch(theta, din, z)

where all the arguments are the same as those of lfmod except hominnov, which is a matrix of
residuals standardized with the square root of the conditional variances.

Initial conditions
To compute the exact log-likelihood function of an SS model it is necessary to define adequate initial
values for the state vector (x1) and its covariance matrix (P1). The selection of these values, see
Casals and Sotoca (1997), depends on two characteristics of the model:
1) Whether or not the model is stationary. A model is said to be a) totally stationary when all the
roots of Φ are, in modulus, less than one, b) totally nonstationary if all the roots of Φ have
modulus greater than or equal to one, and c) partially nonstationary if some roots of Φ have
modulus greater than or equal to one and others less than one.
2) Whether or not there are exogenous variables (u(t)) in the model, and their stochastic or
deterministic nature.
Adequate initial conditions for each model are chosen automatically. The default options can be
manually overridden using the function sete4opt. The options related to this issue are
'econd', which sets the initial condition for the state vector, and 'vcond', which does the same
for its covariance matrix. Admissible values for 'econd' are: 'auto', 'zero', 'iu', 'au' and
'ml'. Admissible values for 'vcond' are 'zero', 'lyap' and 'idej'. The next table
summarizes the adequate options for each case.


Type of model                                         econd           vcond

Totally stationary
  Without exogenous variables                         zero            lyap
  With exogenous variables     Deterministic          iu, au          lyap
                               Stochastic             ml              lyap

Totally nonstationary
  Without exogenous variables                         zero            idej (*)
  With exogenous variables     Deterministic          zero, iu, au    idej (*)
                               Stochastic             ml              idej (*)

Partially nonstationary
  Without exogenous variables                         zero            idej (*)
  With exogenous variables     Deterministic          iu, au          idej (*)
                               Stochastic             ml              idej (*)

Models with GARCH errors
  Without exogenous variables                         zero            idej
  With exogenous variables     Deterministic          iu, au          idej
                               Stochastic             ml (**)         idej

(*) Denotes a fixed option, not modifiable by the user.
(**) In the model for the mean.

Computation of the gradient and the information matrix


The exact gradient and information matrix of the log-likelihood function are relevant both to the
iterative estimation process and to many inference procedures; see Engle (1984).

Evaluation of the analytical gradient


The general syntax for the E4 functions dealing with gradient computation is:
g = gmod(theta, din, z)
g = gmiss(theta, din, z)
g = ggarch(theta, din, z)

As suggested by their names, gmod computes the derivatives of lfmod and lffast, gmiss
computes the derivatives of lfmiss and ggarch computes the derivatives of lfgarch. The input
arguments are the same as those of lfmod, and the output argument, g, is a vector containing the
analytical derivatives of the corresponding log-likelihood evaluated at theta.


Details about the gradient of the log-likelihood are given in Terceiro (1990, Appendix B). For a
complete reference on the use of these functions, see Chapter 8.
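A quick practical use of these functions is checking the first-order conditions after estimation. The sketch below assumes that a model (theta, din) and a sample z have already been defined, as in the examples at the end of this chapter:

```matlab
% Maximize the likelihood using the analytical gradient, then verify
% that the gradient is close to zero at the optimum thopt
[thopt, it, lval, g, h] = e4min('lffast', theta, 'gmod', din, z);
gopt = gmod(thopt, din, z);   % should be near zero at a maximum
```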

Evaluation of the exact information matrix

As in the case of the gradient, there are three functions dealing with computation of the information
matrix. They are:
[std, corrm, varm, Im] = imod(theta, din, z, aprox)
[std, corrm, varm, Im] = imiss(theta, din, z, aprox)
[std, corrm, varm, Im] = igarch(theta, din, z)

The function imod computes the information matrix of lfmod and lffast, imiss computes the
information matrix of lfmiss and igarch does the same for lfgarch. The input arguments are in
general the same as those of lfmod, with the following exception: there is an additional input
argument to imod and imiss, aprox, which indicates whether the calculations should be exact or
approximate, the latter being computationally more efficient; see Watson and Engle (1983).
The output arguments are std, the standard deviation of the values in theta, corrm, the
correlation matrix between these parameters, varm, which is the corresponding covariance matrix
and Im, which is the exact information matrix in the case of imod and imiss, see Terceiro (1990,
Appendices C and D) and Terceiro (1999). For a complete reference on the use of these functions,
see Chapter 8.

Quasi-maximum likelihood estimation

If the model is misspecified or its errors are nonnormal, optimization of the log-likelihood function
provides consistent (but not efficient) estimates of the parameters. In this case, we speak of quasi-maximum likelihood estimates. This situation has an even more important consequence, as the
standard errors computed by imod and imiss are no longer adequate.
Ljung and Caines (1979) and White (1982) propose an analytical approximation to the information
matrix that is robust to these specification errors. It can be computed using the function imodg:
[std, stdg, corrm, corrmg, varm, varmg, Im] = ...
imodg(theta, din, z, aprox)

where the input arguments and the first four output arguments are the same as those of imod. The
additional outputs stdg, corrmg and varmg are the quasi-maximum likelihood values.


Numerical optimization
Except in very simple formulations, the first-order conditions of a maximum likelihood problem are
a complex set of nonlinear equations. Therefore, their solution requires an iterative algorithm; see
Dennis and Schnabel (1983). The general iteration of an unconstrained numerical optimization
algorithm is:

θ(i+1) = θ(i) − λ(i) W(i) g(i)

where θ(i) is the vector of parameters to estimate at the i-th iteration, W(i) is a matrix that describes
the curvature of the function (often the inverse of the hessian or an approximation to it), g(i) is the
gradient of the objective function evaluated at θ(i), and λ(i) is a scalar that determines the step length
in the −W(i) g(i) direction.

Different choices for the components of the previous expression characterize each specific
implementation. Also, it is necessary to define a criterion to stop the iterative process.
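The iteration above can be sketched in plain MATLAB (illustrative code, not part of E4), using a scalar quadratic objective for which the Newton direction is exact:

```matlab
% Newton-Raphson iteration theta(i+1) = theta(i) - lambda*W*g
% applied to f(theta) = (theta - 2)^2
theta = 0;
for i = 1:5
    g = 2*(theta - 2);        % gradient g(i)
    W = 1/2;                  % W(i): inverse of the hessian (f'' = 2)
    lambda = 1;               % step length
    theta = theta - lambda*W*g;
end
% since f is quadratic, a single Newton step reaches the optimum theta = 2
```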

General use of e4min


The E4 function e4min implements a numerical optimization procedure, based on the techniques
described by Dennis and Schnabel (1983). It includes two main optimization algorithms, BFGS
(Broyden-Fletcher-Goldfarb-Shanno) and Newton-Raphson. MATLAB functions fmin, fmins and
fminu can be used instead of e4min. However, e4min has been carefully designed and tuned to
solve likelihood optimization problems and, in most cases, it should be more reliable and robust for
this specific use.
The general synopsis of e4min is:
[pnew,iter,fnew,gnew,hessin]=e4min(func,theta,dfunc,P1,P2,P3,P4,P5)

The operation of e4min is the following. Starting from an initial estimate of the parameters in
theta, the algorithm iterates on the objective function func, using a Newton-Raphson or BFGS
search direction. The iteration can be based on the analytical gradient or on a numerical
approximation, depending on whether dfunc contains the name of the analytical gradient function
or is an empty string ''. The step length is computed with the e4lnsrch function. Finally, the stop
criterion takes into account the relative changes in the values of the parameters and/or the size of the
gradient vector.
Parameters P1, ..., P5 are optional and, if specified, are fed to the objective function without
modifications. In the context of E4 the first parameter, P1, may contain the name of a user model; see
Chapter 6.
The function sete4opt manages different options which affect the optimizer, including the
algorithm to use, the maximum step length at each iteration, the tolerance for stop criteria and the
maximum number of iterations allowed.
Once the process stops, whether because convergence has been reached or for other reasons
(exceeding the maximum number of iterations, or bad conditioning of the objective function), the
function returns the following values: pnew, the value of the parameters; iter, the number of
iterations performed; fnew, the value of the objective function at pnew; gnew, the analytical or
numerical gradient, depending on the contents of dfunc; and finally hessin, a numerical
approximation to the hessian.

Scaling problems
When the parameters to be estimated have very different values, e.g., if they range from 10^-4 to 10^4,
the number of iterations required increases and the numerical precision of the solution can be poor.
This problem often affects the variances of the errors, which may be several orders of magnitude
different from the rest of the parameters.
There are two main solutions for this problem. First, the data can be scaled so that the values of all
the parameters fall in a relatively small range, say from 10^-2 to 10^2. The function e4preest can be
used to quickly test different scaling factors. If the scaling problem arises because the variances
are too small, another possibility consists of setting the option 'var' to the value 'factor' with
sete4opt. By so doing, optimization is done with respect to the Cholesky factors of the covariance
matrix, instead of the matrix itself. Besides improving the scale of the variances, in most cases, this
choice has the advantage that the covariance matrices are then constrained to be positive definite.
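Both remedies can be sketched as follows (illustrative code; the scaling factor of 100 is arbitrary, and z is assumed to hold the sample):

```matlab
% Remedy 1: rescale the data so that all parameters have comparable magnitudes
z2 = 100*z;
% Remedy 2: optimize with respect to the Cholesky factor of the covariance,
% which also constrains the covariance matrices to be positive definite
sete4opt('var', 'factor');
```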

Preliminary estimates
If the initial value of theta is far from the optimum, the cost of the iterative process can be very large.
It is then desirable to have a preliminary estimation algorithm, which provides initial estimates with
a reasonable computational burden.
The function e4preest computes consistent estimates of the parameters in theta. In most cases,
these are adequate starting values for likelihood optimization with e4min. The syntax for calling
this function is:


theta2 = e4preest(theta, din, z)

where the input arguments are identical to those of lfmod. The estimates are returned in theta2.
The operation of e4preest is the following. It first obtains a subspace representation of the system
and then computes estimates for its parameters by solving a nonlinear least squares problem, whose
computational load does not depend on the size of the sample. Hence, this method is very efficient
when processing large samples. For more details on subspace methods, see Viberg (1995) and
Casals (1997).

Displaying the estimation results


After obtaining maximum likelihood estimates of the parameters of a model and, optionally,
computing their standard errors, it is useful to obtain a summary of all these results. To this end, E4
includes the function prtest, which allows a reduced call:
prtest(theta, din, lab, z, it, lval, g, h)

where the input arguments are: the model specification given by theta-din-lab, the data matrix,
z, the number of iterations, it, the value of the log-likelihood, lval, its gradient, g and its hessian
h. All these arguments except din, lab and z, are output values of e4min.
A more complex call is:
prtest(theta, din, lab, z, it, lval, g, h, std, corrm, t)

where the additional arguments std and corrm should be obtained with one of the functions dealing
with the computation of the information matrix. The last argument, t, is the total computing time in
minutes. It should be computed by the user using the MATLAB commands: tic and toc.
The first syntax does not require the computation of an analytical information matrix and, therefore,
it is useful to obtain a quick first impression of the model's adequacy. The omitted input arguments,
std and corrm, are replaced internally by an approximation to the covariance matrix, obtained by
inverting the hessian of the log-likelihood function at the optimum. This method is very fast but its results
should be taken cautiously.
Finally, a third valid call is:
prtest(theta, din, lab, z, it, lval, g, h, [], [], t)


which does not require results from the information matrix but displays the elapsed computing time.

Examples
The examples in this Section use simulated data generated with the E4 function simmod. Its use is
described in Chapter 5. See also the corresponding reference in Chapter 8.
Example 4.2 (Simulation and estimation of an ARMA model). Consider the model:
z(t) = (1 − .7B)(1 − .5B^12) a(t),    V[a(t)] = .1

The following code:


1) obtains the corresponding THD format,
2) simulates 250 observations of the model, discarding the first 50 samples,
3) computes preliminary estimates with e4preest, which are displayed using prtmod, and
4) computes maximum likelihood estimates using the numerical gradient:
[theta, din, lab] = arma2thd([], [], [-.7], [-.5], .1, 12);
z=simmod(theta,din,250); z=z(51:250,1);
theta=e4preest(theta, din, z);
prtmod(theta, din, lab)
[thopt, it, lval, g, h] = e4min('lffast', theta,'', din, z);

To use analytical derivatives in the optimization process, replace the last line with:
[thopt, it, lval, g, h] = e4min('lffast', theta,'gmod', din, z);

The estimation results can be displayed using numerical standard errors with:
prtest(thopt, din, lab, z, it, lval, g, h)

or using analytical standard errors (under normality) with:


[std, corrm, varm, Im] = imod(thopt, din, z);
prtest(thopt, din, lab, z, it, lval, g, h, std, corrm)

Finally, if the normality assumption is doubtful, one can display the results using robust standard
errors:


[std, stdg, corrm, corrmg] = imodg(thopt, din, z);


prtest(thopt, din, lab, z, it, lval, g, h, stdg, corrmg)

Example 4.3 (Simulation and estimation of an ARMA model with GARCH errors). Consider
the ARMA(2,1) model with GARCH(1,1) that we used in Example 3.11:
y(t) = [(1 − .8B) / (1 − .7B + .3B^2)] ε(t),  with ε(t) such that:

σ(t)^2 = .01 + ν(t),    (1 − .8B) ν(t) = (1 − .7B) v(t)

where σ(t)^2 is the conditional variance of ε(t).

The following code defines the model structure, simulates a sample, computes the maximum-likelihood estimates of the parameters and the analytical standard errors, and displays the results:
[t1, d1, lab1] = arma2thd([-.7 .3], [], [-.8], [], [.01], 1);
[t2, d2, lab2] = arma2thd([-.8], [], [-.7], [], [.01], 1);
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
z=simgarch(theta,din,450); z=z(51:400,1);
theta=e4preest(theta, din, z);
prtmod(theta, din, lab);
[thopt, it, lval, g, h] = e4min('lfgarch', theta,'', din, z);
[std, corrm, varm, Im] = igarch(thopt, din, z);
prtest(thopt, din, lab, z, it, lval, g, h, std, corrm)

In GARCH modeling the samples are usually large and, therefore, using the analytical gradient in
the iteration process is very expensive. However, to ensure optimality one may want to evaluate it
after convergence. This can be done with the command:
g = ggarch(thopt, din, z)


5 Specification, forecasting, simulation and smoothing

Despite its focus on model estimation, E4 includes several functions implementing standard methods
for model building and model validation. These functions are described in the first Section of this
Chapter. The second Section reviews the forecasting techniques implemented in the toolbox. The
third Section describes the functions available for model simulation. Finally, the fourth and fifth
Sections deal with smoothing.

Tools for time series analysis


There are three groups of functions for time series analysis: a) general purpose functions, including
several standard graphs and descriptive statistics, b) data transformations and c) tools for model
specification and validation. This section provides a brief discussion of the functions in each group,
see Chapter 8 for a detailed reference on these functions.

General purpose functions


The general purpose functions include: plotsers, which plots a centered and standardized time
series; histsers, which shows the histogram of a time series; rmedser, which displays a scaled
plot of sample means versus sample standard deviations; plotqqs, which plots the quantile graph
under normality; and descser, which presents a table of descriptive statistics. The general syntax
of these functions is:
ystd = plotsers(y, mode, lab)
freqs = histsers(y, lab)
[med, dts] = rmedser(y, len, lab)
[nq, yq] = plotqqs(y, lab)
stats = descser(y, lab)

The common input arguments are: y, an N×m matrix which contains m series of N observations each,
and lab, a matrix of characters which contains in each row an optional descriptive title for each
series. In addition:


1) The function plotsers allows an optional input argument, mode, which selects the display
format.
2) The function rmedser allows an optional input argument, len, which defines the number of
observations to be used in computing sample means and standard deviations.
Further details about these functions can be consulted in Chapter 8.
Example 5.1 (Statistical analysis of gaussian white noise). The following commands simulate
(150×2) draws from a N(0,1) distribution, eliminate the first 50 values, define titles for both series
and then call the different general purpose functions:
y=randn(150,2); y=y(51:150,:);
lab=['noise #1';'noise #2'];
ystd = plotsers(y, -1, lab);
ystd = plotsers(y, 1, lab);
freqs = histsers(y, lab);
[med, dts] = rmedser(y, 10, lab);
[nq, yq] = plotqqs(y, lab);
stats = descser(y, lab);

Data transformations
In time series analysis it is common to transform the data before model specification. E4 includes two
data transformation functions: lagser, which returns a series lagged or led a specific number of
periods, and transdif, which computes the Box-Cox (1964) transformation and the seasonal and
regular differences of a time series.
The syntax of lagser is:
[yl , ys] = lagser(y, ll)

where y is an n×k data matrix and ll is a 1×l vector containing the list of lags (positive numbers)
and leads (negative numbers) applicable to all the series. The function returns yl, which contains
the lagged-led variables, and optionally ys, an nl×k data matrix (nl = n − maxlag + maxlead) which
contains the original variables resized to be conformable with yl.
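For example, the following sketch builds the first lag and the first lead of two simulated series:

```matlab
% ll = [1 -1] requests lag 1 (positive entry) and lead 1 (negative entry)
y = randn(100, 2);
[yl, ys] = lagser(y, [1 -1]);
% yl contains the lagged and led variables; ys holds the original series
% trimmed to be conformable with yl
```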
As for the differencing and Box-Cox transformation, the syntax of transdif is:
z = transdif(y, lambda, d, ds, s);

where the input arguments are: y, a matrix whose columns correspond to the different series to be
transformed, lambda, the parameter of the Box-Cox transformation, d, the order of regular


differencing, ds, an S×1 vector containing the orders of seasonal differencing (default ds=0) and s,
an S×1 vector containing the lengths of the seasonal periods (default s=1). The last two parameters
are optional and can be omitted if seasonal differences are not required.
The output argument is the differenced and transformed series z such that:

z(t) = ∇^d [ ∏_{s∈S} ∇_s^{ds} ] y(t)^(λ) = (1 − B)^d [ ∏_{s∈S} (1 − B^s)^{ds} ] y(t)^(λ)        (5.1)

where S = {s1, s2, ..., sS}; ∇_s is the difference operator of s-th order, such that
∇_s y(t) = y(t) − y(t−s) for any sequence y(t); and y(t)^(λ) is defined as:

y(t)^(λ) = ln( y(t) − μ )                  if λ = 0
y(t)^(λ) = ( ( y(t) − μ )^λ − 1 ) / λ      if λ ≠ 0                                              (5.2)

μ being null if all the values of y(t) are strictly positive, and equal to min( y(t) ) − 10^-5 otherwise.
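For example, a logarithmic transformation (lambda = 0) followed by one regular difference and one seasonal difference of period 12 would be obtained with (a sketch, assuming monthly data in y):

```matlab
% z(t) = (1 - B)(1 - B^12) ln( y(t) )
z = transdif(y, 0, 1, 1, 12);
```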

Tools for model specification and validation


In regard to model specification and validation, E4 includes four functions: augdft, which computes
the augmented Dickey-Fuller (1981) test for unit roots, see Hamilton (1994, Chapter 17); uidents,
which computes the univariate simple and partial autocorrelation functions of a series and plots the
results; midents, which computes the analogue multivariate specification statistics, that is, multiple
autocorrelation function and partial autoregression matrices; and residual, which computes the
residuals and smoothed error estimates of a model.
Any call to augdft has the following structure:
[adft] = augdft(y, p, trend);

The input arguments are: y, a matrix with N observations of m variables, p, the number of lags in
the unit root regression, and trend, an optional parameter to allow for a deterministic time trend (if
trend=1). When an output argument adft is specified, the function does not display the results,

but stores them in adft.
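For instance, an augmented Dickey-Fuller test with four lags and a deterministic trend could be computed as (a sketch, assuming the series is stored in y):

```matlab
% ADF test with p = 4 lags, allowing for a deterministic time trend
adft = augdft(y, 4, 1);
```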


The syntax for calling uidents is:
[acf, pacf, Qs] = uidents(y, lag, tit)


and for midents is:


[acf, prcf, Qus] = midents(y , lag, tit)

In both cases the input arguments are: y, an N×m matrix which contains m series of N observations
each, lag, the maximum lag for computing the values of the autocorrelation functions, and tit,
a matrix of characters which contains a descriptive title for each series. The output
arguments are the values of the empirical autocorrelation functions and the Box-Ljung Q single or
multiple statistic.
The function residual computes the residuals of a model and is used mainly for validation. Any
call to this function has the following format:
[z1, vT, wT, vz1, vvT, vwT] = residual(theta, din, z, stand)

The input arguments are a THD format specification (theta, din) and a data matrix (z). The
optional parameter stand selects between standardized (stand=1) or ordinary values (stand=0 or
argument omitted).
The output arguments are z1, a matrix of residuals, vT, a matrix of smoothed observation errors,
wT, a matrix of smoothed state errors, vz1, a matrix which stacks the
covariance matrices of z1, vvT, a matrix which stacks the covariance matrices of vT, and vwT, a
matrix which stacks the covariance matrices of wT. All these values are standardized if stand=1.

Forecasting
One of the main practical uses of time series analysis is forecasting. Forecasting with E4 requires
obtaining the THD representation of the model, estimating its parameters (if required), selecting
suitable initial conditions for the filter, see Chapter 4, and then calling the foremod function. This function has
the following syntax:
[yf, Bf] = foremod(theta, din, z, k, u)

where theta and din define the model structure in THD format, z is a data matrix containing the
values of the endogenous and exogenous variables, k is the forecast horizon and u contains the data
of the exogenous variables for the forecast horizon. The output arguments are forecasts of the
endogenous variables (yf) and their corresponding covariances (Bf).
Forecasts for models with GARCH errors are computed by the function foregarc. Its syntax is:

[yf, Bf, vf] = foregarc(theta, din, z, k, u)

where the output argument yf contains the forecasts of the endogenous variable, Bf contains the
covariance matrices of these forecasts and vf contains forecasts of the conditional covariance.

Simulation
The functions simmod and simgarch generate a random sample from any model in THD format.
Their syntax is:
y = simmod(theta, din, N, u)
y = simgarch(theta, din, N, u)

The arguments are a model in THD format (theta, din), the number of observations to be
generated (N) and the exogenous variable data matrix (u). Both functions use the MATLAB function
randn to obtain N(0,1) random disturbances and select adequate initial conditions by themselves.
As a general practice, it is advisable to omit the first observations of the simulated sample.
Example 5.2 (Simulation). To obtain a realization with 200 observations of the model:
y1(t) = .9 + .3 y1(t−1) + a1(t)
y2(t) = .7 + .4 y1(t−1) + a2(t) − .8 a2(t−4)

where the covariance matrix of [a1(t); a2(t)] is [1 .9; .9 1],

the following code can be used:


[theta, din, lab] = arma2thd([-.3 NaN; -.4 NaN],[],[], ...
[NaN NaN; NaN -.8], [1 .9; .9 1], 4, [.9;.7], 1);
% Generate the exogenous (constant) variable
u = ones(250,1);
% Compute the simulated sample and omit the first observations
y = simmod(theta, din, 250, u);
y=y(51:250,:);


Smoothing
E4 includes three functions, fismod, fismiss and aggrmod, that implement different specialized
versions of the fixed interval smoother algorithm, see Anderson and Moore (1979), De Jong (1989)
and Casals, Jerez and Sotoca (2000). The main econometric applications of these functions are in
cleaning a sample contaminated with observation errors, computing the unobservable components
of a structural time series model, see Chapter 2, interpolating missing values and disaggregating a
low frequency sample.
The syntax of fismod and fismiss is:
[xhat, Px, e] = fismod(theta, din, z)
[zhat, Pz, xhat, Px] = fismiss(theta, din, z)

Both functions receive a model in THD format (theta, din) and a data matrix (z). The function
fismiss allows for missing values in z, which should be marked by NaN.
The output arguments of fismod are xhat, the expectation of the state vector conditional on all the
sample, Px, the covariance matrix of this expectation, and e, a matrix of smoothed errors. On the
other hand, fismiss has two additional arguments: zhat, a matrix containing the available
values of the z series together with smoothed estimates of its missing values, and Pz, the
covariance of zhat.
A common application of fixed-interval smoothing consists of computing the optimal disaggregation
of low frequency (say yearly) samples of flow variables into high frequency (say quarterly or
monthly) time series, so the disaggregates add up to the sample data. The unobserved high frequency
values can be computed taking into account, not only the low frequency sample information, but also
high frequency indicator(s). For example, a monthly industrial production index can be used as an
indicator to disaggregate a yearly series of GNP.
This disaggregation is performed by aggrmod, whose syntax is:
[zhat, Bt] = aggrmod(theta, din, z, per, m1)

where the input arguments are a THD model definition (theta, din) relating all the variables in
the high frequency observation interval, the data matrix (z), the number of observations that add up
to an aggregate (per) and the number of endogenous variables that are observed as aggregates (m1).
The output arguments are the optimal disaggregates of the first endogenous variables (zhat) and the
corresponding covariances (Bt).
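Following the GNP example above, a call of this type could look as follows (a hypothetical sketch, assuming theta and din define a monthly model for GNP and the indicator, and z holds the yearly GNP figures in its first column and the monthly indicator in the second):

```matlab
% per = 12: twelve monthly values add up to each yearly figure
% m1  = 1:  only the first endogenous variable is observed as an aggregate
[zhat, Bt] = aggrmod(theta, din, z, 12, 1);
```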

Example 5.3 (Residual analysis, forecasting and smoothing). Consider again the model in
Example 4.2, which was simulated and estimated with the following code:
[theta, din, lab] = arma2thd([], [], [-.7], [-.5], .1, 12);
z=simmod(theta,din,250); z=z(51:250,1);
theta=e4preest(theta, din, z);
[thopt, it, lval, g, h] = e4min('lffast', theta,'', din, z);
[std, corrm, varm, Im] = imod(thopt, din, z);
prtest(thopt, din, lab, z, it, lval, g, h, std, corrm)

After computing the maximum-likelihood estimates, the residuals can be obtained and analyzed with
the commands:
ehat = residual(thopt, din, z);
descser(ehat,'residuals');
plotsers(ehat,-1,'residuals');
uidents(ehat,10,'residuals');

One may also want to compute out-of-sample forecasts. The following code computes and plots
10 forecasts, with the standard 2σ limits:
[zfor, Bfor] = foremod(thopt, din, z, 10);
% The following is standard MATLAB code :
figure;
whitebg('w');
hold on
plot([z(191:200); zfor],'k-')
plot([z(191:200); zfor+2*sqrt(Bfor)],'k--')
plot([z(191:200); zfor-2*sqrt(Bfor)],'k--')
xlabel('Time')
hold off

Finally, some samples have missing values due to, e.g., holidays or discontinuities in the source.
Also, one may want to eliminate some observations because they are considered outliers and may
affect the analysis. The following code generates a new variable with two missing values,
interpolates them using fismiss, and displays the results:
z1=z; z1(10)=NaN; z1(40)=NaN;
[zhat, pz] = fismiss(thopt, din, z1);
[z(1:50) z1(1:50) zhat(1:50)]


6 User models
The architecture of E4 makes it easy to accommodate new formulations, provided that they can be
expressed in an equivalent SS form. To do this, the user should define a user model. This capacity
is very useful in three different cases:
First, if the user model has no close relationship with any of the formulations supported by E4 (see
Chapter 2), the user should code the functions that generate the model in SS form and, if required,
compute its derivatives. This situation is discussed in the first section of this chapter.
Second, some analyses require the use of reparametrized models. That is, models which contain
some parameters that are functions of the parameters in the standard formulation and, therefore,
differ slightly from a model supported by E4. In this case, the toolbox functions can be used to
simplify the definition of a user model, as the second section describes in detail.
Last, the reparametrization of a standard model, combined with fixed-value constraints, allows one
to impose general linear and non-linear equality constraints on the parameters.

Defining user models in the general case


The implementation of a user model can be divided into four steps:
Step 1: Generate a THD format for the model.
The results of this process should be
1) A theta vector, which contains the initial value of the parameters and, optionally, a second
column with the fixed/free-parameter flags, see Chapter 3.
2) A din vector. [This part is deliberately omitted].
3) An optional lab matrix, to document the contents of theta.


Step 2: Create a MATLAB function to generate the SS matrices corresponding to the user
model.
The header of this function should be:
[Phi, Gam, E, H, D, C, Q, S, R] = userf1(theta, din)

where userf1 can be any name selected by the user. This function receives theta and din as
input arguments and returns the SS matrices: Φ, Γ, E, H, D, C, Q, S and R.
Step 3: If required, create a user function to generate the derivatives of the SS matrices.
If the exact information matrix and/or the analytical gradient of the model are required, it is
necessary to create a second user function, whose synopsis should be:
[dPhi,dGam,dE,dH,dD,dC,dQ,dS,dR] = userf2(theta,din,i)

where userf2 can be again any name selected by the user. This function receives theta, din and
i as arguments, and returns the first-order partial derivatives of the SS matrices with respect to the
i-th parameter in theta.
Step 4: Invoke the functions required to complete the analysis.
[ This part is deliberately omitted ]
Finally, note that user models are defined by means of standard MATLAB functions. They should be
saved in ASCII files with the names userf1.m and userf2.m, and stored in the active directory
or in a directory appearing in the variable MATLABPATH.
Example 6.1 (Structural time series models): Consider the following decomposition of a time
series into trend, cycle and irregular noise:

    y_t = T_t + C_t + ε_t                                              (6.1)

where y_t is an observable variable, T_t and C_t are, respectively, unobservable stochastic components
of trend and cycle, and ε_t is a white noise error. Assume also that the model for the trend is:

    T_t = T_{t-1} + S_{t-1}
    S_t = S_{t-1} + ξ_t                                                (6.2)

and the cycle is governed by:

    [ C_t  ]       [  cos λ   sin λ ] [ C_{t-1}  ]   [ κ_t  ]
    [      ] =  ρ  [                ] [          ] + [      ]          (6.3)
    [ C*_t ]       [ -sin λ   cos λ ] [ C*_{t-1} ]   [ κ*_t ]

where ρ is the damping factor and λ is the frequency of the cycle in radians, such that 0 ≤ λ ≤ π.
The period in time units is p = 2π/λ. If λ = 0 or λ = π the stochastic cycle degenerates to a first-
order autoregressive process. The errors ε_t, ξ_t, κ_t and κ*_t are independent white noise processes
such that V(κ_t) = V(κ*_t). See Harvey and Shephard (1993).
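The rotation structure of the cycle can be checked numerically. The following Python sketch (illustrative only, not part of the toolbox; the helper name cycle_transition is ours) builds the cycle transition matrix for given ρ and λ and verifies that its eigenvalues have modulus ρ and argument ±λ, so that the implied period is 2π/λ:

```python
import numpy as np

def cycle_transition(rho, lam):
    """Transition matrix of the stochastic cycle in (6.3)."""
    return rho * np.array([[np.cos(lam),  np.sin(lam)],
                           [-np.sin(lam), np.cos(lam)]])

rho, lam = 0.9, 2 * np.pi / 12          # damping .9, period 12
T = cycle_transition(rho, lam)
eig = np.linalg.eigvals(T)
assert np.allclose(np.abs(eig), rho)            # modulus = damping factor
assert np.allclose(np.abs(np.angle(eig)), lam)  # argument = cycle frequency
# with lam = 0 the matrix collapses to rho*I: an AR(1) in each component
assert np.allclose(cycle_transition(rho, 0.0), rho * np.eye(2))
```
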
Expressions (6.1)-(6.3) can be written in SS form as:
    [ T_{t+1}  ]   [ 1  1     0         0      ] [ T_t  ]   [ 0 0 0 ]
    [ S_{t+1}  ]   [ 0  1     0         0      ] [ S_t  ]   [ 1 0 0 ] [ ξ_{t+1}  ]
    [ C_{t+1}  ] = [ 0  0   ρ cos λ   ρ sin λ  ] [ C_t  ] + [ 0 1 0 ] [ κ_{t+1}  ]     (6.4)
    [ C*_{t+1} ]   [ 0  0  -ρ sin λ   ρ cos λ  ] [ C*_t ]   [ 0 0 1 ] [ κ*_{t+1} ]

    y_t = [ 1  0  1  0 ] [ T_t  S_t  C_t  C*_t ]' + ε_t                                (6.5)

and the covariance matrices of the errors are:

    Q = diag( σ²_ξ , σ²_κ , σ²_κ ) ,     R = σ²_ε                                      (6.6)

In this formulation, the parameters to be estimated are, therefore, ρ, λ, σ_ξ, σ_κ and σ_ε.
Model (6.4)-(6.6) can be defined by the following user function:

function [Phi, Gam, E, H, D, C, Q, S, R] = trendmod(theta, din)
% Obtains the SS formulation from the values in theta:
% theta(1) = rho,
% theta(2) = lambda,
% theta(3:5) standard deviations of xi, kappa and epsilon
rho    = theta(1,1);
lambda = theta(2,1);
rcl = rho*cos(lambda);
rsl = rho*sin(lambda);
Phi = [1 1 0 0; 0 1 0 0; 0 0 rcl rsl; 0 0 -rsl rcl];
Gam = [];
E   = zeros(4,3); E(2:4,:) = eye(3);
H   = [1 0 1 0];
D   = [];
C   = 1;
v   = [theta(3,1); theta(4,1); theta(4,1)];
Q   = diag(v.^2);
S   = zeros(3,1);
R   = theta(5,1).^2;

This code should be stored in the ASCII file trendmod.m.


Finally, the following code simulates a sample with ρ = .5, λ = π, σ²_ξ = .01, σ²_κ = .1 and
σ²_ε = 1.2, obtains maximum likelihood estimates of the parameters, constraining λ to its true
value, and obtains smoothed estimates of the unobservable components T_t, C_t and ε_t.


e4init
% Create theta and lab
theta = [.5; pi; sqrt(.01); sqrt(.1); sqrt(1.2)];
lab   = str2mat('Rho','Lambda','sd(xi)','sd(kappa)','sd(eps)');
% Create din (only public header)
mtype  = 7;          % model type = 7 (SS)
m      = 1;          % endogenous variables
r      = 0;          % exogenous variables
s      = 1;          % seasonal period
n      = 4;          % number of states
np     = 5;          % number of parameters (rows of theta)
usflag = 1;          % flag for user models (yes)
usfunc = 'trendmod'; % name of user function (no gradient required)
innov  = [0;3;1];    % innov(1), not innovation model; innov(2), size of Q;
                     % innov(3), size of R
szpriv = [0;0];      % szpriv(1) size of private din, szpriv(2) size of
                     % private header
din = e4sthead(mtype,m,r,s,n,np,usflag,usfunc,innov,szpriv);
% Constrain the value of lambda and display the resulting model
theta=[theta zeros(size(theta))]; theta(2,2)=1;
prtmod(theta, din, lab);
% Select adequate initial conditions for the filter variables
sete4opt('econd','zero','vcond','idej','var','fac');
% Simulate the data, discarding the first 50 samples
y = simmod(theta,din,150);
y = y(51:150);
% Compute ML estimates of the unknown parameters
[thopt,it,lval,g,h] = e4min('lffast',theta,'',din,y);
prtest(thopt,din,lab,y,it,lval,g,h);
disp(sprintf('Period         = %4.2f', (2*pi)/thopt(2,1)));
disp(sprintf('Damping factor = %4.2f', thopt(1,1)));
% Obtain estimates of the unobserved components and plot the results
[xhat,px,ehat]=fismod(thopt,din,y);
plotsers([y,xhat(:,1)],1,str2mat('Data','Trend'));
plotsers(xhat(:,3),-1,'cycle');
plotsers(ehat,-1,'irregular component');
% Note that the name of the user function was fed to simmod,
% e4min and fismod

Defining user models in reparametrized formulations


Some analyses require the use of a model with a nonstandard parametrization that has the same
dynamics as a formulation supported by E4 and, therefore, the same SS representation. In this case,
we speak of a reparametrized model, and Steps 1, 2 and 3 of the process described in the previous
section can be drastically simplified by using some E4 functions.
The whole process of user model definition can again be divided into four steps:
Step 1: Generate the reparametrized model description in THD format.
1.1) Generate the standard formulation of the model in THD format: oldtheta, olddin,
oldlab.
1.2) Define the vector theta, which should contain values coherent with those of oldtheta, but
in terms of the new parameters.
1.3) Mark din as a user model by adding the user function names (userf2.m is optional):
din=touser(olddin, 'userf1', 'userf2')

1.4) Optionally, create a vector lab, documenting the contents of theta.


Step 2: Create a function to generate the SS matrices corresponding to the reparametrized
model.
Same as Step 2 of the general case but, to simplify it, one can use the toolbox functions for
formulation of SS models (e.g., thd2ss, see Chapter 3).
Step 3: If required, create a user function to generate the derivatives of the SS matrices.
Same as Step 3 of the general case but, to simplify it, one can use the toolbox functions for
computing the derivatives of an SS model (e.g., ss_dv or ss_dvp, see Chapter 8).
Step 4: Same as Step 4 of the general case.
Example 6.2 (Reparametrized transfer functions): Consider the transfer function:

             ω₀
    y_t = -------- u_t + a_t ,     V(a_t) = σ²_a                                       (6.7)
           1 + δB

and its corresponding steady-state gain, defined as:

    g = ω₀ / (1 + δ)
For many analyses the transfer function gain is more relevant than any other parameter. If this is the
case, the nonstandard parametrization:
           g (1 + δ)
    y_t = ----------- u_t + a_t ,     V(a_t) = σ²_a                                    (6.8)
            1 + δB

includes the gain as an explicit parameter. Note that the SS representation of models (6.7) and (6.8)
is the same. Therefore, (6.8) is a reparametrization of (6.7) and its definition can be done with the
help of the toolbox functions. On the other hand, estimation results are not the same, because (6.8)
allows one to obtain direct estimates of the gain and its standard deviation, and also to define
constraints on this parameter.
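As a numerical illustration of this equivalence (a Python sketch, outside the toolbox; the variable names are ours): with ω₀ = .8 and δ = -.4, the values used in the code below, the gain is g = .8/.6 ≈ 1.3333, and parametrizations (6.7) and (6.8) generate identical impulse responses:

```python
import numpy as np

w0, delta = 0.8, -0.4
g = w0 / (1 + delta)                     # steady-state gain of (6.7)

# impulse response of w0 / (1 + delta*B): h_k = w0 * (-delta)**k
h_67 = np.array([w0 * (-delta)**k for k in range(20)])
# same filter written as in (6.8): g*(1 + delta) / (1 + delta*B)
h_68 = np.array([g * (1 + delta) * (-delta)**k for k in range(20)])

assert np.isclose(g, 0.8 / 0.6)              # g ≈ 1.3333
assert np.allclose(h_67, h_68)               # (6.7) and (6.8) are the same model
assert np.isclose(h_67.sum(), g, atol=1e-3)  # impulse response sums to the gain
```
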
Step 1, as defined in the previous section, can be done as follows. First, formulate the transfer
function (6.7):
[oldtheta, olddin, oldlab] = tf2thd([],[],[],[],.1,1,[.8],[-.4]);

where oldtheta is [.8; -.4; .1]. Then, theta can be generated as follows:
theta = oldtheta;
theta(1,1) = oldtheta(1,1)/(1+oldtheta(2,1));

so the first element of theta is equal to the gain, and the model should be marked as a user model
calling the touser function:
din = touser(olddin, 'mymodel', 'mymoddv');

Note that we will need the functions mymodel.m and mymoddv.m


Last, we generate a vector of descriptive labels with the following statement:
lab=str2mat('gain',oldlab(2:3,:));

Step 2 consists of creating a function that receives theta and din and returns the matrices of the
SS formulation of (6.8). In this case, the SS formulations of (6.7) and (6.8) are the same. Hence, the
easiest way to code this function consists of rebuilding oldtheta and olddin from theta and din
and generating the SS model with a call to thd2ss. The code to do this is:
function [Phi, Gam, E, H, D, C, Q, S, R] = mymodel(theta, din)
% Converts theta to oldtheta
g = theta(1,1); delta = theta(2,1);
oldtheta = theta;
oldtheta(1,1) = g*(1+delta);
% Eliminates in din the user model flag
olddin = tomod(din);
% Obtains the SS matrices
[Phi, Gam, E, H, D, C, Q, S, R] = thd2ss(oldtheta, olddin);

Step 3 consists of creating a function that receives theta, din and i and returns the derivatives of
the SS matrices with respect to the i-th parameter of theta. Because of the equivalence between the
SS representations of (6.7) and (6.8), the easiest way to do this consists of: a) obtaining again
oldtheta and olddin from theta and din, b) defining the Jacobian of the reparametrization:

    J = ∂(ω₀, δ, σ_a)/∂(g, δ, σ_a) =

        [ ∂ω₀/∂g    ∂ω₀/∂δ    ∂ω₀/∂σ_a  ]   [ 1+δ   g   0 ]
        [ ∂δ/∂g     ∂δ/∂δ     ∂δ/∂σ_a   ] = [  0    1   0 ]
        [ ∂σ_a/∂g   ∂σ_a/∂δ   ∂σ_a/∂σ_a ]   [  0    0   1 ]
and c) generating the derivatives with a call to ss_dvp, see Chapter 8. This can be done with the
following code:
function [dPhi,dGam,dE,dH,dD,dC,dQ,dS,dR] = mymoddv(theta,din,i)
% Returns the derivatives in SS formulation of the reparametrized model
% with respect to the i-th parameter of theta
g = theta(1,1); delta = theta(2,1);
oldtheta = theta;
oldtheta(1,1)=g*(1+delta);
olddin = tomod(din);
% Define the Jacobian
J = [1 + delta, g, 0; ...
0,
1, 0; ...
0,
0, 1];
% Derive with respect to the i-th column of J
[dPhi,dGam,dE,dH,dD,dC,dQ,dS,dR] = ss_dvp(oldtheta, olddin, J(:,i));
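The Jacobian used above can be cross-checked by finite differences. This Python sketch (illustrative only; the helper names to_old and analytic_J are ours) differentiates the map (g, δ, σ_a) → (ω₀, δ, σ_a) = (g(1+δ), δ, σ_a) numerically and compares it with the analytical matrix:

```python
import numpy as np

def to_old(p):
    """Map new parameters (g, delta, sigma) to old ones (w0, delta, sigma)."""
    g, delta, sigma = p
    return np.array([g * (1 + delta), delta, sigma])

def analytic_J(p):
    """Jacobian of to_old, as written in the text."""
    g, delta, _ = p
    return np.array([[1 + delta,   g, 0.0],
                     [      0.0, 1.0, 0.0],
                     [      0.0, 0.0, 1.0]])

p = np.array([0.8 / 0.6, -0.4, np.sqrt(0.1)])   # values from the example
eps = 1e-6
# central finite differences, one column per parameter
J_num = np.column_stack([(to_old(p + eps * e) - to_old(p - eps * e)) / (2 * eps)
                         for e in np.eye(3)])
assert np.allclose(J_num, analytic_J(p), atol=1e-6)
```
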

In Step 4 the reparametrized model can be used for estimation, simulation, forecasting or any other
analysis. The complete code for an analysis with simulated data is the following:
e4init
sete4opt('econd','ml','vcond','lyap');
% *** Step 1
[oldtheta, olddin, oldlab] = tf2thd([],[],[],[],.1,1,[.8],[-.4]);
prtmod(oldtheta,olddin,oldlab);
% Data generation
u=randn(250,1);
y=simmod(oldtheta, olddin, 250, u);
u=u(51:250); y=y(51:250);
% Reparametrization
theta = oldtheta;
theta(1,1) = oldtheta(1,1)/(1+oldtheta(2,1));
din = touser(olddin, 'mymodel','mymoddv');
lab=str2mat('gain',oldlab(2:3,:));
prtmod(theta,din,lab);
% *** Step 4
% Compute ML estimates of the gain
% Note that the names of the user functions are fed to e4min and imod
[thopt,it,lval,g,h]=e4min('lffast',theta,'',din,[y u]);
[std,corrm,varm,Im]=imod(thopt,din,[y u]);
prtest(thopt,din,lab,[y u],it,lval,g,h,std,corrm);

Once the gain becomes an explicit parameter, its value can be constrained, see Chapter 3. For
example, the following code constrains the gain to its true value (which is known because the data
has been simulated) and reestimates the model:
% Constrain the gain to its true value
theta=[theta zeros(size(theta,1),1)]; theta(1,2)=1.;
prtmod(theta,din,lab);
% ... and then compute new estimates.
% Note that the derivatives user function is now different
din = touser(olddin, 'mymodel','mymoddvr');
[thopt,it,lval,g,H]=e4min('lffast',theta,'',din,[y u]);
[std,corrm,varm,Im]=imod(thopt,din,[y u]);
prtest(thopt,din,lab,[y u],it,lval,g,H,std,corrm);

Note that the function to obtain the SS matrices (mymodel) is the same as in the previous case, but
the derivatives function is different because of the constraints. Now it should be:
function [dPhi,dGam,dE,dH,dD,dC,dQ,dS,dR] = mymoddvr(theta,din,i)
% Returns the derivatives in SS formulation of the reparametrized model
% with respect to the i-th parameter of theta
% In this version, the gain value is constrained
g = theta(1,1); delta = theta(2,1);
oldtheta = theta;
oldtheta(1,1)=g*(1+delta);
olddin = tomod(din);
% Define the constrained Jacobian
J = [0 1 0; 0 0 1];
% Derive with respect to the i-th column of J
[dPhi,dGam,dE,dH,dD,dC,dQ,dS,dR] = ss_dvp(oldtheta, olddin, J(:,i));


7 Case studies
This chapter presents a set of case studies that illustrate the main features of E4. Most of these
examples cannot be processed by standard econometric packages but are easy to analyze with this
toolbox.
The first block of cases concentrates on the estimation of some common time-series models,
sometimes with not-so-common features. Thus, the first example applies E4 to the ARIMA modeling
of four well-known time series, comparing the estimates obtained with those computed with other
software packages. The second case focuses on VARMA modeling of the famous mink-muskrat series.
The third example shows how to estimate a standard transfer function, which is afterward
reparametrized to obtain direct estimates of a parameter - the noise model period - that is not
explicitly included in the conventional formulation. It also indicates how to obtain forecasts from a
composite formulation. In the fourth example a simultaneous equations econometric system is first
written in structural (str) form and then estimated by maximum likelihood, both with a complete
sample and with some artificial missing data. Last, the fifth example illustrates the specification and
estimation of models with GARCH errors.
Interpolation and extrapolation are illustrated in the second block of cases. Thus, the sixth example
shows how to use the Toolbox functions to forecast a time series of number of airline passengers and
compute short term objectives consistent with a medium term target value and with the series
dynamics. This application has a clear interest for management, as intermediate objectives provide
an effective way to monitor the progressive fulfilment of the target. The seventh case deals with
estimation of relationships between unequally spaced time series and shows how they can be used to
disaggregate a yearly time series into higher frequency data.
The last block of cases centers on models whose structure is close to the SS formulation. Thus, the
eighth example deals with the estimation of models with observation errors. Finally, the ninth
example shows how to define and estimate a structural time series model in direct SS form.
All the final estimates are computed by exact maximum likelihood. In most cases, the optimization
algorithm (e4min) is started from preliminary estimates computed with e4preest except in the
fourth example, in which the sample is too short.
A first-time reader should concentrate on block one, skipping the more complex parts of the second
and third examples. This should be enough to provide a good start in E4.


Univariate ARIMA examples


Even the simplest modeling exercise depends crucially on the software used, see McCullough and
Vinod (1999). In a paper titled "Adventures with ARIMA software", Newbold et al. (1994)
illustrated this idea by fitting ARIMA models, using different software packages, to the following
monthly series:
Series A: An index of electricity consumption.
Series B: Housing starts.
Series C: Housing sales.
Series D: The monthly sales of a company.
The first three series can be found in Pankratz (1991) and were represented by the following
MA(1)×(1)₁₂ model:

    z_t = ( 1 + θ₁B )( 1 + Θ₁B¹² ) ε_t                                                 (7.1)

Series D was taken from Chatfield and Prothero (1973) and can be represented by the following
ARMA(1,0)×(0,1)₁₂ model:

    ( 1 + φ₁B ) z_t = ( 1 + Θ₁B¹² ) ε_t                                                (7.2)

In all cases z_t = (1-B)(1-B¹²) x_t, where x_t is the data (in logs, when this transformation is
required), B is the backward shift operator and ε_t is an error assumed to be white noise.
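The transformation z_t = (1-B)(1-B¹²)x_t applied below by transdif amounts to a regular difference followed by a seasonal difference of order 12. A rough Python equivalent (illustrative only; transdif is the toolbox function, and the helper name reg_seas_diff is ours):

```python
import numpy as np

def reg_seas_diff(x, s=12):
    """Apply (1-B)(1-B^s) to a series: regular plus seasonal difference."""
    dx = np.diff(x)          # (1-B) x_t
    return dx[s:] - dx[:-s]  # (1-B^s) applied to the differenced series

x = np.log(np.arange(1, 40, dtype=float))   # any positive series, here in logs
z = reg_seas_diff(x)
assert len(z) == len(x) - 13                 # loses 1 + 12 observations
# check one value directly against the polynomial definition of z_t
t = 20
assert np.isclose(z[t - 13], x[t] - x[t-1] - x[t-12] + x[t-13])
```
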
The input code required to define and estimate the models for series A, B and C is:
% *** Series A. First read and transform the data
load seriesa.dat;
y=transdif(seriesa,0,1,1,12);
% Formulation and preestimation of the univariate model
[theta,din,lab]=arma2thd([],[],[0],[0],0,12);
theta=e4preest(theta,din,y);
prtmod(theta,din,lab);
% Optimize the likelihood function, compute the information matrix
% and print the results
[theta,it,lval,g,h]=e4min('lffast',theta,'',din,y);
[std,corrm,varm,Im]=imod(theta,din,y);
prtest(theta,din,lab,y,it,lval,g,h,std,corrm);
% *** Series B. Note that we do not define the THD model structure
% corresponding to series B and C, as it coincides with that of
% series A
load seriesb.dat;
y=transdif(seriesb,0,1,1,12);
theta=e4preest(theta,din,y);
prtmod(theta,din,lab);

[theta,it,lval,g,h]=e4min('lffast',theta,'',din,y);
[std,corrm,varm,Im]=imod(theta,din,y);
prtest(theta,din,lab,y,it,lval,g,h,std,corrm);
% *** Series C
load seriesc.dat;
y=transdif(seriesc,0,1,1,12);
theta=e4preest(theta,din,y);
prtmod(theta,din,lab);
[theta,it,lval,g,h]=e4min('lffast',theta,'',din,y);
[std,corrm,varm,Im]=imod(theta,din,y);
prtest(theta,din,lab,y,it,lval,g,h,std,corrm);

and the corresponding outputs are:


For series A:
******************** Results from model estimation ********************
Objective function: -252.5742
# of iterations:    11
Information criteria: AIC = -2.9889, SBC = -2.9329
Parameter   Estimate   Std. Dev.    t-test   Gradient
AR1(1,1)     -0.7218      0.0549  -13.1555     0.0001
AS1(1,1)     -0.8242      0.0740  -11.1418     0.0001
V(1,1)        0.0511      0.0029   17.7527    -0.0001
************************* Correlation matrix **************************
AR1(1,1)   1.00
AS1(1,1)  -0.01  1.00
V(1,1)     0.01  0.24  1.00
Condition number = 1.6235
Reciprocal condition number = 0.6481
***********************************************************************

For series B:
******************** Results from model estimation ********************
Objective function: -113.6077
# of iterations:    11
Information criteria: AIC = -1.8590, SBC = -1.7889
Parameter   Estimate   Std. Dev.    t-test   Gradient
AR1(1,1)     -0.2699      0.0888   -3.0408     0.0000
AS1(1,1)     -1.0000   3663.0371   -0.0003     0.0000
V(1,1)        0.0825    151.1795    0.0005    -0.0002
************************* Correlation matrix **************************
AR1(1,1)   1.00
AS1(1,1)   0.00  1.00
V(1,1)     0.00  1.00  1.00
Condition number = 3193548113.3231
Reciprocal condition number = 0.0000
***********************************************************************

And for series C:


******************** Results from model estimation ********************
Objective function: -113.6077
# of iterations:    11
Information criteria: AIC = -1.8590, SBC = -1.7889
Parameter   Estimate   Std. Dev.    t-test   Gradient
AR1(1,1)     -0.2699      0.0888   -3.0408     0.0000
AS1(1,1)     -1.0000   3663.0371   -0.0003     0.0000
V(1,1)        0.0825    151.1795    0.0005    -0.0002
************************* Correlation matrix **************************
AR1(1,1)   1.00
AS1(1,1)   0.00  1.00
V(1,1)     0.00  1.00  1.00
Condition number = 3193548113.3231
Reciprocal condition number = 0.0000
***********************************************************************

Note that the estimates of the last two models have a unit root in the seasonal moving average factor.
This is a clear symptom of overdifferencing and would require the cancellation of the seasonal
difference and the seasonal moving average factor.
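The cancellation works because, with Θ₁ = -1, the seasonal MA factor coincides with the seasonal difference. A quick polynomial check in Python (illustrative only):

```python
import numpy as np

# coefficients of (1 + Theta*B^12) with the estimated Theta = -1
theta_poly = np.zeros(13); theta_poly[0] = 1.0; theta_poly[12] = -1.0
# coefficients of the seasonal difference (1 - B^12)
seas_diff = np.zeros(13); seas_diff[0] = 1.0; seas_diff[12] = -1.0
assert np.allclose(theta_poly, seas_diff)
# hence (1-B)(1-B^12) x_t = (1 + theta*B)(1 - B^12) eps_t simplifies,
# after cancelling the common factor (1 - B^12), to
# (1-B) x_t = (1 + theta*B) eps_t
```
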
Finally, we will model series D with a richer residual diagnostic output. The code to estimate model
(7.2) and compute some standard diagnostic tests is the following:
% Series D
load seriesd.dat;
seriesd=log10(seriesd);
y=transdif(seriesd,1,1,1,12);
[theta,din,lab]=arma2thd([0],[],[],[0],0,12);
theta=e4preest(theta,din,y);
prtmod(theta,din,lab);
[theta,it,lval,g,h]=e4min('lffast',theta,'',din,y);
[std,corrm,varm,Im]=imod(theta,din,y);
prtest(theta,din,lab,y,it,lval,g,h,std,corrm);
[e,vt,wt,ve]=residual(theta,din,y);
titD='residuals from series D';
descser(e,titD);
plotsers(e,0,titD);
uidents(e,20,titD);

and the corresponding estimation and diagnosis output is:


******************** Results from model estimation ********************
Objective function: -72.2362
# of iterations:    15
Information criteria: AIC = -2.1636, SBC = -2.0624
Parameter   Estimate   Std. Dev.    t-test   Gradient
FR1(1,1)      0.4531      0.1119    4.0492     0.0000
AS1(1,1)     -0.7270      0.2023   -3.5928     0.0000
V(1,1)        0.0729      0.0075    9.7303     0.0000
************************* Correlation matrix **************************
FR1(1,1)   1.00
AS1(1,1)   0.00  1.00
V(1,1)    -0.01  0.51  1.00
Condition number = 3.0836
Reciprocal condition number = 0.3563
***********************************************************************
*****************      Descriptive statistics      *****************
--- Statistics of residuals from series D ---
Valid observations =      64
Mean               = -0.0009, t test = -0.1039
Standard deviation =  0.0680
Skewness           =  0.1186
Excess Kurtosis    = -0.1370
Quartiles          = -0.0490, -0.0007,  0.0400
Minimum value      = -0.1367, obs. #  11
Maximum value      =  0.1787, obs. #  51
Jarque-Bera        =  0.2000
Dickey-Fuller      = -4.3217, computed with  8 lags
Dickey-Fuller      = -7.5422, computed with  1 lags
Outliers list
Obs #      Value
   25     0.1442
   51     0.1787
************************************************************
[Figures: standardized plot of residuals from series D; A.C.F. of residuals from series D
(LBQ = 34.59); P.A.C.F. of residuals from series D]

The following table summarizes previous results and compares them with the ML1 and ML2
estimates reported in Newbold et al. (1994):

Package       Series A          Series B           Series C           Series D
             θ₁      Θ₁        θ₁      Θ₁         θ₁      Θ₁         φ₁      Θ₁
ML1        -.693   -.803     -.270   -.967      -.200   -.967       .454   -.725
          (.057)  (.076)    (.087)  (.601)     (.086)  (.724)     (.114)  (.195)
ML2        -.694   -.804     -.269  -1.000      -.216  -1.000       .453   -.727
          (.056)  (.075)    (.085) (153.6)     (.083)  (57.3)     (.113)  (.194)
E4         -.722   -.824     -.270  -1.000      -.270  -1.000       .453   -.727
          (.055)  (.074)    (.089) (3663.0)    (.089) (3663.0)    (.112)  (.202)

Notes: Figures in parentheses are standard errors. The ML1 and ML2 estimates are the opposite of
those reported in Newbold et al. (1994), to make them coherent with E4 standards.
The following commands compute and display six-months ahead forecasts in logs and levels, as in
Newbold et al. (1994, table 3):
% Compute six months ahead forecasts
% Note that the nonstationary version of the model is used
phi=[-1+theta(1) -theta(1)];
sphi=-1;
sth=theta(2);
v=theta(3);
[thetaf,dinf,labf]=arma2thd([phi],[sphi],[],[sth],[v],12);
prtmod(thetaf,dinf,labf);
[yf,bf]=foremod(thetaf,dinf,seriesd,6);
% Forecasts of log sales
[(1:6)' yf bf]
% Forecasts of sales
10.^yf

and the corresponding outputs are:


*************************** Model ***************************
VARMAX model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 12
SS vector dimension: 14
Parameters (* denotes constrained parameter):
FR1(1,1)   -0.5469
FR2(1,1)   -0.4531
FS1(1,1)   -1.0000
AS1(1,1)   -0.7270
V(1,1)      0.0729
*************************************************************
ans =
    1.0000    2.4513    0.0054
    2.0000    2.6300    0.0070
    3.0000    2.7402    0.0100
    4.0000    2.9335    0.0124
    5.0000    3.0507    0.0150
    6.0000    3.0775    0.0175
ans =
   1.0e+003 *
    0.2827
    0.4266
    0.5498
    0.8579
    1.1238
    1.1952
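The regular AR polynomial fed to arma2thd above, phi=[-1+theta(1) -theta(1)], is the expansion of (1-B)(1+φ₁B); with the estimate φ₁ = 0.4531 this produces the FR1 and FR2 values shown in the model listing. A quick check with numpy (illustrative only):

```python
import numpy as np

phi1 = 0.4531                    # estimate of the regular AR parameter
diff = np.array([1.0, -1.0])     # coefficients of (1 - B)
ar = np.array([1.0, phi1])       # coefficients of (1 + phi1*B)
prod = np.convolve(diff, ar)     # (1-B)(1+phi1*B) = 1 + (phi1-1)B - phi1*B^2
# E4 stores the coefficients of B and B^2: [phi1-1, -phi1]
assert np.allclose(prod, [1.0, phi1 - 1.0, -phi1])
print(prod[1:])                  # FR1 and FR2 values: -0.5469 and -0.4531
```
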

The code and data required to replicate this case can be found in the directory
\EXAMPLES\NEWBOLD of the distribution diskette, files newbold.m, seriesa.dat,
seriesb.dat, seriesc.dat and seriesd.dat.

VARMA modeling: interaction between minks and muskrats


The number of muskrat ( z₁t ) and mink ( z₂t ) skins traded annually by the Hudson's Bay Company
from 1848 to 1909 is a standard benchmark for multivariate methods, as the feedback interaction
between these series arises clearly from the fact that the mink is an important predator of the
muskrat. After a cross-correlation analysis, Jenkins and Alavi (1981) propose two alternative
VARMA models for the relationship between both series. The first is:

    [ 1 + φ₁₁⁽¹⁾B + φ₁₁⁽²⁾B² + φ₁₁⁽³⁾B³ + φ₁₁⁽⁴⁾B⁴              0              ] [ ∇ log z₁t ]
    [                                                                          ] [           ]
    [                 0                           1 + φ₂₂⁽¹⁾B + φ₂₂⁽²⁾B²       ] [  log z₂t  ]

          [ 1 + θ₁₁B      θ₁₂B   ] [ a₁t ]
        = [                      ] [     ]                                             (7.3)
          [    θ₂₁B    1 + θ₂₂B  ] [ a₂t ]

and the second:


    [ 1 + φ₁₁⁽¹⁾B + φ₁₁⁽²⁾B²              φ₁₂⁽¹⁾B + φ₁₂⁽²⁾B²            ] [ ∇ log z₁t ]   [ a₁t ]
    [                                                                   ] [           ] = [     ]   (7.4)
    [     φ₂₁⁽¹⁾B + φ₂₁⁽²⁾B²      1 + φ₂₂⁽¹⁾B + φ₂₂⁽²⁾B² + φ₂₂⁽³⁾B³     ] [  log z₂t  ]   [ a₂t ]

The estimation of the first model with E4 requires the code:


% Bivariate modelling of the mink-muskrat series
% Model (5.8) of Jenkins and Alavi (1981) pag. 37
% The data is already in logs
e4init
load mink.dat;
z1=transdif(mink(:,3),1,1);
z2=mink(:,2)-mean(mink(:,2));
z=[z1 z2(2:62)];
% Define the parameter matrices and generate
% the THD representation
phi1 = [0 NaN; NaN 0];
phi2 = [0 NaN; NaN 0];
phi3 = [0 NaN; NaN NaN];
phi4 = [0 NaN; NaN NaN];
theta= [0 0; 0 0];
sigma= [0 0; 0 0];
[theta,din,lab]=arma2thd([phi1 phi2 phi3 phi4],[],[theta],[],sigma,1);
% Compute preliminary estimates
theta=e4preest(theta,din,z);
prtmod(theta,din,lab);
% Compute ML estimates, information matrix and print the results
[thopt,it,l,g,H]=e4min('lffast',theta,'',din,z);
[std,corrm,varm,Im]=imod(thopt,din,z);
prtest(thopt,din,lab,z,it,l,g,H,std,corrm);

which yields:
******************** Results from model estimation ********************
Objective function: -9.0059
# of iterations:    33
Information criteria: AIC = 0.1310, SBC = 0.5808
Parameter   Estimate   Std. Dev.    t-test   Gradient
FR1(1,1)     -0.6887      0.1364   -5.0510     0.0000
FR1(2,2)     -1.2680      0.1098  -11.5433     0.0000
FR2(1,1)      0.5941      0.1403    4.2346     0.0000
FR2(2,2)      0.5593      0.0921    6.0712     0.0000
FR3(1,1)     -0.0682      0.1171   -0.5821     0.0000
FR4(1,1)      0.2816      0.0856    3.2894     0.0000
AR1(1,1)     -0.2966      0.1606   -1.8468     0.0000
AR1(2,1)      0.6027      0.0804    7.4997     0.0000
AR1(1,2)     -0.8642      0.1387   -6.2299     0.0000
AR1(2,2)     -0.8352      0.1529   -5.4615     0.0000
V(1,1)        0.0639      0.0117    5.4854     0.0000
V(2,1)        0.0191      0.0072    2.6502     0.0000
V(2,2)        0.0423      0.0078    5.4491     0.0000
************************* Correlation matrix **************************
FR1(1,1)   1.00
FR1(2,2)  -0.02  1.00
FR2(1,1)  -0.68  0.34  1.00
FR2(2,2)  -0.23 -0.84 -0.15  1.00
FR3(1,1)   0.39 -0.19 -0.89  0.14  1.00
FR4(1,1)   0.35  0.15  0.35 -0.17 -0.65  1.00
AR1(1,1)   0.70 -0.36 -0.35  0.04  0.07  0.39  1.00
AR1(2,1)  -0.17  0.28  0.39 -0.35 -0.26  0.12  0.09  1.00
AR1(1,2)  -0.51  0.40  0.40 -0.38 -0.19 -0.22 -0.58  0.25  1.00
AR1(2,2)  -0.44  0.69  0.35 -0.45 -0.12 -0.21 -0.70  0.00  0.64  1.00
V(1,1)     0.00  0.03 -0.01 -0.03  0.01 -0.01  0.00 -0.02  0.02  0.02  1.00
V(2,1)    -0.01  0.02  0.00 -0.02  0.00 -0.01 -0.01  0.00  0.03  0.03  0.48  1.00
V(2,2)     0.01  0.01 -0.01 -0.02  0.01 -0.01  0.00 -0.02  0.02  0.02  0.12  0.48  1.00
Condition number = 333.3643
Reciprocal condition number = 0.0025
***********************************************************************

and the code to estimate the second model is:


% Bivariate modelling of the mink-muskrat series
% Model (5.9) of Jenkins and Alavi (1981) pag. 39
e4init
load mink.dat;
z1=transdif(mink(:,3),1,1);
z2=mink(:,2)-mean(mink(:,2));
z=[z1 z2(2:62)];
% Define the parameter matrices and generate the THD representation
phi1= [ 0 0; 0 0];
phi2= [ 0 0; 0 0];
phi3= [NaN NaN; NaN 0];
sigma=[ 0 0; 0 0];
[theta,din,lab]=arma2thd([phi1 phi2 phi3],[],[],[],sigma,1);
% Compute preliminary estimates
theta=e4preest(theta,din,z);
prtmod(theta,din,lab);
% Compute ML estimates, information matrix and print the results
[thopt,it,l,g,H]=e4min('lffast',theta,'',din,z);
[std,corrm,varm,Im]=imod(thopt,din,z);
prtest(thopt,din,lab,z,it,l,g,H,std,corrm);

with the output:


******************** Results from model estimation ********************
Objective function: -3.9049
# of iterations:    21
Information criteria: AIC = 0.2654, SBC = 0.6807
Parameter   Estimate   Std. Dev.    t-test   Gradient
FR1(1,1)     -0.2485      0.1175   -2.1160     0.0000
FR1(2,1)     -0.4247      0.1254   -3.3876     0.0000
FR1(1,2)      0.7314      0.1211    6.0401     0.0000
FR1(2,2)     -0.7892      0.1194   -6.6106     0.0000
FR2(1,1)      0.1808      0.1033    1.7502     0.0000
FR2(2,1)      0.3000      0.1164    2.5771     0.0000
FR2(1,2)     -0.3696      0.1509   -2.4490     0.0000
FR2(2,2)     -0.2818      0.1969   -1.4315     0.0000
FR3(2,2)      0.5578      0.1421    3.9245     0.0000
V(1,1)        0.0593      0.0108    5.5058     0.0000
V(2,1)        0.0179      0.0077    2.3382     0.0000
V(2,2)        0.0542      0.0098    5.5077     0.0000
************************* Correlation matrix **************************
FR1(1,1)   1.00
FR1(2,1)   0.29  1.00
FR1(1,2)  -0.19 -0.06  1.00
FR1(2,2)  -0.06 -0.29  0.30  1.00
FR2(1,1)   0.05  0.02 -0.38 -0.11  1.00
FR2(2,1)   0.01 -0.19 -0.10 -0.15  0.27  1.00
FR2(1,2)   0.55  0.16 -0.72 -0.21  0.43  0.11  1.00
FR2(2,2)   0.13  0.66 -0.16 -0.69  0.10 -0.10  0.23  1.00
FR3(2,2)   0.00 -0.44  0.01  0.29  0.00  0.53  0.00 -0.69  1.00
V(1,1)     0.01  0.01  0.00  0.00 -0.01 -0.01  0.00  0.00 -0.01  1.00
V(2,1)     0.00  0.01  0.01  0.01 -0.01 -0.01  0.00  0.00 -0.01  0.42  1.00
V(2,2)     0.00  0.01  0.01  0.02  0.00 -0.02 -0.01  0.01 -0.03  0.09  0.42  1.00
Condition number = 37.2248
Reciprocal condition number = 0.0232
***********************************************************************

Comparing the information criteria for both models, one concludes that the VARMA describes the
sample slightly better than the VAR.
E4 includes several functions for model validation and diagnosis. The following code computes the
residuals of the VARMA specification, performs a descriptive multiple autocorrelation analysis and,
finally, displays a standardized plot:
% Validation
[ehat,vT,wT,vz1,vvT,vwT]=residual(thopt,din,z);
tit=str2mat('muskrat residuals','mink residuals');
descser(ehat,tit);
midents(ehat,10,tit);
plotsers(ehat,0,tit);

The residual descriptive statistics are:


*****************      Descriptive statistics      *****************
--- Statistics of muskrat residuals ---
Valid observations =      61
Mean               =  0.0164, t test =  0.5320
Standard deviation =  0.2412
Skewness           = -0.0707
Excess Kurtosis    = -0.5588
Quartiles          = -0.1659,  0.0122,  0.2052
Minimum value      = -0.5792, obs. #  58
Maximum value      =  0.5555, obs. #  52
Jarque-Bera        =  0.8446
Dickey-Fuller      = -3.1759, computed with  7 lags
Dickey-Fuller      = -7.3740, computed with  1 lags
Outliers list
Obs #      Value
   52     0.5555
   56    -0.4752
   58    -0.5792
--- Statistics of mink residuals ---
Valid observations =      61
Mean               =  0.0101, t test =  0.3395
Standard deviation =  0.2319
Skewness           = -0.4582
Excess Kurtosis    =  0.1015
Quartiles          = -0.1235,  0.0322,  0.1641
Minimum value      = -0.6061, obs. #  20
Maximum value      =  0.5475, obs. #  35
Jarque-Bera        =  2.1603
Dickey-Fuller      = -2.9948, computed with  7 lags
Dickey-Fuller      = -8.1683, computed with  1 lags
Outliers list
Obs #      Value
   20    -0.6061
   35     0.5475
   49    -0.4905
   58    -0.5198
Sample correlation matrix
   1.0000    0.3414
   0.3414    1.0000
Eigen structure of the correlation matrix
 i  eigenval  %var  |  Eigen vectors
 1    1.3414  0.67  |   0.7071  0.7071
 2    0.6586  0.33  |   0.7071 -0.7071
************************************************************

The descriptive analysis indicates that the mean of both series is not statistically different from
zero and that the empirical distribution of the residuals is consistent with the normality assumption,
as the skewness, excess kurtosis and Jarque-Bera statistics are small. Perhaps there are too many
values exceeding two standard deviations from the mean, which should be further investigated. On the
other hand, the multiple autocorrelation analysis does not reveal any sign of misspecification,
except a high value of the Ljung-Box Q statistic in the (2,2) position, revealing that there could be
some autocorrelation in the mink residuals:
******** Autocorrelation and partial autoregression functions ********
                MACF | MPARF |     MACF     |     MPARF
k =  1, Chi(k) =  3.85, AIC(k) = -5.79, SBC(k) = -5.59
muskrat re   ..  |  ..  |   0.03  0.08 |  0.00  0.08
mink resid   ..  |  ..  |   0.02 -0.06 |  0.04 -0.07
k =  2, Chi(k) =  3.01, AIC(k) = -5.72, SBC(k) = -5.37
muskrat re   ..  |  ..  |  -0.05 -0.03 | -0.06 -0.01
mink resid   ..  |  ..  |   0.05 -0.15 |  0.12 -0.20
k =  3, Chi(k) =  7.93, AIC(k) = -5.73, SBC(k) = -5.25
muskrat re   ..  |  +.  |   0.21 -0.07 |  0.29 -0.16
mink resid   ..  |  ..  |   0.21  0.10 |  0.23 -0.03
k =  4, Chi(k) =  3.49, AIC(k) = -5.67, SBC(k) = -5.04
muskrat re   ..  |  ..  |  -0.01 -0.07 |  0.01 -0.14
mink resid   ..  |  ..  |   0.09 -0.01 |  0.13 -0.11
k =  5, Chi(k) =  7.41, AIC(k) = -5.68, SBC(k) = -4.92
muskrat re   ..  |  ..  |  -0.02  0.17 | -0.03  0.12
mink resid   ..  |  +.  |   0.19  0.00 |  0.31 -0.09
k =  6, Chi(k) = 10.11, AIC(k) = -5.76, SBC(k) = -4.86
muskrat re   ..  |  ..  |  -0.06  0.05 | -0.19  0.11
mink resid   ..  |  .-  |  -0.09 -0.25 |  0.02 -0.34
k =  7, Chi(k) =  7.60, AIC(k) = -5.79, SBC(k) = -4.75
muskrat re   ..  |  ..  |  -0.20 -0.08 | -0.30  0.10
mink resid   ..  |  ..  |   0.01  0.09 |  0.08 -0.05
k =  8, Chi(k) = 10.06, AIC(k) = -5.89, SBC(k) = -4.71
muskrat re   ..  |  ..  |  -0.01 -0.15 |  0.10 -0.29
mink resid   ..  |  ..  |  -0.13 -0.12 | -0.21 -0.30
k =  9, Chi(k) =  4.58, AIC(k) = -5.86, SBC(k) = -4.55
muskrat re   .+  |  ..  |   0.04  0.27 | -0.05  0.25
mink resid   ..  |  ..  |   0.00  0.15 |  0.12  0.18
k = 10, Chi(k) = 17.30, AIC(k) = -6.16, SBC(k) = -4.70
muskrat re   ..  |  ..  |  -0.24 -0.01 | -0.44  0.03
mink resid   .+  |  .+  |   0.14  0.36 |  0.07  0.25
The (i,j) element of the lag k matrix is the cross correlation (MACF)
or partial autoregression (MPACF) estimate when series j leads series i.
********************* Cross correlation functions *********************
            muskrat re  mink resid
muskrat re  ..........  ........+.
mink resid  ..........  .........+
Each row is the cross correlation function when the column variable leads
the row variable.
Ljung-Box Q statistic for previous cross-correlations
            muskrat re  mink resid
muskrat re       10.73       10.76
mink resid        9.28       19.83
Summary in terms of +/-/.  For MACF std. deviations are computed as 1/T^0.5 = 0.13
For MPARF std. deviations are computed from VAR(k) model
***********************************************************************

Lastly, the time-series plots of the residuals are again quite satisfactory.
[Figure: standardized plots of muskrat and mink residuals; both series stay within about 2.5 standard deviations of zero.]

Further elaborations of this example may arise from inspection of the impulse-response functions of
both variables, or by using these results to build a discrete time analogue of the Lotka-Volterra
equations, as Jenkins and Alavi suggest.
The code and data required to replicate this case can be found in the directory
\EXAMPLES\MINKS of the distribution diskette, files mink.dat, mink1.m and mink2.m

Transfer function analysis


Very often, relevant characteristics of the dynamic behaviour of a time series are not measured by explicit parameters of a standard model, but by functions of those parameters. In this example we show how user functions can be applied to reparametrize a transfer function, so that an implicit parameter becomes explicit and can be subjected to standard estimation and testing procedures.

&KDS  3DJ 

Unconstrained transfer function modeling


McLeod (1982) builds a transfer function model relating the consumption of petrochemical products to a seasonally adjusted UK industrial production index, using data from the first quarter of 1958 to the fourth quarter of 1976. The model is:

    ln PC_t = ω₁₀ ln IP_t + (ω₂₀ + ω₂₁B + ω₂₂B²) ξ_t^{IV/74} + N_t
    (1 + Φ₁B⁴ + Φ₂B⁸) ∇∇₄ N_t = (1 + θ₁B)(1 + Θ₁B⁴) a_t                (7.5)

where PC denotes consumption of petrochemical products, IP is the industrial production index, ∇ = 1−B, ∇₄ = 1−B⁴, and ξ_t^{IV/74} is an impulse variable which models the beginning of a period of destocking in the fourth quarter of 1974. The code to estimate this model is:
e4init
load petro.dat;
petro(:,1:2) = log(petro(:,1:2));
y = transdif(petro,1,1,1,4);
% Defines the structure of the transfer function
sar = [0 0]; ma = [0]; sma = [0]; v = [0];
w
= [0 NaN NaN; 0 0 0];
[theta, din, lab] = tf2thd([],[sar],[ma],[sma],[v],4,[w],[]);
% Computes preliminary estimates
theta=e4preest(theta,din,y);
prtmod(theta,din,lab);
% ... and ML estimates
[thopt,it,lval,g,h]=e4min('lffast', theta, '', din, y);
[std,corrm,varm,Im]=imod(thopt,din,y);
prtest(thopt,din,lab,y,it,lval,g,h,std,corrm);
% Computes the period of the noise model
period = 2*pi/acos(-thopt(1,1)/(2*sqrt(thopt(2,1))));
disp(sprintf('period = %4.2f years', period));

Note that the last two statements compute and display the period of the seasonal AR(2) noise process. This code yields the following output:
******************** Results from model estimation ********************
Objective function: -173.6745
# of iterations:     48
Information criteria: AIC = -4.6387, SBC = -4.3519
Parameter      Estimate   Std. Dev.     t-test   Gradient
FS(1,1)          0.1521      0.1307     1.1634     0.0000
FS(1,2)          0.2660      0.1222     2.1768     0.0000
AR(1,1)         -0.5450      0.0994    -5.4824     0.0000
AS(1,1)         -0.7781      0.1201    -6.4776     0.0000
W1(1,1)          1.3966      0.1089    12.8191     0.0000
W2(1,1)         -0.0350      0.0182    -1.9299     0.0000
W2(2,1)         -0.1150      0.0187    -6.1517     0.0000
W2(3,1)         -0.0478      0.0185    -2.5888     0.0000
V(1,1)           0.0200      0.0017    11.7627     0.0000
************************* Correlation matrix **************************
FS(1,1)   1.00
FS(1,2)   0.34  1.00
AR(1,1)   0.06  0.00  1.00
AS(1,1)   0.52  0.42 -0.05  1.00
W1(1,1)  -0.03 -0.01  0.00 -0.02  1.00
W2(1,1)   0.00  0.00  0.00  0.00 -0.01  1.00
W2(2,1)   0.01  0.01  0.00  0.01 -0.14  0.25  1.00
W2(3,1)  -0.01  0.00  0.00  0.00  0.18  0.17  0.22  1.00
V(1,1)    0.01 -0.02  0.01  0.12  0.00  0.00  0.00  0.00  1.00
Condition number = 4.2496
Reciprocal condition number = 0.2787
***********************************************************************
period = 3.66 years
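The period printed above can be checked outside MATLAB; a minimal Python sketch of the same computation (Φ₁ and Φ₂ are the FS(1,1) and FS(1,2) estimates; the result is in years because the seasonal lag is one year):

```python
import math

Phi1, Phi2 = 0.1521, 0.2660  # FS(1,1) and FS(1,2) from the output above
period = 2 * math.pi / math.acos(-Phi1 / (2 * math.sqrt(Phi2)))
print(round(period, 2))  # 3.66
```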

A standard validation output can be obtained with the following statements:


% Validation
[ehat,vT,wT,vz1,vvT,vwT]=residual(thopt,din,y);
descser(ehat,'petro. consumption residuals');
uidents(ehat,10,'petro. consumption residuals');
plotsers(ehat,0,'petro. consumption residuals');

which generate the following outputs:


*****************  Descriptive statistics  *****************

--- Statistics of petro. consumption residuals ---
Valid observations  =      71
Mean                = -0.0007, t test = -0.3021
Standard deviation  =  0.0201
Skewness            = -0.3747
Excess Kurtosis     = -0.1037
Quartiles           = -0.0134,  0.0000,  0.0123
Minimum value       = -0.0522, obs. # 41
Maximum value       =  0.0388, obs. # 36
Jarque-Bera         =  1.6931
Dickey-Fuller       = -3.4079, computed with 8 lags
Dickey-Fuller       = -8.6135, computed with 1 lags
Outliers list
  Obs #     Value
     41   -0.0522
     49   -0.0512
     69   -0.0431
************************************************************
[Figure: A.C.F. of petro. consumption residuals (LBQ = 6.25), P.A.C.F. of petro. consumption residuals, and standardized plot of the residuals.]

These results do not suggest any alternative specification, so the model appears to be statistically
adequate.


Period estimation
Previous estimates imply that the autoregressive factor of the noise model is periodic, with a cycle of 3.66 years. With E4, a simple reparametrization allows one to estimate this period and its standard deviation and, if required, to constrain its value; see Terceiro and Gómez (1985). To do this, it is first necessary to build a user function that receives the period in theta, computes Φ₂ and returns the SS formulation of the model. The relationship that links Φ₂ with the value of Φ₁ and the period p is:

    Φ₂ = [ -Φ₁ / (2 cos(2π/p)) ]²                                      (7.6)

The user function that does the transformation is:


function [Phi, Gam, E, H, D, C, Q, S, R] = pcons1(thetan, dinn)
% SS formulation with the period as explicit parameter.
% It is stored in theta(2,1).
theta = thetan;
theta(2,1) = (-theta(1,1)/(2*cos(2*pi/thetan(2,1))))^2;
din = tomod(dinn);
[Phi, Gam, E, H, D, C, Q, S, R] = thd2ss(theta,din);

and the previous command file needs the following addition:


% Explicit period estimation
thetan = thopt;
thetan(2,1) = 3.7;
labn = str2mat(lab(1,:),'Period',lab(3:size(lab,1),:) );
dinn = touser(din,'pcons1');
[theta,it,lval,g,h]=e4min('lffast',thetan,'',dinn,y);
prtest(theta,din,labn,y,it,lval,g,h);

The corresponding output is:


******************** Results from model estimation ********************
Objective function: -173.6745
# of iterations:     34
Information criteria: AIC = -4.6387, SBC = -4.3519
Parameter      Estimate  Appr.Std.Dev.     t-test   Gradient
FS(1,1)          0.1521         0.1360     1.1179     0.0000
Period           3.6556         0.2958    12.3579     0.0000
AR(1,1)         -0.5450         0.1291    -4.2218     0.0000
AS(1,1)         -0.7781         0.1108    -7.0221     0.0000
W1(1,1)          1.3966         0.1252    11.1538     0.0000
W2(1,1)         -0.0350         0.0183    -1.9136     0.0002
W2(2,1)         -0.1150         0.0185    -6.2061    -0.0001
W2(3,1)         -0.0478         0.0189    -2.5291     0.0003
V(1,1)           0.0200         0.0016    12.1620     0.0028
************************* Correlation matrix **************************
FS(1,1)   1.00
Period   -0.96  1.00
AR(1,1)   0.45 -0.54  1.00
AS(1,1)   0.40 -0.31  0.09  1.00
W1(1,1)  -0.34  0.43 -0.47 -0.11  1.00
W2(1,1)   0.18 -0.26  0.25 -0.04 -0.21  1.00
W2(2,1)   0.29 -0.35  0.31 -0.07 -0.31  0.39  1.00
W2(3,1)  -0.17  0.21 -0.11  0.28  0.20  0.08  0.03  1.00
V(1,1)    0.05 -0.05  0.06  0.20 -0.05  0.03  0.02  0.09  1.00
Condition number = 137.4069
Reciprocal condition number = 0.0060
***********************************************************************

Note that the standard error for the period is 0.3 years. Not surprisingly, the point estimate of the
period and the optimal value of the objective function are equal to those obtained from the standard
representation.

Estimation of a constrained transfer function


With the last parametrization it is straightforward to impose any value of the period. If extraneous
information suggests that the period is exactly equal to three years, this constraint can be imposed
with the following code:
thetan=theta(:,1);
thetan(2,1)=3;
thetan=[thetan, [0; 1; 0; 0; 0; 0; 0; 0; 0]];
dinn=touser(din,'pcons1');
[theta,it,lval,g,h]=e4min('lffast',thetan,'',dinn,y);
prtest(theta,din,labn,y,it,lval,g,h);

which yields:
******************** Results from model estimation ********************
Objective function: -172.2959
# of iterations:     33
Information criteria: AIC = -4.6281, SBC = -4.3731
Parameter      Estimate  Appr.Std.Dev.     t-test   Gradient
FS(1,1)          0.2097         0.1876     1.1182     0.0000
Period     *     3.0000         0.0000     0.0000     0.0000
AR(1,1)         -0.4542         0.0960    -4.7311     0.0000
AS(1,1)         -0.8419         0.1354    -6.2185     0.0000
W1(1,1)          1.3176         0.1054    12.5022     0.0000
W2(1,1)         -0.0236         0.0179    -1.3182     0.0000
W2(2,1)         -0.1054         0.0188    -5.6168     0.0000
W2(3,1)         -0.0548         0.0192    -2.8483     0.0000
V(1,1)           0.0204         0.0017    12.1003     0.0001
* denotes constrained parameter
************************* Correlation matrix **************************
FS(1,1)   1.00
AR(1,1)   0.24  1.00
AS(1,1)   0.63  0.25  1.00
W1(1,1)  -0.10 -0.23 -0.15  1.00
W2(1,1)  -0.06  0.04 -0.06  0.03  1.00
W2(2,1)   0.21  0.20  0.03 -0.08  0.34  1.00
W2(3,1)  -0.13  0.12  0.23  0.12  0.23  0.22  1.00
V(1,1)    0.02 -0.01  0.23 -0.06  0.01 -0.06  0.07  1.00
Condition number = 9.5355
Reciprocal condition number = 0.0925
***********************************************************************


Composite model forecasts


To compute forecasts for the endogenous variable of a transfer function - here, consumption of petrochemical products - it is common to use a two-stage approximation. The first stage consists of computing forecasts for the inputs, often with univariate models. The second stage consists of forecasting the endogenous variable by feeding the transfer function with the first-stage forecasts. This procedure ignores the stochastic nature of the input forecasts and, therefore, underestimates the standard error of the output forecasts.
With E4 it is not necessary to proceed in this way, as it is possible to compute simultaneous forecasts
for both variables. To see this, consider the input model proposed by McLeod (1982):
    (1 + φ₁B)(1 + Φ₁B⁴) ∇ ln IP_t = a_t                                (7.7)

To formulate a simultaneous model for the input and output, we first have to compensate for the additional seasonal difference in the transfer function. This can be done by adding to the univariate model a seasonal difference and a seasonal moving average with a parameter equal to one, that is:

    (1 + φ₁B)(1 + Φ₁B⁴) ∇∇₄ ln IP_t = (1 - B⁴) a_t                     (7.8)
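The compensation works because the factor (1 - B⁴) added to both sides of (7.8) cancels, leaving (7.7) unchanged. A small check of this polynomial identity, sketched in Python with numpy (the φ₁ and Φ₁ values are the input-model estimates reported below):

```python
import numpy as np

phi1, Phi1 = -0.2124, 0.1565
d4 = np.array([1.0, 0, 0, 0, -1.0])                   # seasonal difference (1 - B^4)
ar77 = np.polymul([1.0, phi1], [1.0, 0, 0, 0, Phi1])  # AR side of (7.7)
ar78 = np.polymul(ar77, d4)                           # AR side of (7.8)
q, r = np.polydiv(ar78, d4)                           # divide the common factor out
print(np.allclose(q, ar77), np.allclose(r, 0))        # True True: (7.7) is recovered
```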
We can estimate this model with the code:
e4init;
load petro.dat;
petro = log(petro(:,1:2));
% Filters deterministic effects using the estimates of model (7.5)
petro(68,1)=petro(68,1)+.0350;
petro(69,1)=petro(69,1)+.1150;
petro(70,1)=petro(70,1)+.0478;
y = transdif(petro,1,1,1,4);
% Input model definition and estimation
[t2, d2, l2] = arma2thd([0],[0],[],[0],[0],4);
t2(3,1)=-1.;
t2 = [t2 [0; 0; 1; 0]];
t2=e4preest(t2,d2,y(:,2));
[t2,it,lval,g,h]=e4min('lffast',t2, '',d2,y(:,2));
prtest(t2,d2,l2,y(:,2),it,lval,g,h)

which yields the following output:


******************** Results from model estimation ********************
Objective function: -177.9523
# of iterations:     17
Information criteria: AIC = -4.9282, SBC = -4.8326
Parameter      Estimate  Appr.Std.Dev.     t-test   Gradient
FR1(1,1)        -0.2124         0.1187    -1.7896     0.0000
FS1(1,1)         0.1565         0.1204     1.3006     0.0000
AS1(1,1)   *    -1.0000         0.0000     0.0000     0.0000
V(1,1)           0.0180         0.0015    11.9012    -0.0004
* denotes constrained parameter
************************* Correlation matrix **************************
FR1(1,1)   1.00
FS1(1,1)  -0.20  1.00
V(1,1)     0.03 -0.08  1.00
Condition number = 1.5466
Reciprocal condition number = 0.6364
***********************************************************************

Note that the first part of the code filters out the deterministic effect ξ_t^{IV/74}; see Eq. (7.5). We define the composite model using as starting values the estimates contained in t1 (for the transfer function) and in t2 (for the input model):

% TF definition and estimation


sar = [0 0]; ma = [0]; sma = [0]; v = [0];
w
= [0];
[t1, d1, l1] = tf2thd([],[sar],[ma],[sma],[v],4,[w],[]);
t1=e4preest(t1,d1,y);
[t1,it,lval,g,h]=e4min('lffast',t1,'',d1,y);
prtest(t1,d1,l1,y,it,lval,g,h)
% Composite model formulation
[theta,din,lab]=stackthd(t1,d1,t2,d2,l1,l2);
[theta,din,lab] = nest2thd(theta,din,1,lab);
prtmod(theta,din,lab);

which yields:
*************************** Model ***************************
Nested model in inputs (innovations model)
2 endogenous v., 0 exogenous v.
Seasonality: 4
SS vector dimension: 13
Submodels:
{
Transfer function model (innovations model)
1 endogenous v., 1 exogenous v.
Seasonality: 4
SS vector dimension: 8
Parameters (* denotes constrained parameter):
FS(1,1)     0.1520
FS(1,2)     0.2660
AR(1,1)    -0.5450
AS(1,1)    -0.7781
W1(1,1)     1.3966
V(1,1)      0.0200
--------------
VARMAX model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 4
SS vector dimension: 5
Parameters (* denotes constrained parameter):
FR1(1,1)   -0.2124
FS1(1,1)    0.1565
AS1(1,1) * -1.0000
V(1,1)      0.0180
--------------
}
*************************************************************

To obtain forecasts for the stationary variables, two different methods are compared: the standard one, in which the forecasts for the input variable are treated as known when forecasting the output, and the method used in the toolbox, in which the composite model provides simultaneous forecasts for both the input and the output, with the variances computed accordingly. These calculations can be done with the following code:
% Forecasting. First, conventional approach ...
[xf,Bfx] = foremod(t2,d2,y(:,2),8);
[yf1,Bfy1] = foremod(t1,d1,y,8,xf);
[yf1 xf sqrt(Bfy1) sqrt(Bfx)]
% ... and then, composite model forecasts
[yf,Bf] = foremod(theta,din,y,8);
[yf sqrt([Bf(1:2:2*8,1) Bf(2:2:2*8,2)])]

The conventional approach forecasts for the output and the input, with the corresponding standard
errors, are:
ans =
   -0.0350   -0.0123    0.0200    0.0185
    0.0436   -0.0068    0.0227    0.0189
   -0.0189    0.0058    0.0227    0.0189
    0.0250   -0.0015    0.0227    0.0189
   -0.0140    0.0016    0.0294    0.0278
    0.0071    0.0010    0.0311    0.0281
   -0.0090   -0.0009    0.0311    0.0282
    0.0088    0.0002    0.0311    0.0282

and the forecasts computed with the composite model are:


ans =
   -0.0350   -0.0123    0.0327    0.0185
    0.0436   -0.0068    0.0348    0.0189
   -0.0189    0.0058    0.0349    0.0189
    0.0250   -0.0015    0.0349    0.0189
   -0.0140    0.0016    0.0487    0.0278
    0.0071    0.0010    0.0501    0.0281
   -0.0090   -0.0009    0.0501    0.0282
    0.0088    0.0002    0.0501    0.0282

As expected, the forecast standard errors of the endogenous variable are higher when computed
using the composite model. The other results are the same.
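The size of the difference is easy to rationalize. For a one-step-ahead forecast with a purely contemporaneous input effect, the composite-model variance is approximately the conventional variance plus the input-forecast variance scaled by the squared gain. A back-of-the-envelope check in Python with the figures above:

```python
import math

w10 = 1.3966        # estimated gain W1(1,1)
se_noise = 0.0200   # conventional one-step-ahead output std. error
se_input = 0.0185   # one-step-ahead input std. error
se_composite = math.sqrt(se_noise ** 2 + (w10 * se_input) ** 2)
print(round(se_composite, 4))  # 0.0327, matching the composite-model output
```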
The code and data required to replicate this case can be found in the directory
\EXAMPLES\PETROL of the distribution diskette, files petro.dat, petrol1.m, petrol2.m
and pcons1.m.

Structural econometric models: supply and demand of food


Kmenta (1997) proposes a simple supply-demand model to explain the consumption and prices of food, inspired by previous work of Girshick and Haavelmo (1947). In this section we use this model to illustrate the application of E4 to structural (str) formulations. The model includes two behavioural equations:
behavioural equations:
    Demand:  Q_t = α₁ + α₂ P_t + α₃ D_t + u₁t                          (7.9)
    Supply:  Q_t = β₁ + β₂ P_t + β₃ F_t + β₄ A_t + u₂t                 (7.10)

The endogenous variables are Q_t, food consumption per head, and P_t, the ratio of food prices to general consumer prices. The exogenous variables are the constant term, disposable income in constant prices D_t, the ratio of the preceding year's prices received by farmers to general consumer prices F_t, and time A_t. The sample includes 20 yearly observations of all the variables, and was taken from Kmenta (1997).

Maximum likelihood estimation

The structural model (7.9)-(7.10) in matrix notation is:

    [ 1  -α₂ ] [ Q_t ]   [ α₁  α₃  0   0  ] [ 1   ]   [ u₁t ]
    [ 1  -β₂ ] [ P_t ] = [ β₁  0   β₃  β₄ ] [ D_t ] + [ u₂t ]          (7.11)
                                            [ F_t ]
                                            [ A_t ]

    V [ u₁t ]   [ v₁   c₁₂ ]
      [ u₂t ] = [ c₁₂  v₂  ]                                           (7.12)

Note that the left-hand-side matrix in (7.11) is not normalized; therefore we cannot use str2thd to define and estimate the structural form. Instead, we compute the likelihood using the observationally equivalent reduced form:

    [ Q_t ]   [ 1  -α₂ ]⁻¹ ( [ α₁  α₃  0   0  ] [ 1   ]   [ u₁t ] )
    [ P_t ] = [ 1  -β₂ ]    ( [ β₁  0   β₃  β₄ ] [ D_t ] + [ u₂t ] )   (7.13)
                                                 [ F_t ]
                                                 [ A_t ]

This reparametrization is implemented by means of a user function (food2ss) which: a) receives as input a parameter vector th that contains the parameters in (7.11) and the covariances in (7.12), b) generates the formulation (7.13) in thd format, and c) calls thd2sss to obtain the corresponding SS model matrices. The code to define this function is the following:
function [Phi, Gam, E, H, D, C, Q, S, R] = food2ss(th, din)
% Returns the SS representation of Kmenta's model
% th(1:7) contains the structural parameters
% th(8:10) contains the lower triangle of the noise covariance matrix
t = th(:,1);
F0 = [1 -th(2); 1 -th(5)];
G0 = [th(1) th(3) 0 0; th(4) 0 th(6) th(7)];
V = vech2m(th(8:10),2);
% THD formulation of structural model
iF0 = diag(1./diag(F0)); % Normalizes the main diagonal elements
[theta,din] = str2thd(iF0*F0,[],[],[],iF0*V*iF0',1,iF0*G0,4);
% SS conversion
[Phi, Gam, E, H, D, C, Q, S, R] = thd2sss(theta,din);
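The key step in food2ss is the premultiplication by the inverse of the non-normalized left-hand-side matrix. The same computation can be sketched in Python with numpy (parameter values are the ML estimates reported below; the D, F and A values are arbitrary illustrative numbers):

```python
import numpy as np

a1, a2, a3 = 93.2067, -0.2253, 0.3098             # demand equation
b1, b2, b3, b4 = 51.4429, 0.2425, 0.2207, 0.3695  # supply equation
F0 = np.array([[1.0, -a2], [1.0, -b2]])           # left-hand side of (7.11)
G0 = np.array([[a1, a3, 0, 0], [b1, 0, b3, b4]])
Pi = np.linalg.solve(F0, G0)            # reduced-form coefficients in (7.13)
x = np.array([1.0, 90.0, 100.0, 10.0])  # [1, D_t, F_t, A_t]
y = Pi @ x                              # implied [Q_t, P_t], zero disturbances
print(np.allclose(F0 @ y, G0 @ x))      # True: both structural equations hold
```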

Since the model includes many parameters and the sample is very short, the degrees of freedom are
not enough to use e4preest. Hence, we use 2SLS estimates as initial conditions for likelihood
optimization. Under these conditions, the model can be estimated with the following code:
% Model for the supply and demand of food from Kmenta (1986)
e4init
load food.dat
Q = food(:,1); P = food(:,2); D = food(:,3);
F = food(:,4); A = food(:,5); cte = food(:,6);
z = [Q P cte D F A];
% 2SLS estimates
t = [95; -0.24; 0.31; 49; 0.24; 0.5; 0.25; 3.1; 1.7; 4.6];
% Model formulation
F0 = [1 -t(2); 1 -t(5)];
G0 = [t(1), t(3), 0, 0; t(4), 0, t(6), t(7)];
V = vech2m(t(8:10),2);
lab = str2mat('alpha1','alpha2','alpha3');
lab = str2mat(lab,'beta1','beta2','beta3','beta4');
lab = str2mat(lab,'v1','c12','v2');
[tdum,din] = str2thd([F0],[],[],[],V,1,[G0],4);
din = touser(din,'food2ss');
[p,iter,lnew,g,h] = e4min('lffast',t,'',din,z);
prtest(p,din,lab,z,iter,lnew,g,h);
[e, vT, wT, Ve, VvT, VwT] = residual(p,din,z,1);
tit = ['demand residuals'; 'supply residuals'];
midents(e,5,tit);

The output from prtest is:


******************** Results from model estimation ********************
Objective function: 67.7697
# of iterations:    69
Information criteria: AIC = 7.7770, SBC = 8.2748
Parameter      Estimate  Appr.Std.Dev.     t-test   Gradient
alpha1          93.2067         7.1386    13.0567     0.0000
alpha2          -0.2253         0.0863    -2.6100     0.0017
alpha3           0.3098         0.0385     8.0462     0.0017
beta1           51.4429        10.2259     5.0307     0.0000
beta2            0.2425         0.0908     2.6718    -0.0002
beta3            0.2207         0.0352     6.2605    -0.0006
beta4            0.3695         0.0604     6.1189     0.0002
v1               3.3465         1.1199     2.9881     0.0000
c12              4.2680         1.4227     2.9999     0.0000
v2               5.6395         1.8489     3.0502     0.0000
************************* Correlation matrix **************************
alpha1    1.00
alpha2   -0.90  1.00
alpha3    0.18 -0.58  1.00
beta1     0.76 -0.43 -0.46  1.00
beta2    -0.94  0.74  0.09 -0.92  1.00
beta3     0.19 -0.57  0.97 -0.46  0.09  1.00
beta4     0.16 -0.53  0.94 -0.46  0.10  0.93  1.00
v1       -0.28  0.32 -0.20 -0.13  0.23 -0.20 -0.17  1.00
c12      -0.28  0.28 -0.11 -0.19  0.26 -0.11 -0.08  0.99  1.00
v2       -0.27  0.22  0.02 -0.25  0.28  0.00  0.06  0.95  0.98  1.00
Condition number = 158565.5617
Reciprocal condition number = 0.0000
***********************************************************************

and the multiple autocorrelation function of the residuals is:


******** Autocorrelation and partial autoregression functions ********
                MACF | MPARF |     MACF     |     MPARF
k =  1, Chi(k) =  0.81, AIC(k) =  0.52, SBC(k) =  0.82
demand res   ..  |  ..  |  -0.10 -0.18 | -0.10 -0.19
supply res   ..  |  ..  |  -0.05  0.02 | -0.05  0.02
k =  2, Chi(k) =  4.10, AIC(k) =  0.66, SBC(k) =  1.16
demand res   ..  |  ..  |  -0.03 -0.04 | -0.05 -0.05
supply res   ..  |  ..  |   0.15 -0.09 |  0.14 -0.12
k =  3, Chi(k) =  6.17, AIC(k) =  0.60, SBC(k) =  1.30
demand res   ..  |  ..  |  -0.17  0.23 | -0.02  0.34
supply res   ..  |  ..  |  -0.13 -0.11 | -0.30 -0.21
k =  4, Chi(k) = 17.01, AIC(k) = -0.48, SBC(k) =  0.42
demand res   ..  |  ..  |   0.22 -0.22 |  0.25 -0.31
supply res   -.  |  -.  |  -0.51 -0.20 | -0.89 -0.73
k =  5, Chi(k) =  6.55, AIC(k) = -0.77, SBC(k) =  0.33
demand res   ..  |  ++  |   0.22  0.33 |  1.20  1.40
supply res   ..  |  ..  |   0.09 -0.03 |  0.01  0.01
The (i,j) element of the lag k matrix is the cross correlation (MACF)
or partial autoregression (MPACF) estimate when series j leads series i.
********************* Cross correlation functions *********************
            demand res  supply res
demand res  .....       .....
supply res  ...-.       .....
Each row is the cross correlation function when the column variable leads
the row variable.
Ljung-Box Q statistic for previous cross-correlations
            demand res  supply res
demand res        3.77        6.65
supply res        8.46        1.65
Summary in terms of +/-/.  For MACF std. deviations are computed as 1/T^0.5 = 0.22
For MPARF std. deviations are computed from VAR(k) model
***********************************************************************

The code and data required to replicate this case can be found in the directory \EXAMPLES\FOOD
of the distribution diskette, files food.dat, food.m and food2ss.m.


An ARCH model for the U.S. GNP deflator


In most econometric software ARCH modeling options only allow a regression model for the
(conditional) mean and a univariate GARCH process for the variance. In contrast, E4 allows one to
combine transfer function or VARMAX models for the mean with vector ARCH, GARCH or
IGARCH models for its conditional variance.
In this example we illustrate the use of ARCH modeling functions with a model for the quarterly
U.S. implicit price deflator of GNP, from 1948:II to 1983:IV. This series has been analyzed by
Engle and Kraft (1983) and Bollerslev (1986). These authors model the (conditional) mean of the
deflator by an AR(4) process:

    π_t = φ₀ + φ₁π_{t-1} + φ₂π_{t-2} + φ₃π_{t-3} + φ₄π_{t-4} + ε_t    (7.18)

where π_t = 100 ln(P_t / P_{t-1}) and P_t is the GNP deflator. The difference between the two papers lies in the parametric model assumed for the conditional variance: Engle and Kraft (1983) consider that ε_t follows a constrained ARCH(8) process, while Bollerslev (1986) assumes a GARCH(1,1) process.
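For reference, the one-step-ahead mean forecast implied by (7.18) is just a linear combination of the last four inflation values plus the intercept; a minimal sketch (the coefficient values are illustrative, not the estimates reported below):

```python
# pi_t = phi0 + phi1*pi(t-1) + ... + phi4*pi(t-4) + eps_t
phi0 = 0.23
phis = [0.53, 0.19, 0.20, -0.15]   # hypothetical AR(4) coefficients
history = [1.2, 1.0, 0.9, 1.1]     # pi(t-1), ..., pi(t-4)
forecast = phi0 + sum(p * x for p, x in zip(phis, history))
print(round(forecast, 3))  # 1.071
```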

Estimation under homoscedasticity


The first step in this analysis is to estimate the AR(4) model for inflation under the assumption of homoscedasticity, that is, E(ε²_t) = σ² for all t. To do this, we can use the following code:
e4init;
load gnpn.dat;
y = gnpn; c = ones(size(y));
% Defines the AR(4) structure and computes preliminary estimates ...
[theta, din, lab] = arma2thd([0 0 0 0],[],[],[],0,4,0,1);
theta=e4preest(theta,din,[y c]);
% ... and then, computes ML estimates under homoscedasticity
[theta,it,lval,g,h]=e4min('lffast',theta,'', din, [y c]);
[std, corrm, varm, Im]= imod(theta, din, [y c]);
prtest(theta,din,lab,[y c],it,lval,g,h,std,corrm);
% Finally, computes the residuals and their squares, to detect ARCH effects
[ehat,vT,wT,vz1,vvT,vwT]=residual(theta,din,[y c]);
plotsers(ehat,0,'AR(4) residuals');
uidents(ehat,15,'AR(4) residuals');
uidents(ehat.^2,15,'AR(4) squared residuals');

which yields:


******************** Results from model estimation ********************
Objective function: 111.6734
# of iterations:    40
Information criteria: AIC = 1.6458, SBC = 1.7701
Parameter      Estimate   Std. Dev.     t-test   Gradient
FR1(1,1)        -0.5282      0.0827    -6.3834     0.0000
FR2(1,1)        -0.1925      0.0931    -2.0681     0.0000
FR3(1,1)        -0.2016      0.0932    -2.1643     0.0000
FR4(1,1)         0.1506      0.0838     1.7975     0.0000
G0(1,1)          0.2315      0.0794     2.9150     0.0000
V(1,1)           0.2773      0.0328     8.4549     0.0001
************************* Correlation matrix **************************
FR1(1,1)   1.00
FR2(1,1)  -0.46  1.00
FR3(1,1)  -0.15 -0.36  1.00
FR4(1,1)  -0.13 -0.15 -0.46  1.00
G0(1,1)    0.20  0.11  0.12  0.21  1.00
V(1,1)     0.01  0.00  0.01 -0.01  0.01  1.00
Condition number = 42.9291
Reciprocal condition number = 0.0177
***********************************************************************

with the following diagnostic graphs:


[Figure: standardized plot of AR(4) residuals; A.C.F. of AR(4) residuals (LBQ = 18.84) and P.A.C.F.; A.C.F. of AR(4) squared residuals (LBQ = 64.72) and P.A.C.F. of AR(4) squared residuals.]

The residuals show a long-memory structure in the autocorrelation function of the squared residuals.
This is a symptom of conditional heteroscedasticity.

Estimation of an ARCH(8) process for the error

Engle and Kraft (1983) try to capture this last feature by assuming that the AR(4) model has a conditionally heteroscedastic error term, such that E(ε²_t) = σ², E_{t-1}(ε²_t) = σ²_t and:

    σ²_t = σ² + N_t,   N_t = α₁ Σ_{i=1}^{8} [(9-i)/36] N_{t-i} + v_t    (7.19)
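The lag weights (9 - i)/36 in (7.19) decline linearly and sum to one, so the single free parameter α₁ fixes the size of the whole ARCH effect; a quick check:

```python
weights = [(9 - i) / 36 for i in range(1, 9)]
print(weights[0], weights[-1])  # from 8/36 down to 1/36
print(round(sum(weights), 12))  # 1.0
```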

Note that the terms in the ARCH model are actually a declining function of a single parameter. To
obtain the SS formulation of the constrained ARCH model we need to define a user function:
function [Phi,Gam,E,H,D,C,Q,Phig,Gamg,Eg,Hg,Dg] = arch8(thetan,dinn)
% User function for constrained ARCH(8) model
theta = zeros(14,1);
theta(1:6) = thetan(1:6,1);
alpha1 = thetan(7,1);
for i=1:8
   theta(i+6) = alpha1*(9-i)/36;
end
din = tomod(dinn);
[Phi,Gam,E,H,D,C,Q,Phig,Gamg,Eg,Hg,Dg] = garch2ss(theta,din);

and then estimates can be obtained with the following code:


% Model: AR(4)+constrained ARCH(8). Implicit price deflator for GNP data.
e4init;
load gnpn.dat;
y = gnpn;
c = ones(size(y));
% Model for the mean
[thetay, diny, lab1] = arma2thd([0 0 0 0],[],[],[],[0],4,[0],1);
% Model for the conditional variance
[thetae, dine, lab2] = arma2thd([0 0 0 0 0 0 0 0],[],[],[],1,4);
% Defines the composite model
[theta, din, lab3] = garc2thd(thetay, diny, thetae, dine, lab1, lab2);
% Computes preliminary estimates
theta=e4preest(theta,din,[y c]);
prtmod(theta, din, lab3);
thetan = theta(1:7,1);
labn   = lab3(1:7,:);
dinn = touser(din,'arch8');
% ... and ML estimates. Note the user function in the call to e4min
[thopt,it,lval,g,h]=e4min('lfgarch',thetan,'', dinn, [y c]);
prtest(thopt,dinn,labn,[y c],it,lval,g,h,[],[]);

The estimation results are:


******************** Results from model estimation ********************
Objective function: 89.8282
# of iterations:    74
Information criteria: AIC = 1.3542, SBC = 1.4993
Parameter      Estimate  Appr.Std.Dev.     t-test   Gradient
FR1(1,1)        -0.4021         0.0829    -4.8506     0.0001
FR2(1,1)        -0.1959         0.0826    -2.3701    -0.0001
FR3(1,1)        -0.3698         0.0887    -4.1705     0.0001
FR4(1,1)         0.1241         0.0896     1.3849     0.0001
G0(1,1)          0.1365         0.0551     2.4772    -0.0001
V(1,1)           1.1816         1.1001     1.0741     0.0000
FR1(1,1)        -0.9635         0.0406   -23.7462     0.0003
************************* Correlation matrix **************************
FR1(1,1)   1.00
FR2(1,1)  -0.34  1.00
FR3(1,1)  -0.08 -0.42  1.00
FR4(1,1)  -0.41 -0.13 -0.42  1.00
G0(1,1)    0.12  0.11  0.13  0.14  1.00
V(1,1)     0.00  0.02  0.02 -0.04  0.07  1.00
FR1(1,1)  -0.03 -0.02  0.04 -0.01 -0.05 -0.92  1.00
Condition number = 70.9489
Reciprocal condition number = 0.0131
***********************************************************************

Estimation of a GARCH(1,1) process for the error

A more sophisticated parametrization of the conditional variance is that of Bollerslev (1986), who proposes the GARCH(1,1) model:

    σ²_t = σ² + N_t,   N_t = φ₁ N_{t-1} + v_t + θ₁ v_{t-1}             (7.20)
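In the more common (ω, α, β) notation this corresponds to the recursion σ²_t = ω + α ε²_{t-1} + β σ²_{t-1}; a minimal simulation sketch in Python (parameter values are illustrative, not the estimates reported below):

```python
import random

random.seed(1)
omega, alpha, beta = 0.1, 0.15, 0.80
sigma2 = omega / (1 - alpha - beta)  # start at the unconditional variance
eps = 0.0
for _ in range(1000):
    sigma2 = omega + alpha * eps ** 2 + beta * sigma2  # GARCH(1,1) recursion
    eps = random.gauss(0.0, sigma2 ** 0.5)
print(round(omega / (1 - alpha - beta), 2))  # 2.0, the unconditional variance
```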

The code needed to estimate the mean and variance models and to perform a validation analysis is:
% Model: AR(4)+GARCH(1,1). Implicit price deflator for GNP data.
e4init;
load gnpn.dat;
y = gnpn;
c = ones(size(y));
% Model for the mean
[thetay, diny, lab1] = arma2thd([0 0 0 0],[],[],[],[0],4,[0],1);
% Model for the conditional variance
[thetae, dine, lab2] = arma2thd([0],[],[0],[],1,4);
% Composite model
[theta, din, lab3] = garc2thd(thetay, diny, thetae, dine, lab1, lab2);
% Computes preliminary estimates
theta=e4preest(theta,din,[y c]);
prtmod(theta, din, lab3);
% ... and ML estimates
[thopt,it,lval,g,h]=e4min('lfgarch',theta,'', din, [y c]);
[std,corrm,varm,Im]=igarch(thopt,din,[y c]);
prtest(thopt,din,lab3,[y c],it,lval,g,h,std,corrm);
% Validation. Note that residual.m only returns 'ehat' and 'vz1'
[ehat,vT,wT,vz1]=residual(thopt,din,[y c]);
figure; whitebg('w');
plot(vz1);
title('Conditional variance')
plotsers(ehat,0,'original residuals');
stdres=ehat./sqrt(vz1);
plotsers(stdres,0,'standardized residuals');
uidents(stdres,15,'standardized residuals');
uidents(stdres.^2,15,'standardized squared residuals');

which yields the following results:


******************** Results from model estimation ********************
Objective function: 87.5947
# of iterations:    65
Information criteria: AIC = 1.3370, SBC = 1.5027
Parameter      Estimate   Std. Dev.     t-test   Gradient
FR1(1,1)        -0.4186      0.0927    -4.5158    -0.0002
FR2(1,1)        -0.2090      0.0951    -2.1963    -0.0013
FR3(1,1)        -0.3295      0.0934    -3.5282    -0.0004
FR4(1,1)         0.1098      0.0889     1.2351    -0.0008
G0(1,1)          0.1441      0.0617     2.3365    -0.0002
V(1,1)           1.5613      1.0813     1.4440    -0.0001
FR1(1,1)        -0.9982      0.0025  -396.3847    -0.0029
AR1(1,1)        -0.8373      0.0462   -18.1330    -0.0002
************************* Correlation matrix **************************
FR1(1,1)   1.00
FR2(1,1)  -0.43  1.00
FR3(1,1)  -0.19 -0.34  1.00
FR4(1,1)  -0.29 -0.19 -0.38  1.00
G0(1,1)    0.18  0.10  0.11  0.11  1.00
V(1,1)    -0.04  0.03  0.10 -0.03  0.10  1.00
FR1(1,1)   0.04 -0.01 -0.13  0.02 -0.11 -0.50  1.00
AR1(1,1)  -0.04  0.01 -0.08  0.07 -0.01  0.11  0.42  1.00
Condition number = 85.6653
Reciprocal condition number = 0.0087
***********************************************************************

Note that the estimates of the AR(4) model for the mean, the likelihood function and the information criteria are very similar to those of the constrained ARCH(8) specification, indicating that both parametrizations of the conditional variance describe the data almost equally well.
[Figure: standardized plot of the original residuals and standardized plot of the standardized residuals.]

[Figure: A.C.F. and P.A.C.F. of the standardized residuals (LBQ = 8.26) and of the standardized squared residuals (LBQ = 15.06).]

The validation output does not show any evidence of misspecification. Note, however, that the plot of
the standardized residuals shows some outliers which are not due to heteroscedasticity, as the
autocorrelation function of the standardized squared residuals indicates.
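The standardization used in these diagnostics simply divides each residual by its conditional standard deviation, which is what the MATLAB line stdres = ehat./sqrt(vz1) does above. A minimal sketch of the same operation in Python, with illustrative numbers rather than the GNP data:

```python
import numpy as np

# Standardize residuals by the conditional standard deviation.
# The values below are hypothetical, chosen only to show the mechanics.
ehat = np.array([0.5, -1.2, 2.0, -0.3])   # residuals
vz1 = np.array([0.25, 1.0, 4.0, 0.09])    # conditional variances
stdres = ehat / np.sqrt(vz1)
print(stdres)  # [ 1.  -1.2  1.  -1. ]
```

A large residual paired with a large conditional variance yields a moderate standardized residual, which is why outliers that survive standardization cannot be attributed to heteroscedasticity.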
Finally, the following plot shows the time-varying conditional variances implied by this model:
[Figure: conditional variance.]
The code and data required to replicate this case can be found in the directory
\EXAMPLES\GARCH of the distribution diskette, files gnpn.dat, gre1.m, gre2.m, gre3.m and
arch8.m.

Forecasting and monitoring of objectives


Many firms define growth objectives for strategic variables. Although these goals are usually
set for a medium-term horizon (often one year), it is desirable to monitor the degree to
which they are met at a higher frequency. To do this, it is necessary to compute intermediate
objectives which are consistent with both the medium-term goals and the dynamics of the target variable.
Consider the well-known series G of international airline passengers, from January 1949 to
December 1960, see Box, Jenkins and Reinsel (1994). This data is adequately represented by a
multiplicative IMA(1,1)×(1,1)_12 process:

(1 - B)(1 - B^12) y_t = (1 + θ B)(1 + Θ B^12) a_t      (7.21)

The following code defines and estimates this model:


e4init
load airline.dat
y=log(airline);
% Defines the nonstationary version of the airline model
% The parameters corresponding to the unit roots are constrained
[theta, din, lab] = arma2thd([-1],[-1],[0],[0],[0],12);
theta=[theta zeros(size(theta))];
theta(1,2)=1;
theta(2,2)=1;
% Computes preliminary estimates
theta=e4preest(theta,din,y);
% ... and then ML estimates
[thopt,it,lval,g,h]=e4min('lffast', theta, '', din, y);
[std, corrm, varm, Im] = imod(thopt,din,y);
prtest(thopt,din,lab,y,it,lval,g,h,std,corrm);

and yields the following output:


******************** Results from model estimation ********************
Objective function: -232.7503
# of iterations: 15
Information criteria: AIC = -3.1910, SBC = -3.1291

Parameter       Estimate    Std. Dev.    t-test    Gradient
FR1(1,1) *       -1.0000       0.0000    0.0000      0.0000
FS1(1,1) *       -1.0000       0.0000    0.0000      0.0000
AR1(1,1)         -0.4018       0.0766   -5.2438      0.0002
AS1(1,1)         -0.5569       0.0738   -7.5450      0.0001
V(1,1)            0.0367       0.0022   16.9706      0.0012
* denotes constrained parameter

************************* Correlation matrix **************************
AR1(1,1)   1.00
AS1(1,1)   0.00  1.00
V(1,1)     0.00  0.00  1.00

Condition number = 1.0001
Reciprocal condition number = 0.9999
***********************************************************************

which is very similar to the results reported by Box, Jenkins and Reinsel (1994).

Assume that a 25% growth objective is defined for the next year. To track this target it is
convenient to compute monthly growth objectives consistent with the end-of-year objective. This monthly
growth path can be computed by applying fismiss to an augmented time series which contains a)
past observations, b) eleven missing values, corresponding to the first months of the year, and c) a
twelfth value equal to the end-of-year objective. The following code computes the unconditional forecasts
for the series and a set of forecasts conditional on the exact accomplishment of the growth target, and
plots the results:
N=size(airline,1);
[yfor,Bfor]=foremod(thopt,din,y,12);
Bfor=sqrt(Bfor);
% Computes the conditional forecasts and plots all the results
yobj = log(airline(N,1)*1.25); % End year target
yext = [y; NaN*ones(11,1); yobj];
[yhat Bhat] = fismiss(thopt,din,yext);
figure;
hold on
plot([exp(y((N-23):N,1));yfor*NaN],'k-')
plot([y((N-23):N,1)*NaN;exp(yfor)],'k--');
plot([y((N-23):N,1)*NaN;exp(yhat(N+1:N+12))],'ko');
plot([y((N-23):N,1)*NaN;exp(yfor+1.96*Bfor)],'k-.');
plot([y((N-23):N,1)*NaN;exp(yfor-1.96*Bfor)],'k-.');
grid
whitebg('w');
hold off

The output is a plot which displays the last two years of data, the forecasts, their 95% confidence
interval (dash-dotted lines) and the projected growth path to the target (denoted by 'o'):

[Figure: observed data, forecasts, 95% confidence bands and the monthly growth path to the target.]
The code and data required to replicate this case can be found in the directory
\EXAMPLES\AIRLINE of the distribution diskette, files airline.m and airline.dat.
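As a rough cross-check of such intermediate objectives, note that a 25% annual target corresponds to a constant-rate benchmark of roughly 1.88% per month. The sketch below (Python, outside the toolbox) shows the arithmetic; the model-based path computed by fismiss will generally differ from this constant-rate path because it respects the dynamics of the series:

```python
# A 25% annual growth target split into a constant monthly rate:
# (1 + r)**12 = 1.25  =>  r = 1.25**(1/12) - 1
annual_target = 0.25
monthly_rate = (1 + annual_target) ** (1 / 12) - 1
print(round(100 * monthly_rate, 2))  # about 1.88 (% per month)
```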

Disaggregation of value added in industry


The analysis of irregularly observed time series is an important problem faced by analysts. E4 has
several functions that can be used to a) estimate models relating time series observed at unequally
spaced intervals, and b) estimate high frequency data from a low frequency sample.
This section illustrates these capabilities by disaggregating the yearly series of value added in industry
(VAI) using as an indicator the monthly series of the index of industrial production (IIP). Both series
correspond to the Spanish economy and the sample includes data from 1975 to 1995.

Estimation of the high frequency data model


The first problem that arises when analyzing irregularly observed time series is the
specification of a high frequency model from low frequency data. After a standard analysis of the sample
information, the following transfer function was found adequate to model the relationship between
VAI and IIP in the low frequency (yearly) sampling interval:

A_t = .268 A_{t-1} + 6.705 X_t + N_t ,   ∇N_t = a_t ,   σ²_a = 96.234      (7.22)
     (.079)           (.650)                                 (15.216)

(standard errors in parentheses), where A_t denotes VAI in year t, X_t = Σ_{i=1}^{12} x_{ti}, with x_{ti} the value of IIP in the i-th month of year t, and a_t is a white noise process.

We will assume that the high frequency relationship between these variables is coherent with the low
frequency transfer function. There are several models satisfying this constraint. For example, the
following models are coherent with the relationship specified for the yearly series:

A_{ti} = .268 A_{(t-1)i} + 6.705 x_{ti} + n_{ti} ,   ∇_12 n_{ti} = a_{ti} ,   σ²_a = 27.780      (7.23)

and:

A_{ti} = .268 A_{(t-1)i} + 6.705 x_{ti} + n_{ti} ,   (1 - 0.265 B^12) ∇ n_{ti} = a_{ti} ,   σ²_a = 2.928      (7.24)

where A_{ti} denotes the VAI in month i of year t.

Disaggregation from nonstationary models


Assuming that model (7.23) is adequate, it has to be formulated in a toolbox standard form. For
example, it can be written as a nonstationary transfer function:

A_{ti} = 6.705 / (1 - .268 B^12) x_{ti} + N_{ti} ,   (1 - 1.268 B^12 + .268 B^24) N_{ti} = a_{ti}      (7.25)
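The noise polynomial in (7.25) is just the product of the transfer function denominator and the seasonal difference, both written in powers of B^12. This can be checked numerically with a short sketch (Python, outside the toolbox):

```python
import numpy as np

# Multiply (1 - .268 z) by (1 - z), where z stands for B^12.
# The result should match the noise polynomial of (7.25).
denom = np.array([1.0, -0.268])   # transfer function denominator
sdiff = np.array([1.0, -1.0])     # seasonal difference
product = np.convolve(denom, sdiff)
print(product)  # [ 1.    -1.268  0.268]
```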

The data loading and the translation of model (7.25) to the equivalent THD format can be done by
means of the following commands:
% Disaggregation of value added in industry
% First, using nonstationary models
e4init;
load ipi.dat; load vai.dat;
x = ipi; y = vai;
[theta,din, lab] = tf2thd([],[-1.268 .268],[],[], [27.780], ...
12, [6.705], [0 0 0 0 0 0 0 0 0 0 0 -.268]);

Note that, since the model includes a seasonal structure, the denominator of the transfer function is
defined as a 12th-order polynomial whose first eleven coefficients are zero.
Finally, we build the aggregate series. Assuming that vector y contains the VAI series, then:
yagr = NaN*zeros(252,1);
yagr(12:12:252) = y;

where 252 is the sample size. Thus, yagr contains information only in the last month of each year,
which corresponds to the sum of the monthly values of VAI. Finally, we call the aggrmod function and plot
the results:
[yhat, vyhat] = aggrmod(theta, din, [yagr x], 12);
figure; whitebg('w');
title('Variance of the monthly VAI')
plot(vyhat);
plotsers(yhat,0,'monthly VAI');

where the IIP monthly data, which serves as an indicator, is contained in x and the disaggregated
VAI data is stored in yhat. The output shows that the variance of the interpolated data grows as we
move away from the forecast origin. This happens because the variable is nonstationary.

[Figure: standardized plot of the monthly VAI and variance of the monthly VAI.]

To perform the disaggregation with the alternative model (7.24) we can use the commands:
[theta2,din2] = tf2thd([-1],[-.003 -.07],[],[], [2.928],...
12, [6.705], [0 0 0 0 0 0 0 0 0 0 0 -.268]);
[yhat2, vyhat2] = aggrmod(theta2, din2, [yagr x],12);
figure; whitebg('w');
plot(vyhat2);
title('Variance of the monthly VAI')
plotsers(yhat2,0,'monthly VAI');

which generate the following plots:


[Figure: standardized plot of the monthly VAI and variance of the monthly VAI for the alternative model.]

Note that the profile of the interpolated VAI is very similar in both cases. The main differences are
in the variances which, in this last case, are bounded and substantially lower than those
corresponding to the first model. This happens because the second model implies that both variables
are seasonally cointegrated.

Disaggregation from a stationary model


An alternative way to disaggregate VAI with bounded uncertainty consists of writing the stationary
version of (7.23):

∇_12 A_{ti} = .268 ∇_12 A_{(t-1)i} + 6.705 ∇_12 x_{ti} + a_{ti}      (7.26)

so the commands to perform the disaggregation and display the results in this new case are:
dyagr=transdif(yagr,1,0,1,12);
dx=transdif(x,1,0,1,12);
[dtheta,ddin]=arma2thd([],[-.268],[],[], [27.780], 12,[6.705], 1);
[dyhat,dvyhat]=aggrmod(dtheta,ddin,[dyagr dx],12);
plotsers(dyhat,0,'annual increments of monthly VAI');

[Figure: standardized plot of the annual increments of monthly VAI.]
In this case the variance of the interpolated series is constant, due to the stationarity of the model.
The code and data required to replicate this case can be found in the directory \EXAMPLES\AGGR
of the distribution diskette, files ipi.dat, vai.dat and aggr1.m.

Models with observation errors: Wölfer's sunspots data


This section illustrates the specification and estimation of models with observation errors using
Wölfer's classic time series of the total number of sunspots from 1749 to 1924.

Univariate modeling
A previous analysis of the data reveals that there are outliers in the observations corresponding to
1777, 1786, 1836, 1848 and 1870. After an intervention analysis to remove these effects, the
standard specification tools suggest an AR(2) structure, but an overparametrization exercise reveals
that all the parameters of an ARMA(2,2) process:

(1 + φ1 B + φ2 B²)(SC_t - μ_SC) = (1 + θ1 B + θ2 B²) a_t      (7.27)

are significant. Then, this will be our tentative univariate model. To estimate it one may use the
following code:
e4init
load wolfercc.dat;
wolf10 = wolfercc/10;
wolf10=wolf10-mean(wolf10);
% Defines an ARMA(2,2) model and computes preliminary estimates
[theta1, din1, lab1] = arma2thd([0 0],[],[0 0],[],[0],1);
theta1=e4preest(theta1,din1,wolf10);
% ML estimation
[thopt1,it,lval,g,h]=e4min('lffast', theta1, '', din1, wolf10);
[std, corrm, varm, Im ] = imod(thopt1, din1, wolf10);
prtest(thopt1,din1,lab1,wolf10,it,lval,g,h,std,corrm);
% Computation of residuals and diagnosis
[ehat,vT,wT,vz1,vvT,vwT]=residual(thopt1,din1,wolf10);
descser(ehat,'sunspots data: residuals of ARMA(2,2)');
plotsers(ehat,0,'sunspots data: residuals of ARMA(2,2)');
uidents(ehat,25,'sunspots data: residuals of ARMA(2,2)');

Note that the series is scaled to homogenize the metrics of all the parameters. This is an advisable
practice to reduce round-off errors. The estimation and diagnosis results are:
******************** Results from model estimation ********************
Objective function: 280.9808
# of iterations: 17
Information criteria: AIC = 3.2498, SBC = 3.3399

Parameter     Estimate    Std. Dev.     t-test    Gradient
FR1(1,1)       -1.4511       0.0756   -19.2011      0.0003
FR2(1,1)        0.7656       0.0646    11.8439      0.0000
AR1(1,1)       -0.1764       0.1024    -1.7227      0.0004
AR2(1,1)        0.2105       0.0915     2.3005      0.0000
V(1,1)          1.1850       0.0632    18.7580      0.0003

************************* Correlation matrix **************************
FR1(1,1)   1.00
FR2(1,1)  -0.89  1.00
AR1(1,1)   0.69 -0.65  1.00
AR2(1,1)   0.50 -0.30  0.23  1.00
V(1,1)     0.01 -0.01  0.00  0.00  1.00

Condition number = 32.9985
Reciprocal condition number = 0.0284
***********************************************************************
***************** Descriptive statistics *****************

--- Statistics of sunspots data: residuals of ARMA(2,2) ---
Valid observations =      176
Mean               =  -0.0018, t test = -0.0202
Standard deviation =   1.1832
Skewness           =   0.4676
Excess Kurtosis    =   0.1654
Quartiles          =  -0.8183, -0.1317, 0.7015
Minimum value      =  -3.0700, obs. # 170
Maximum value      =   3.6886, obs. # 169
Jarque-Bera        =   6.6134
Dickey-Fuller      =  -2.0661, computed with 13 lags
Dickey-Fuller      = -13.3641, computed with 1 lags

Outliers list
Obs #      Value
    3    -2.3786
   13     2.5093
   14    -2.5745
   19     2.4281
   38     2.5026
   87     3.3564
   99     2.3964
  103     2.6194
  120     2.7278
  169     3.6886
  170    -3.0700
************************************************************
[Figure: standardized plot of the ARMA(2,2) residuals, and their A.C.F. (LBQ = 34.43) and P.A.C.F.]

The analysis of residuals indicates that they could be non-normal, perhaps due to some remaining
outliers. On the other hand, there are no symptoms of any remaining autocorrelation structure, so we
consider the model statistically adequate.

Model with observation errors


Consider an AR(2) model with white noise observation errors:

(1 + φ1 B + φ2 B²)(SC*_t - μ_SC) = a_t      (7.28)

SC_t = SC*_t + v_t      (7.29)

where v_t is the observation error, SC_t is the observed number of sunspots at time t and SC*_t is the
true number of sunspots. As is well known, this model is observationally equivalent to an
ARMA(2,2) with complex constraints on its parameters. To estimate the model (7.28)-(7.29) and
perform a standard diagnosis, we can use the commands:
% Defines an AR(2)+white noise and computes preliminary estimates
[th1, d1, l1] = arma2thd([0 0],[],[],[],[0],1);
[th2, d2, l2] = arma2thd([],[],[],[],[0],1);
[theta,din,lab]= stackthd(th1,d1,th2,d2,l1,l2);
[theta2,din2] = comp2thd(theta,din,lab);
lab2 = str2mat(l1,'Vu');
theta2 = e4preest(theta2,din2,wolf10);
% Estimation
[thopt2,it,lval,g,h]=e4min('lffast', theta2, '', din2, wolf10);
[std, corrm, varm, Im ] = imod(thopt2, din2, wolf10);
prtest(thopt2,din2,lab2,wolf10,it,lval,g,h,std,corrm);
period = 2*pi/acos(-thopt2(1,1)/(2*sqrt(thopt2(2,1))));
disp(sprintf('period = %4.2f years', period));
% Validation
[ehat,vT,wT,vz1,vvT,vwT]=residual(thopt2,din2,wolf10);
descser(ehat,'sunspots data: residuals of AR(2)+error');
plotsers(ehat,0,'sunspots data: residuals of AR(2)+error');
uidents(ehat,25,'sunspots data: residuals of AR(2)+error');

which yield the output:


******************** Results from model estimation ********************
Objective function: 282.1741
# of iterations: 9
Information criteria: AIC = 3.2520, SBC = 3.3240

Parameter     Estimate    Std. Dev.     t-test    Gradient
FR1(1,1)       -1.5220       0.0527   -28.9039      0.0000
FR2(1,1)        0.8081       0.0512    15.7817      0.0000
V(1,1)          0.9686       0.0893    10.8449      0.0001
Vu              0.3885       0.0681     5.7067      0.0000

************************* Correlation matrix **************************
FR1(1,1)   1.00
FR2(1,1)  -0.88  1.00
V(1,1)     0.46 -0.43  1.00
Vu        -0.40  0.37 -0.59  1.00

Condition number = 21.5291
Reciprocal condition number = 0.0520
***********************************************************************
period = 11.19 years

Note that these results do not reject the hypothesis of observation errors, as the estimate of their
standard deviation is significant. Also, the values of the likelihood function and information criteria
are very similar for both representations, indicating that they have roughly the same explanatory power.
The following residual analysis does not suggest any alternative specification.
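The reported cycle period follows from the complex roots of the AR(2) polynomial, as computed by the MATLAB line period = 2*pi/acos(...) above. The same arithmetic can be checked with a short sketch (Python, parameter values taken from the table above):

```python
import math

# For an AR(2) polynomial (1 + phi1*B + phi2*B^2) with complex roots,
# the implied cycle period is 2*pi / arccos(-phi1 / (2*sqrt(phi2))).
phi1, phi2 = -1.5220, 0.8081  # estimates from the table above
period = 2 * math.pi / math.acos(-phi1 / (2 * math.sqrt(phi2)))
print(round(period, 2))  # about 11.2 years
```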
***************** Descriptive statistics *****************

--- Statistics of sunspots data: residuals of AR(2)+error ---
Valid observations =      176
Mean               =  -0.0003, t test = -0.0036
Standard deviation =   1.1915
Skewness           =   0.4999
Excess Kurtosis    =   0.2197
Quartiles          =  -0.8169, -0.1622, 0.6533
Minimum value      =  -3.2339, obs. # 170
Maximum value      =   3.7674, obs. # 169
Jarque-Bera        =   7.6841
Dickey-Fuller      =  -2.0962, computed with 13 lags
Dickey-Fuller      = -13.1245, computed with 1 lags

Outliers list
Obs #      Value
    3    -2.4194
   14    -2.5054
   19     2.4211
   38     2.6037
   87     3.4437
   99     2.4903
  103     2.6257
  120     2.6230
  169     3.7674
  170    -3.2339
************************************************************

[Figure: standardized plot of the AR(2)+error residuals, and their A.C.F. (LBQ = 38.07) and P.A.C.F.]
25

Finally, we can clean the data with the smoothed estimates of the observation errors with the code:
% Compute the 'clean' series
sunsp=wolf10-vT(:,2);
figure;
hold on
plot(sunsp,'k-')
plot(wolf10,'ko');
grid
whitebg('w');
hold off
title('Plot of smoothed versus original sunspots (scaled deviations)');

which also generates the following plot, in which the original data is represented by hollow circles
and the smoothed series by a continuous line:
[Figure: plot of smoothed versus original sunspots (scaled deviations).]

The code and data required to replicate this case can be found in the directory
\EXAMPLES\WOLFER of the distribution diskette, files wolfercc.dat and wolfer.m.

Structural time series models


A recent trend in econometrics proposes the use of structural time series models. In this example we
use this methodology to build a model for annual Belgian GDP data, from 1950 to 1986, see
García-Ferrer et al. (1996). These authors propose the following specification:

y_t = T_t + ε_t
∇T_t = S_{t-1}
∇S_t = η_t

where y_t is the log of GDP, T_t is a trend variable, S_t is the change of the trend, and ε_t, η_t are
independent white noise processes. Thus, the only unknown parameters of the model are the noise
variances.
It is frequent in the literature to set the ratio between these variances (known as the noise-variance
ratio, or NVR) to some heuristic small value. Given this value, it is easy to estimate the
trend, whose behaviour is described by an ARIMA model. With E4 it is possible to obtain exact
maximum likelihood estimates of both variances. This requires defining an SS version of the
previous model:
    [ T_t ]   [ 1  1 ] [ T_{t-1} ]   [ 0 ]
    [     ] = [      ] [         ] + [   ] η_t              (7.35)
    [ S_t ]   [ 0  1 ] [ S_{t-1} ]   [ 1 ]

    y_t = [ 1  0 ] [ T_t ; S_t ]' + ε_t                     (7.36)
The data loading and model formulation requires the following code:
% Non observable components model
e4init
load belgi.dat;
y = log(belgi)*1000;
[theta,din,lab]=ss2thd([1 1; 0 1],[],[0;1],[1 0],[],[1],[0],[0],[0]);

Next, one must constrain the values of the known parameters and compute preliminary estimates for
the rest.

% All parameters except the variances are constrained


theta = [theta ones(12,1)];
theta(10,2)=0;
theta(12,2)=0;
% Compute preliminary estimates
theta=e4preest(theta,din,y);

And the commands for model estimation are:


%... and optimize the likelihood function.
[thopt,it,lval,g,h]=e4min('lffast', theta,'', din, y);
[std, corrm, varm, Im ] = imod(thopt, din, y);
prtest(thopt,din,lab,y,it,lval,g,h,std,corrm);
NVR=thopt(10,1)/thopt(12,1);
disp(sprintf('NVR = %4.2f ', NVR));

which yields:
******************** Results from model estimation ********************
Objective function: 155.3452
# of iterations: 12
Information criteria: AIC = 8.7414, SBC = 8.8294

Parameter       Estimate    Std. Dev.   t-test    Gradient
PHI(1,1) *        1.0000       0.0000   0.0000      0.0000
PHI(2,1) *        0.0000       0.0000   0.0000      0.0000
PHI(1,2) *        1.0000       0.0000   0.0000      0.0000
PHI(2,2) *        1.0000       0.0000   0.0000      0.0000
E(1,1) *          0.0000       0.0000   0.0000      0.0000
E(2,1) *          1.0000       0.0000   0.0000      0.0000
H(1,1) *          1.0000       0.0000   0.0000      0.0000
H(1,2) *          0.0000       0.0000   0.0000      0.0000
C(1,1) *          1.0000       0.0000   0.0000      0.0000
Q(1,1)          131.0239      59.9687   2.1849      0.0000
S(1,1) *          0.0000       0.0000   0.0000      0.0000
R(1,1)           99.6790      35.3129   2.8227      0.0000
* denotes constrained parameter

************************* Correlation matrix **************************
Q(1,1)   1.00
R(1,1)  -0.30  1.00

Condition number = 1.8462
Reciprocal condition number = 0.5416
***********************************************************************
NVR = 1.31

Note that the ML estimate of the NVR is much higher than the values usually assumed by
practitioners of this methodology, implying a highly adaptive trend. This is not surprising, as the ML
criterion selects the values of the parameters (in this case, both variances) that allow the closest
replication of the data movements.

Computation and modeling of unobservable components


The second stage of the analysis requires the extraction of the trend. We can do this with fismod by
means of the following commands:

% Smoothing to estimate the trend


[xhat,phat,e]=fismod(thopt,din,y);
trend=xhat(:,1);
deltat=transdif(trend,1,1);
figure;
hold on
plot(trend/1000,'k-')
plot(y/1000,'ko');
grid
whitebg('w');
hold off
title('plot of log(PIB) versus trend');
plotsers(deltat,0,'changes of the trend');

which generate the plots:


[Figure: plot of log(PIB) versus trend, and standardized plot of the changes of the trend.]
The last stage of this analysis involves building a univariate model for the trend. After a preliminary
analysis (not shown) an ARIMA(2,2,0) model is fitted and diagnosed with the code:
% Model for the trend:
y1=transdif(xhat(:,1),1,2);
[theta3, din3, lab3] = arma2thd([0 0],[],[],[],[0],1);
% Computes preliminary and ML estimates
theta3=e4preest(theta3,din3,y1);
[thetan,it,lval,g,h]=e4min('lffast',theta3,'',din3, y1);
[std,corrm,varm,Im] = imod(thetan,din3,y1);
prtest(thetan,din3,lab3,y1,it,lval,g,h,std,corrm);
% Residual diagnostics
[ehat,vT,wT,vz1,vvT,vwT]=residual(thetan,din3,y1);
descser(ehat,'residuals of the model for the trend');
uidents(ehat,10,'residuals of the model for the trend');
plotsers(ehat,0,'residuals of the model for the trend');

which yields the following output:

******************** Results from model estimation ********************
Objective function: 108.2797
# of iterations: 11
Information criteria: AIC = 6.5459, SBC = 6.6805

Parameter     Estimate    Std. Dev.    t-test    Gradient
FR1(1,1)       -0.7023       0.1614   -4.3508     -0.0002
FR2(1,1)        0.3744       0.1624    2.3060      0.0001
V(1,1)         33.5723       8.1484    4.1201      0.0000

************************* Correlation matrix **************************
FR1(1,1)   1.00
FR2(1,1)  -0.51  1.00
V(1,1)     0.03 -0.03  1.00

Condition number = 3.0569
Reciprocal condition number = 0.3561
***********************************************************************
***************** Descriptive statistics *****************

--- Statistics of residuals of the model for the trend ---
Valid observations =       34
Mean               =  -0.3677, t test = -0.3673
Standard deviation =   5.8371
Skewness           =  -0.1785
Excess Kurtosis    =  -0.9368
Quartiles          =  -4.2303, -0.0981, 4.5046
Minimum value      = -11.6458, obs. # 26
Maximum value      =  10.9150, obs. # 8
Jarque-Bera        =   1.4239
Dickey-Fuller      =  -3.4332, computed with 5 lags
Dickey-Fuller      =  -6.0014, computed with 1 lags

Outliers list
Obs #      Value
(none)
************************************************************

[Figure: standardized plot of the residuals of the model for the trend, and their A.C.F. (LBQ = 17.48) and P.A.C.F.]

The code and data required to replicate this case can be found in the directory
\EXAMPLES\NONOBS of the distribution diskette, files belgi.dat and belgi.m.

8 Reference guide
Model formulation

form2THD    Where form may be ARMA, STR, TF, SS or GARC. Converts a VARMAX
            (ARMA), structural econometric model (STR), transfer function (TF),
            SS model (SS) or model with GARCH errors (GARC) to THD format.

COMP2THD    Converts a stacked model into a components model in THD format.

NEST2THD    Converts a stacked model into a nested model in THD format.

STACKTHD    Stacks two models in THD format.

THD2form    Where form may be ARMA, STR, TF or SS. Converts a model in THD
            format to the corresponding VARMAX, structural econometric,
            transfer function or SS formulation.

TOMOD       Suppresses the user model flag in a THD model specification.

TOUSER      Adds the user model flag in a THD model specification.

Model information

PRTMOD      Displays information about a model.

PRTEST      Displays the estimation results.

Model estimation

E4PREEST    Computes a fast estimate of the parameters for a model in THD form.

LFMOD       Computes the exact log-likelihood function for a model in THD form.

LFFAST      A faster version of LFMOD.

LFMISS      Same as LFMOD, but allowing for missing data.

LFGARCH     Same as LFMOD, but allowing for GARCH errors.

GMOD        Computes the analytical gradient of LFMOD.

GMISS       Computes the analytical gradient of LFMISS.

GGARCH      Computes the gradient of LFGARCH.

IMOD        Computes the exact information matrix of LFMOD and LFFAST.

IMODG       Computes the quasi-maximum likelihood information matrix of LFMOD
            and LFFAST.

IMISS       Computes the exact information matrix of LFMISS.

IGARCH      Computes an analytical approximation to the information matrix of
            LFGARCH.

Functions for computing derivatives

form_DV     Where form may be SS or GARC. Computes the derivatives of the SS
            matrices of a model with respect to the i-th parameter.

form_DVP    Where form may be SS or GARC. Computes the derivatives of the SS
            matrices of a model in the direction of any vector.

Forecasting, smoothing and simulation

AGGRMOD     Disaggregates a sample of low frequency data into smoothed
            estimates of the corresponding high frequency values.

FOREMOD     Computes forecasts for the endogenous variables of a model in THD
            form.

FOREMISS    Same as FOREMOD, but allowing for missing data.

FOREGARC    Computes forecasts for the endogenous variables and conditional
            variances of a model with GARCH errors.

FISMOD      Computes fixed interval smoothing estimates of the state and
            observable variables of a model in THD form.

FISMISS     Same as FISMOD, but allowing for missing data.

E4TREND     Decomposes a vector of time series into trend, seasonal, cycle and
            irregular components.

SIMMOD      Simulates the endogenous variables of a model in THD form.

SIMGARCH    Same as SIMMOD, but allowing for GARCH errors.

Data transformation, model specification and diagnosis

AUGDFT      Computes the augmented Dickey-Fuller test for unit roots.

DESCSER     Displays the main descriptive statistics for a set of time series.

HISTSERS    Displays a standardized histogram for a set of time series.

LAGSER      Generates lags and leads for a set of time series.

MIDENTS     Computes and displays the multiple autocorrelation and partial
            autoregression functions for a set of time series.

PLOTQQS     Plots the quantile graphs for a set of time series.

PLOTSERS    Displays a plot of centered and standardized time series versus
            time.

RESIDUAL    Computes the residuals of a model.

RMEDSER     Displays a scaled plot of sample means versus sample standard
            deviations for a set of time series.

TRANSDIF    Applies stationarity inducing transformations (Box-Cox and
            differencing) to a set of time series.

UIDENTS     Displays the univariate simple and partial autocorrelation
            functions for a set of time series.

Other functions

E4INIT      Initializes the global toolbox options.

E4MIN       Computes the unconstrained minimum of a nonlinear function.

SETE4OPT    Allows the user to modify the toolbox options.

aggrmod
Purpose

Disaggregates a sample of low frequency data into smoothed estimates of the corresponding high
frequency values.

Synopsis
[zhat, bt] = aggrmod(theta, din, z, per, m1)

Description
Computes the optimal disaggregation of low frequency (say, yearly) time series into high frequency
(say, quarterly or monthly) time series, so that the disaggregates add up to the sample data. The
unobserved high frequency values can be computed taking into account not only the low frequency
sample information, but also high frequency indicator(s). For example, a monthly industrial
production index can be used as an indicator to disaggregate a yearly series of GNP.
The disaggregates are computed using an algorithm known as fixed-interval smoothing; see the
reference entries for fismod and fismiss. Further details about this method can be found in Anderson and
Moore (1979), De Jong (1989) and Casals, Jerez and Sotoca (2000).
The input arguments of aggrmod are a model in THD format (theta-din) relating all the variables
in the high frequency sampling interval, the data matrix (z), the number of observations that add up to
an aggregate (per) and the number of endogenous variables that are observed as aggregates (m1).
The input argument z should be structured in the following way:
1) The first m columns correspond to the endogenous variables and the rest to the exogenous
variables.
2) The first m1 columns correspond to endogenous variables observed in the low frequency sampling
interval. The rest of the columns, up to m, correspond to high frequency endogenous variables.
3) The parameter m1 is optional. If it is not specified, aggrmod assumes that all the endogenous
variables are observed with low frequency, i.e. m1 = m.
4) All the columns corresponding to variables observed with low frequency should be coded with
NaN where no observation is available. For example, the column corresponding to a quarterly
variable observed once a year would have the following structure:
[NaN NaN NaN y1 NaN NaN NaN y2 . . . NaN NaN NaN yn]'
where the values yi (i=1,...,n) correspond to the yearly observations.
5) All the exogenous variables should be high frequency data.
The output arguments of aggrmod are the optimal disaggregates of the first m1 endogenous variables
(zhat) as well as their covariances (bt).
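The NaN coding described above is straightforward to generate; a sketch in Python with hypothetical yearly data (the MATLAB examples of Chapter 7 build the same pattern with NaN*zeros and indexed assignment):

```python
import numpy as np

# Code a yearly series as a monthly column (per = 12): eleven NaN
# placeholders, then the yearly observation, in each twelve-month block.
yearly = [100.0, 110.0, 125.0]           # hypothetical yearly aggregates
per = 12
z = np.full(len(yearly) * per, np.nan)   # one slot per month
z[per - 1 :: per] = yearly               # last month of each year holds the aggregate
print(np.count_nonzero(~np.isnan(z)))    # 3 observed values
```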

References
Anderson, B. D. O. and J. B. Moore (1979). Optimal Filtering. Englewood Cliffs, N.J.: Prentice
Hall.
Casals, J., M. Jerez and S. Sotoca (2000). "Exact Smoothing for Stationary and Nonstationary Time
Series", International Journal of Forecasting, 16, 59-69.
De Jong, P. (1989). "Smoothing and Interpolation with the State-Space Model", Journal of the
American Statistical Association, 84, 408, 1085-1088.

See Also
fismod, fismiss

arma2thd

Purpose
Converts a VARMAX model to THD format.

Synopsis
[theta,din,lab] = arma2thd([FR1 ... FRp],[FS1 ... FSps], ...
[AR1 ... ARq],[AS1 ... ASqs],v,s,[G0 ... Gn],r)

Description
The function arma2thd obtains the representation in THD format of the VARMAX model:
FR ( B ) FS ( B S ) y t

G ( B ) ut  AR ( B ) AS ( B S ) Jt

where B is the backshift operator, such that for any sequence x t : B k xt


the seasonal period and:

xt.k , s denotes the length of

y t is a (m1) vector of endogenous variables,


u t is a (r1) vector of exogenous variables
Jt is a (m1) vector of errors
FR ( B )
I  FR1 B  ...  FR p B p
FS ( B S )
I  FS1 B S  ...  FS P B S P

G0  G1 B  ...  G n B n
AR ( B )
I  AR1 B  ...  AR q B q
AS ( B S )
I  AS1 B S  ...  AS Q B S Q
G(B)

FR1 , ..., FR p , AR1 , ..., AR q and FS1 , ..., FS P , AS1 , ..., AS Q are (mm) matrices.
The input arguments are:
1) The parameter matrices of the regular autoregressive and moving average factors, [FR1...FRp]
and [AR1...ARq].
2) The parameter matrices of the seasonal autoregressive and moving average factors,
[FS1...FSps] and [AS1...ASqs].
3) The covariance matrix of ε_t, v. If this matrix is defined as a vector, this implies the constraint of
independence between noises. In order not to impose this constraint, it is necessary to define at
least the lower triangle of the matrix. This matrix cannot contain NaN. To impose independence
between two errors, the user can set the corresponding covariance to zero and, afterwards, impose
a fixed-parameter constraint on this value, see Chapter 5.
4) The scalar s, which indicates the length of the seasonal period (e.g., s=1 for nonseasonal data,
s=4 for quarterly data and s=12 for monthly data).
5) The parameter matrix [G0 ... Gn] and the number of exogenous variables, r, need to be
included only when the model contains exogenous variables.
If any of the matrices (except v) is null, it should be specified using an empty matrix, []. If any of the
elements in these matrices (except in v) are fixed values equal to zero, they should be specified with
NaN.
The output arguments are the vectors and matrices that define a model in THD format.

Example
Consider the VARMA model:

[ 1 - .3B + .5B^2      0          ] [ y1t ]   [ .9 ]   [ 1      0          ] [ ε1t ]
[      -.4B            1          ] [ y2t ] = [ .7 ] + [ 0      1 - .8B^12 ] [ ε2t ]

V[εt] = [  1   .3 ]
        [ .3    1 ]
The following code defines the parameter matrices, converts them to THD format and displays the
model structure:
FR1 = [-.3 NaN; -.4 NaN];
FR2 = [ .5 NaN; NaN NaN];
AS1 = [NaN NaN; NaN -.8];
V = [1 .3; .3 1];
c = [.9; .7];
[theta, din, lab] = arma2thd([FR1 FR2],[],[],AS1,V,12,c,1);
prtmod(theta,din,lab);

Note that the constant term is included by means of an exogenous variable.

See Also
ss2thd, str2thd, garc2thd, tf2thd, comp2thd, prtmod


augdft

Purpose
Computes the augmented Dickey-Fuller test for autoregressive unit roots.

Synopsis
[adft] = augdft(y, p, trend);

Description
The Dickey and Fuller (1981) statistic tests the null hypothesis of an autoregressive unit root versus
the alternative of stationarity. Further elaborations on this idea allow for autocorrelation and a
deterministic time trend, see Hamilton (1994).
The function augdft computes a version of the augmented Dickey-Fuller statistic. The input
arguments are:
1) y, a matrix with n observations of m variables.
2) p, the number of lags (plus one) in the unit root regression. The value of p should be equal to or
greater than 1.
3) trend, an optional parameter to allow (trend=1) for a deterministic time trend.
If the output argument adft is specified, the function does not display the results.
When invoked without the argument trend or with trend=0 this function computes, for each of the
m variables in the matrix y, the OLS estimates of the parameters in the unit root regression:

y_t = ζ_1 Δy_{t-1} + ... + ζ_{p-1} Δy_{t-p+1} + α + ρ y_{t-1} + e_t

and the standard t and F statistics for the null hypotheses H0: ρ = 1, H0: α = 0 and H0: ρ = 1, α = 0. If
trend=1, augdft computes the OLS estimates for the unit root regression:

y_t = ζ_1 Δy_{t-1} + ... + ζ_{p-1} Δy_{t-p+1} + α + ρ y_{t-1} + δ t + e_t

as well as the same statistics as in the previous case and an additional t statistic for the null hypothesis
H0: δ = 0. In both cases the number of lags (p) should be large enough to avoid autocorrelation in the
residuals of the regression.
The following Tables summarize the 95% and 90% percentiles of the above mentioned statistics, see
Hamilton (1994, Chapter 17 and Appendix B).
Table 1: 95% percentiles of the t and F statistics.

                      trend=0                         trend=1
             True model:                 True model:               True model: (α any value,
             (α = 0 and ρ = 1)           (α ≠ 0 and ρ = 1)         δ = 0 and ρ = 1)
Size of y    t-statistic  F-statistic                              t-statistic  F-statistic
   25          -3.00         5.18                                    -3.60         7.24
   50          -2.93         4.86        Both statistics should      -3.50         6.73
  100          -2.89         4.71        be compared with            -3.45         6.49
  250          -2.88         4.63        standard t and F            -3.43         6.34
  500          -2.87         4.61        critical values             -3.42         6.30
   ∞           -2.86         4.59                                    -3.41         6.25

Table 2: 90% percentiles of the t and F statistics.

                      trend=0                         trend=1
             True model:                 True model:               True model: (α any value,
             (α = 0 and ρ = 1)           (α ≠ 0 and ρ = 1)         δ = 0 and ρ = 1)
Size of y    t-statistic  F-statistic                              t-statistic  F-statistic
   25          -2.63         4.12                                    -3.24         5.91
   50          -2.60         3.94        Both statistics should      -3.18         5.61
  100          -2.58         3.86        be compared with            -3.15         5.47
  250          -2.57         3.81        standard t and F            -3.13         5.39
  500          -2.57         3.79        critical values             -3.13         5.36
   ∞           -2.57         3.78                                    -3.12         5.34

Example
The following code simulates 200 samples of a random walk process and calls to augdft:
[theta,din,lab] = arma2thd([-1],[],[],[],[.1],1);
y = simmod(theta,din,200);
augdft(y,1);

The corresponding output will be similar to:

Augmented Dickey-Fuller results, p = 1

rho   =  0.9862, t-test (rho=1)  = -1.3388
alpha = -0.0794, t-test (alpha=0)= -1.8595
F test (rho=1,alpha=0) =  3.6708, d.f. = 2, 197

Note that none of the null hypotheses is rejected at the 95% confidence level. The call:
augdft(y,1,1);

will yield a result similar to:

Augmented Dickey-Fuller results, p = 1

rho   =  0.9533, t-test (rho=1)  = -2.1353
alpha = -0.0540, t-test (alpha=0)= -1.2006
delta = -0.0014, t-test (delta=0)= -1.7031
F test (rho=1,delta=0) =  5.7738, d.f. = 2, 196

which (correctly) finds no evidence of a deterministic time trend.

References
Dickey, D.A. and W.A. Fuller (1981). Likelihood Ratio Statistics for Autoregressive Time Series
with a Unit Root. Econometrica, 49, 4, 1057-1072.
Hamilton, J.D. (1994). Time Series Analysis. Princeton, N.J: Princeton University Press.


comp2thd

Purpose
Converts a stacked model into a components model in THD format.

Synopsis
[theta, din, label] = comp2thd(t, d, l);

Description
The input arguments of comp2thd are a stacked model in THD format (t, d, l). The function
returns the composite model defined in THD format.

Example
The model:

y*_t = .8 + .3 y*_{t-1} - .4 y*_{t-2} + a_t
y_t  = y*_t + v_t

with V[a_t] = .1 and V[v_t] = .2,
can be expressed in THD format with the following code:


[tha, da, laba] = arma2thd([-.3 .4], [], [], [], [.1], 1, [.8], 1);
[thc, dc, labc] = arma2thd([], [], [], [], [.2], 1);
[ts, ds, ls]= stackthd(tha, da, thc, dc, laba, labc);
[theta, din, lab] = comp2thd(ts, ds, ls);
lab=str2mat(laba,'V*(1,1)');
prtmod(theta,din,lab);

See Also
arma2thd, ss2thd, stackthd, str2thd, tf2thd, garc2thd, prtmod


descser

Purpose
Displays the main descriptive statistics for a set of time series.

Synopsis
[stats, aval1, avect1] = descser(y, lab)

Description
Computes and displays a set of descriptive statistics for each of the series in the matrix y. If y
contains more than one series, it also displays the correlation coefficients and the corresponding
principal components information.
The input arguments are: y, a matrix with n observations of m variables and lab, a matrix with m
rows containing descriptive names for each series. The parameter lab is optional.
The output argument stats is a matrix which contains the statistics computed for each series, in this
order: number of valid observations, mean, standard deviation, skewness, excess kurtosis, the 25%,
50% and 75% percentiles, maximum value, position of the maximum value in the sample, minimum
value, position of the minimum value in the sample, Jarque-Bera statistic, the augmented Dickey-Fuller
statistic computed with a number of lags equal to the square root of the number of observations
and with one lag, and an outliers list.
This function accepts missing observations marked with NaN, eliminating these observations before
computing the statistics. In this case some statistics (e.g. the correlation matrix or the augmented
Dickey-Fuller statistics) will not be computed.

Example
The following code generates two normal variables and computes their descriptive statistics:
y = randn(100, 2);
descser(y, str2mat('First series','Second series'));

See Also
plotsers, rmedser, plotqqs


e4init

Purpose
Initializes the global toolbox options.

Synopsis
e4init

Description
This command creates and initializes the internal variable E4OPTION. This variable is a 15×1 vector
which stores the values of the global variables that control the behaviour of E4. It also displays a listing
of the default options and initializes the matrices of error and warning messages.
Bear in mind that the toolbox does not work properly if this function is not run.
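A typical session therefore begins by running this command before any other toolbox function; a minimal sketch (the 'verbose' option name is documented under e4min):

```matlab
e4init;                     % creates E4OPTION and lists the default options
sete4opt('verbose','no');   % optionally override any default afterwards
```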

See Also
sete4opt


e4min

Purpose
Computes the unconstrained minimum of a nonlinear function.

Synopsis
[pnew,iter,fnew,gnew,hessin] = e4min(func,p,dfunc,P1,P2,P3,P4,P5)

Description
The function e4min implements a numerical optimization procedure based on the techniques
described by Dennis and Schnabel (1983). It includes two main optimization algorithms, BFGS
(Broyden-Fletcher-Goldfarb-Shanno) and Newton-Raphson.
The operation of e4min is the following. Starting from an initial estimate of the optimal value, p, the
algorithm iterates on the objective function func, using the BFGS (default) or Newton-Raphson
search direction and computing the optimum step length. The algorithm stops when either of two
criteria is satisfied: a) the relative changes in the values of the objective function are small and/or b) the
gradient vector is small. The default tolerances for convergence are set by e4init and can be
modified using sete4opt.
The input parameters are:
1) func, is a string containing the name of the objective function (e.g. 'lfmod' or 'lfgarch').
The input arguments of this function should be the vector p and the optional parameters P1-P5.
2) p, is a vector containing the initial value of the variables in the optimization problem. When
e4min is used to estimate a model in THD format, p should be equal to theta.
3) dfunc, is a string containing the name of the function that computes the gradient of func (e.g.
'gmod' or 'ggarch'). The input arguments of this function should be the vector p and the
optional parameters P1-P5. If an analytical gradient is not required, dfunc should be an empty
string: ''. In this case, e4min uses a numerical approximation to the gradient.
4) P1-P5, are optional parameters used to feed additional information to func and dfunc. When
e4min is used for estimation of a model in THD format, P1 should be the name of the variable
that contains the din specification and P2 should be the name of the data matrix.


After the end of the iterative process, e4min returns the following values: pnew, which is the value of
the unconstrained parameters; iter, the number of iterations; fnew, the value of the objective
function in pnew; gnew which is the analytical or numerical gradient of the objective function in
pnew, depending on the contents of dfunc; and finally hessin, which is the hessian of the objective
function in pnew.
The user should also take into account that:
1) It is possible to impose fixed-value constraints on any parameter by augmenting p with a second
column. The values in this column should be either zero, to indicate that the parameter in the first
column is free, or any nonzero value, when the parameter is constrained to its present value.
2) When estimating the parameters of an econometric model, the user can optimize the objective
function with respect to the error covariances (the default) or its Cholesky factors by selecting this
alternative with the sete4opt function. In the univariate case, the Cholesky factor of the variance
is the standard deviation.
3) The behaviour of e4min can be altered by using sete4opt. The specific e4min-related options
are:

Option         Description                               Possible values
'algorithm'    Optimization algorithm                    'bfgs' (*), 'newton'
'step'         Maximum step length during optimization   0.1 (**)
'tolerance'    Stop criteria tolerance                   1.0e-5 (**)
'maxiter'      Maximum number of iterations              75 (**)
'verbose'      Display output at each iteration          'yes', 'no'

(*) Default option.
(**) This is the default value. Other reasonable values are admissible.
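The fixed-value constraints described in point 1 above can be imposed as in the following sketch, where the constrained index (3) is merely illustrative:

```matlab
theta1 = [theta zeros(size(theta,1),1)];  % append a column of constraint flags
theta1(3,2) = 1;                          % nonzero flag: third parameter stays fixed
[thopt, iter, fnew] = e4min('lffast', theta1, '', din, z);
```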

Example
Consider the model:

z_t = ( 1 - .7B ) ( 1 - .5B^12 ) a_t ;  V[a_t] = .1

The following code obtains the corresponding THD format and simulates the data:
[theta, din] = arma2thd([], [], [-.7], [-.5], .1, 12);
z=simmod(theta,din,150);
z=z(51:150,1);


In real applications, initial values of the parameters may be far from the optimum. Hence, it may be
convenient to obtain good starting values with e4preest and then use e4min to minimize lfmod
with the analytical gradient:
tnew = e4preest(theta, din, z);
[thopt, iter, fnew, gnew] = e4min('lfmod', tnew, 'gmod', din, z);

The input arguments fed to e4min are the most conservative ones. In most cases, the following
syntax will provide the same results, with a much faster optimization process:
[thopt, iter, fnew, gnew] = e4min('lffast', tnew, '', din, z);

See Also
lffast, lfgarch, lfmod, ggarch, gmod, sete4opt

References
Dennis, J. E. and R. B. Schnabel (1983). Numerical Methods for Unconstrained Optimization and
Nonlinear Equations. Englewood Cliffs, N. J.: Prentice-Hall.


e4preest

Purpose
Computes a fast estimate of the parameters for a model in THD format.

Synopsis
theta2 = e4preest(theta, din, z)

Description
This function provides fast and consistent estimates of the parameters in theta. These estimates are
adequate starting values for likelihood optimization with e4min.
The operation of e4preest is the following. It first obtains a subspace representation of the system,
where the future of the output is expressed as a linear function of its past and the information of the
input. The estimates are then computed as the solution of a nonlinear least squares problem. See
Casals (1997), Van Overschee and De Moor (1996) and Viberg (1995).
The input arguments are: theta and din, which define the model structure in THD format and z, a
matrix containing the values of the endogenous and exogenous variables. The values in the first
column of theta are irrelevant to the operation of e4preest, except if they are parameters
constrained to fixed-values, see Chapter 5. The estimates are returned in theta2.
The user should also take into account that:
1) If the sample is too short in comparison with the dimension of the system, the function will display
the message: The sample is too short to use e4preest. This means that the procedure does not
have enough degrees of freedom to estimate the model in subspace form. The degrees of freedom
for any model can be computed using the formula:
df = n - 2 ( d + 1 ) ( m + r )
where n is the number of observations of the sample, d is the model dynamics (as it appears in the
standard output of prtmod), m is the number of endogenous variables and r is the number of
exogenous variables.
2) The behaviour of e4preest can be altered by sete4opt. Options that are adequate for likelihood
optimization with e4min will, in general, also be adequate for e4preest.
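The degrees-of-freedom condition in point 1 can be checked by hand; a sketch with purely illustrative values (d must be read from the prtmod output for the actual model):

```matlab
n = 150;  d = 13;  m = 1;  r = 0;    % illustrative values only
df = n - 2*(d+1)*(m+r);              % degrees of freedom for the subspace fit
if df <= 0
    disp('The sample is too short to use e4preest');
end
```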


Example
Consider the model:

z_t = ( 1 - .7B ) ( 1 - .5B^12 ) a_t ;  V[a_t] = .1

The following code obtains the corresponding THD format, simulates the data, computes preliminary
estimates with e4preest and, finally, computes maximum likelihood estimates:
[theta, din, lab] = arma2thd([], [], [-.7], [-.5], .1, 12);
z=simmod(theta,din,200); z=z(51:200,1);
theta=e4preest(theta, din, z)
[thopt] = e4min('lffast', theta,'', din, z)

See Also
sete4opt, e4min

References
Casals, J. (1997). Métodos de Subespacios en Econometría. PhD Thesis. Madrid: Universidad
Complutense.
Van Overschee, P. and B. De Moor (1996). Subspace Identification for Linear Systems: Theory,
Implementation, Applications. Dordrecht: Kluwer Academic Publishers.
Viberg, M. (1995). Subspace-based methods for the identification of linear time-invariant systems,
Automatica, 31, 12, 1835-1851.


e4trend

Purpose
Decomposes a vector of time series into the trend, seasonal, cycle and irregular components implied
by an econometric model.

Synopsis
[trend,season,cycle,irreg,thetat,dint,ixmodes,xhat] = ...
e4trend(theta,din,y,toinnov)

Description
The function e4trend decomposes a vector of time series, represented by an econometric model, into
several structural components corresponding to: a) unit roots (trend component), b) seasonal roots
(seasonal component), c) stationary (nonseasonal) roots (cyclic component) and d) residuals
(irregular component). These components are additive, so the command:
trend+season+cycle+irreg-y

should return a null value.
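This additivity can be checked numerically after any call to e4trend; a sketch for the univariate case, assuming each component matrix has a single column:

```matlab
[trend,season,cycle,irreg] = e4trend(theta,din,y);
check = trend + season + cycle + irreg - y;   % decomposition residual
max(abs(check))                               % should be numerically negligible
```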


The input arguments are a model in THD format (theta-din), a data matrix (y) and, optionally, a
logical flag (toinnov). If toinnov=1, the SS model for the data is obtained by imposing a steady-state
innovations structure on the model, which makes it possible to obtain exact estimates of the components;
if toinnov=0 (default), the original structure of the SS model is preserved. The number of rows of y
should be equal to the number of observations, its first columns correspond to the endogenous
variables and the rest to the exogenous variables. The output arguments are:
1) trend, smoothed estimates of the trend components; it has one row per observation and one
column per independent unit root.
2) season, smoothed estimates of the seasonal components; it has one row per observation and one
column per independent seasonal component.
3) cycle, smoothed estimates of the cyclic components; it has one row per observation and one
column per independent stationary component.
4) irreg, smoothed estimates of the irregular components; it has one row per observation and one
column per endogenous variable.
5) thetat, dint, the theta-din specification corresponding to the block-diagonal SS model.
6) ixmodes, a vector of indexes identifying the different states. The value 1 corresponds to trend
states, 2 corresponds to seasonal states and 3 to cyclic states. It has the same number of rows as the
transition matrix and one column.
7) xhat, a matrix of smoothed estimates of the states.

When a model does not include one of these components the function returns a null matrix.
Internally, this function proceeds as follows. First, it calls thd2ss to obtain the matrices of the SS
equivalent representation corresponding to theta-din. Second, it transforms the SS model to a
block-diagonal equivalent structure, according to the eigenvalues of the transition matrix. Third, the
block-diagonal model is fed to fismod (or fismiss, if the sample contains missing values) to
obtain estimates of the different states. Fourth, the estimates of the states are assigned to the structural
components taking into account the frequencies where they show a peak of spectral power. Then, all
the states with peaks at the zero frequency are assigned to the trend, the states with peaks at seasonal
frequencies are assigned to the seasonal component and the rest of the states are assigned to the cycle.
Finally, the components are computed by combining the states with the corresponding coefficients in
the observation matrix and returned as output arguments.

Examples
The following code simulates 200 samples of the nonstationary process:
( 1 - .5B ) ( 1 - B ) ( 1 - B^4 ) y_t = ( 1 - .8B ) ( 1 - .7B^4 ) a_t ;  a_t ~ iid N( 0, .1 )

and computes its structural components.


e4init
[theta,din,lab]=arma2thd([-1.5 .5],[-1],[-.8],[-.7],[.1],4);
y=simmod(theta,din,200);
[trend,season,cycle,irreg,thetat,dint,ixmodes]=e4trend(theta,din,y);
% Displays the block-diagonal SS model
[Phi,Gam,E,H,D,C,Q,S,R]=thd2ss(thetat,dint);
[ixmodes Phi]
[H]
% Plots the components
plotsers([trend y],1,str2mat('Trend','Data'));
plotsers(cycle,-1,'Cycle');
plotsers(season,-1,'Seasonal component');
plotsers(irreg,-1,'Irregular component');

The output corresponding to the sentence [ixmodes Phi] is:

ans =
    1.0000    1.0035    0.9919         0         0         0         0
    1.0000    0.0000    0.9965         0         0         0         0
    3.0000         0         0    0.5000         0         0         0
    2.0000         0         0         0    0.4087    1.3033         0
    2.0000         0         0         0   -0.8954   -0.4087         0
    2.0000         0         0         0         0         0   -1.0000

so the first and second states correspond to the trend, the third state corresponds to the cycle and the
remaining states correspond to the seasonal component. The components are obtained by combining the
smoothed estimates of these states with the coefficients in the observation matrix, displayed by the
command [H]:
ans =
    0.8880    0.4365    2.4631    0.4109    0.1963    0.1841

The resulting components will vary in different runs of this code, but they should be similar to:

[Four figures: "Standardized plot of: Trend, Data", "Standardized plot of Cycle", "Standardized plot
of seasonal component" and "Standardized plot of irregular component"]

When the model for the data is a VARMAX or transfer function, or when the toinnov flag is
enabled, the components obtained with e4trend have the important property of converging to exact
values, i.e., to uncorrelated estimates with null variance. To visualize this feature with the data
previously simulated, run the code:
[xhat,phat,ehat]=fismod(thetat,dint,y);
trz=[];
for i=1:6:1200
trz=[trz;trace(phat(i:i+5,:))];
end
figure;
whitebg('w');
hold on
plot(trz,'k-')
title('Trace of covariance of smoothed states');
xlabel('Time')
hold off

which yields an output similar to:

[Figure: "Trace of covariance of smoothed states" plotted against Time, decaying towards zero]

Among other implications, this means that the values of the components at the end of the sample will
not be revised when the sample increases. This is a very desirable property, for example, when the
decomposition is applied to obtain seasonally adjusted data.

See Also
fismod, fismiss


fismiss, fismod

Purpose
Compute fixed interval smoothing estimates of the state and observable variables of a model in THD
form.

Synopsis
[zhat, Pz, xhat, Px] = fismiss(theta, din, z)
[xhat, P, e] = fismod(theta, din, z)

Description
These functions compute fixed interval smoothed estimates of the variables in a SS model, see
Anderson and Moore (1979) and De Jong (1989). Their main econometric applications are: a) cleaning
observation errors from a noise-contaminated sample, b) estimating missing values in the
sample and c) computing unobservable components in a model.
The function fismiss allows missing observations in the endogenous variables data, coded by NaN,
while fismod requires a complete sample.
In both cases, the input arguments are a model in THD format (theta- din) and a data matrix (z).
The number of rows of z should be equal to the number of observations. The first columns of z
correspond to the endogenous variables and the rest to the exogenous variables.
The output arguments of fismiss are: zhat, a matrix that contains the smoothed estimates of the
observable variables; Pz, a matrix containing the sequence of covariances of zhat; xhat, the
expectation of the state vector conditional on all the sample; and Px, a matrix containing the sequence
of covariances of xhat. Unless the sample of the endogenous variables is affected by observation
errors or contains missing values, the values in zhat should coincide with those in z.
The output arguments of fismod are: xhat, the expectation of the state vector conditional on all the
sample; P, the covariance matrix of this expectation; and e, a matrix of smoothed errors in the
observation equation, computed as e_t = z_t - H x_{t|N} - D u_t, where x_{t|N} denotes the smoothed state.
The details of the algorithm implemented in fismiss and fismod can be found in Casals, Jerez and
Sotoca (2000).

&KDS  3DJ 

Example
Consider the stochastic process:

( 1 - .4B ) y_t = ( 1 - .7B )( 1 - .8B^4 ) a_t ;  V[a_t] = .01

The following code defines the model, generates a sample with five missing observations and
interpolates them using fismiss.
[theta, din] = arma2thd([-.4], [], [-.7], [-.8], [.01], 4);
y = simmod(theta, din, 100);
y1 = y;
y1(50)=NaN; y1(52)=NaN; y1(55)=NaN; y1(56)=NaN; y1(58)=NaN;
[zhat, Pz, xhat, Px] = fismiss(theta, din, y1);
[y(48:60,1) y1(48:60,1) zhat(48:60,1)]

References
Anderson, B. D. O. and J. B. Moore (1979). Optimal Filtering. Englewood Cliffs, N.J.: Prentice
Hall.
Casals, J., M. Jerez and S. Sotoca (2000). Exact Smoothing for Stationary and Nonstationary Time
Series, International Journal of Forecasting, 16, 59-69.
De Jong, P. (1989), Smoothing and Interpolation with the State-Space Model, Journal of the
American Statistical Association, 84, 408, 1085-1088.


foregarc

Purpose
Computes forecasts for the endogenous variables and conditional variances of a model with GARCH
errors.

Synopsis
[yf, Bf, vf] = foregarc(theta, din, z, k, u)

Description
The use of this function is exactly the same as that of foremod. The only difference is that it returns
an additional output argument, vf, which is the expectation of the conditional covariance matrices of
the errors.

Example
Consider the following ARMA(2,1) model with GARCH(1,1) errors, in conventional notation:

y_t = (1 - .8B) / (1 - .7B + .3B^2) ε_t ;  ε_t ~ iid(0, .01) ;  ε_t | Σ_{t-1} ~ iid(0, h_t) ;
h_t = .002 + .1 ε²_{t-1} + .7 h_{t-1}

which, in the ARMA representation supported by E4, becomes:

y_t = (1 - .8B) / (1 - .7B + .3B^2) ε_t ,  such that:  ε²_t = .01 + (1 - .7B) / (1 - .8B) v_t

The following code defines the model structure, simulates 200 observations and computes 10-step-ahead
forecasts of both y_t and the conditional variance of the error:
% Model for the mean
[t1, d1, lab1] = arma2thd([-.7 .3], [], [-.8], [], [.01], 1);
% Model for the conditional variance
[t2, d2, lab2] = arma2thd([-.8], [], [-.7], [], [.01], 1);
% Full model
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
% Simulates the data and computes the forecasts
y=simgarch(theta, din, 300); y=y(101:300,1);
[yf, Bf, vf] = foregarc(theta, din, y, 10);
[yf Bf vf]

See Also
foremiss, foremod, garc2thd


foremiss, foremod

Purpose
Compute forecasts for the endogenous variables of a model in THD format.

Synopsis
[yf, Bf] = foremiss(theta, din, z, k, u)
[yf, Bf] = foremod(theta, din, z, k, u)

Description
The input arguments to both functions are: theta and din, which define the model in THD format;
z, a data matrix of the endogenous and exogenous variables; k, the forecast horizon and u, the values
of the exogenous variables for the forecast horizon.
The operation of these functions is as follows. They receive a model in THD format, convert it to the
corresponding SS formulation and then propagate the forecasting equations of the Kalman filter.
The function foremiss allows for missing data in z, marked by NaN, whereas foremod requires a
complete sample. Forecasts for models with GARCH errors should be computed using foregarc.
The output arguments are forecasts of the endogenous variables (yf) and their corresponding
covariances (Bf).

Example
Consider the following univariate model:

y_t = ( 1 - .6B ) ( 1 - .4B^4 ) ε_t ;  V[ε_t] = .1

The following code simulates 200 observations of y_t and computes five forecasts:
[theta, din, lab] = arma2thd([], [], [-.6], [-.4], [.1], 4);
y = simmod(theta, din, 250); y = y(51:250,:);
[yf, Bf] = foremod(theta, din, y, 5);
[yf Bf]

See Also
comp2thd, foregarc


garc2thd

Purpose
Converts a model with GARCH errors to THD format.

Synopsis
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2)

Description
Obtains the THD format for a model (VARMAX or transfer function) with GARCH errors.
The input arguments are:
1) t1-d1, which is the THD format associated with the model for the mean.
2) t2-d2, which is the VARMAX model for the variance in THD format.
3) lab1 and lab2, which are optional labels for the parameters in t1 and t2, respectively.
The output arguments are the vectors and matrices that define a model in THD format.

Example
Consider the following ARMA(2,1) model with GARCH(1,1) errors, in conventional notation:

y_t = (1 - .8B) / (1 - .7B + .3B^2) ε_t ;  ε_t ~ iid(0, .01) ;  ε_t | Σ_{t-1} ~ iid(0, h_t) ;
h_t = .002 + .1 ε²_{t-1} + .7 h_{t-1}

which, in the ARMA representation supported by E4, becomes:

y_t = (1 - .8B) / (1 - .7B + .3B^2) ε_t ,  such that:  ε²_t = .01 + (1 - .7B) / (1 - .8B) v_t

The following code defines and displays the model structure:


% Model for the mean
[t1, d1, lab1] = arma2thd([-.7 .3], [], [-.8], [], [.01], 1);
% Model for the conditional variance
[t2, d2, lab2] = arma2thd([-.8], [], [-.7], [], [.01], 1);
% Full model
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
prtmod(theta, din, lab);

Assume now the same model for the mean and an IGARCH(1,1) model for the conditional variance:

h_t = .002 + .3 ε²_{t-1} + .7 h_{t-1}

which in ARMA form can be written as:

ε²_t = .01 + N_t ,  with:  (1 - B) N_t = .01 + (1 - .7B) v_t

The following commands define the IGARCH structure by constraining the autoregressive parameter
to unity:
% Model for the mean
[t1, d1, lab1] = arma2thd([-.7 .3], [], [-.8], [], [.01], 1);
% Model for the conditional variance
[t2, d2, lab2] = arma2thd([-1], [], [-.7], [], [.01], 1);
% Full model
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
theta1 = [theta zeros(size(theta))]; theta1(5,2)=1;
prtmod(theta1, din, lab);

See Also
arma2thd, ss2thd, str2thd, tf2thd, prtmod


ggarch, gmiss, gmod

Purpose
Compute the analytical gradient of the log-likelihood function.

Synopsis
g = ggarch(theta, din, z)
g = gmiss(theta, din, z)
g = gmod(theta, din, z)

Description
The input arguments are a THD format specification, (theta- din) and a data matrix (z), which
should be structured as in the calls to lfgarch, lfmiss and lfmod (or lffast). The output
argument is g, a vector containing the gradient of the log-likelihood function in theta, see Terceiro
(1990). If theta includes a second column with constraint flags, then the gradient is computed with
respect to the free parameters.
To optimize a likelihood function with analytical derivatives, the name of the corresponding gradient
function should be passed to e4min as a parameter. Hence, ggarch should be used when optimizing lfgarch. In the
same way, gmiss and gmod should be used when optimizing lfmiss and lfmod/lffast,
respectively. The analytical gradient is also used in several inference procedures, see Engle (1984).

Example
Consider the model:

z_t = ( 1 - .7B ) ( 1 - .5B^12 ) a_t ;  V[a_t] = .1

The following code obtains the corresponding THD format, simulates 200 samples, computes the
maximum likelihood estimates using numerical derivatives and checks the analytical gradient:
[theta, din, lab] = arma2thd([], [], [-.7], [-.5], .1, 12);
z=simmod(theta,din,250); z=z(51:250,1);
[thopt, it, lval, g, h] = e4min('lffast', theta, '', din, z);
prtest(thopt, din, lab, z, it, lval, g, h);
g1 = gmod(thopt, din, z)

The analytical derivatives can be used in the optimization process with the following syntax:


[thopt1, it1, lval1, g1, h1] = e4min('lffast', theta, 'gmod', din, z);

See Also
e4min, lfgarch, lfmiss, lfmod, lffast

References
Engle, R. (1984). Wald, Likelihood Ratio and Lagrange Multiplier Tests in Econometrics, in Z. Griliches
and M.D. Intriligator (editors). Handbook of Econometrics, vol. II. Amsterdam: North-Holland.
Terceiro, J. (1990). Estimation of Dynamic Econometric Models with Errors in Variables. Berlin:
Springer-Verlag.


histsers

Purpose
Displays a standardized histogram for a set of time series.

Synopsis
freqs = histsers(y, mode, tit)

Description
The input arguments are: y, a matrix with n observations of m variables; mode, which selects relative
frequencies (mode=0, the default) or absolute frequencies (mode=1); and tit, a matrix of characters
whose rows contain an optional descriptive title for each series.
The output argument, freqs, is a 2×m matrix which contains the class marks and the frequency of
each of the intervals presented in the graph.

Example
y=randn(100,2);
freqs = histsers(y, 0, ['first series ';'second series'])
freqs = histsers(y, 1, ['first series ';'second series'])

See Also
descser, midents, plotsers, plotqqs, rmedser, uidents


igarch

Purpose
Computes an analytical approximation to the information matrix of lfgarch.

Synopsis
[dts, corrm, varm, Im] = igarch(theta, din, z)

Description
Computes the Watson and Engle (1983) approximation to the information matrix of a model with
GARCH errors. In general, the analytical standard errors are smaller than the corresponding
numerical approximations, thus allowing for more powerful statistical inference.
The use of this function is exactly the same as that of imod. The only difference is that it does not
accept the optional input argument aprox.

Example
Consider the following model with GARCH(1,1) errors, in conventional notation:

y_t = ε_t ;  ε_t ~ iid(0, .1) ;  ε_t | Σ_t ~ iid(0, h_t) ;  h_t = .01 + .15 ε²_{t-1} + .75 h_{t-1}

which, in the ARMA representation supported by E4, becomes:

y_t = ε_t , such that:  ε²_t = .1 + ((1 - .75B)/(1 - .9B)) v_t

The following code defines the model structure, simulates 400 observations, computes the maximum
likelihood estimates of the parameters and prints the results:
% Model for the mean
[t1, d1, lab1] = arma2thd([], [], [], [], [.1], 1);
% Model for the conditional variance
[t2, d2, lab2] = arma2thd([-.9], [], [-.75], [], [.1], 1);
% Full model
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
y=simgarch(theta, din, 500); y=y(101:500,1);
[thopt, it, lval, g, h] = e4min('lfgarch', theta, '', din, y);
prtest(thopt, din, lab, y, it, lval, g, h);


With these commands, the function prtest computes an approximation to the standard errors of the
estimates as sqrt(diag(inv(h))). To compute and display the analytical standard errors, replace
the last command by:
[std, corrm, varm, Im] = igarch(thopt, din, y);
prtest(thopt, din, lab, y, it, lval, g, h, std, corrm);

See Also
lfgarch, imod, imiss

References
Watson, M. W. and R. F. Engle (1983). Alternative Algorithms for the Estimation of Dynamic
Factor, MIMIC and Varying Coefficient Regression Models, Journal of Econometrics, 23, 3,
385-400.

imiss, imod

Purpose
Compute the exact information matrix.

Synopsis
[std, corrm, varm, Im] = imiss(theta, din, z, aprox)
[std, corrm, varm, Im] = imod(theta, din, z, aprox)

Description
These functions receive as input argument a model estimated by maximum likelihood in THD format,
convert it to the corresponding SS formulation and then compute the exact information matrix of the
estimates, see Terceiro (1990). In general, the exact standard errors are smaller than the
corresponding numerical approximations, thus allowing for more powerful statistical
inference.
The input arguments are:
1) A THD format specification (theta-din). If theta includes a second column with constraint
flags, see Chapter 5, the information matrix will only be calculated with respect to the free
parameters.
2) A data matrix (z) whose number of rows is the number of observations. The first columns of z
should correspond to the endogenous variables, while the rest correspond to the exogenous.
3) The parameter aprox is a logical indicator: if it takes the value 1, the function computes the
approximation of Watson and Engle (1983), which reduces the computational load. This argument
is optional.
The output arguments are: std, a vector containing the standard deviations of the estimates; corrm, a
matrix containing the correlation matrix of the estimates; varm, a matrix containing the covariance
matrix of the estimates; and Im, which is the information matrix.
The function imiss allows missing data in the endogenous variables. These observations should be
marked with NaN.


When the model is not locally identified, the information matrix is rank deficient, which affects the
calculations of the covariance matrix, see Terceiro (1990). In this case, imod and imiss print a
warning message.

Example
Consider the model:

z_t = (1 - .7B)(1 - .5B^12) a_t ;  V[a_t] = .1

The following code obtains the corresponding THD format, simulates 200 samples, computes the
maximum likelihood estimates using numerical derivatives and displays the results using approximate
standard errors:
[theta, din, lab] = arma2thd([], [], [-.7], [-.5], .1, 12);
z=simmod(theta,din,250); z=z(51:250,1);
[thopt, it, lval, g, h] = e4min('lffast', theta, '', din, z);
prtest(thopt, din, lab, z, it, lval, g, h);

With these commands, prtest computes an approximation to the standard errors of the estimates as
sqrt(diag(inv(h))). To compute and display the analytical standard errors, replace the last
command by:
[std, corrm, varm, Im] = imod(thopt, din, z);
prtest(thopt, din, lab, z, it, lval, g, h, std, corrm);

See Also
igarch, imodg

References
Terceiro, J. (1990). Estimation of Dynamic Econometric Models with Errors in Variables. Berlin:
Springer-Verlag.
Watson, M. W. and R. F. Engle (1983). Alternative Algorithms for the Estimation of Dynamic
Factor, MIMIC and Varying Coefficient Regression Models, Journal of Econometrics, 23, 3,
385-400.

imodg

Purpose
Computes the quasi-maximum likelihood information matrix of LFMOD and LFFAST.

Synopsis
[std, stdg, corrm, corrmg, varm, varmg, Im] = ...
imodg(theta, din, z, aprox)

Description
If the model is misspecified or its errors are non-normal, optimization of the log-likelihood function
still provides consistent estimates, but the standard errors computed by imod are no longer valid. In
this case, we will speak of quasi-maximum likelihood estimation.
The function imodg computes an information matrix robust to these specification errors, see Ljung
and Caines (1979) and White (1982).
The use of this function is exactly the same as that of imod. The only difference is that it has an
additional input argument: aprox is a logical switch. If aprox=0 the function only returns the
analytical values, that should coincide with those of imod. For any other value of aprox, the function
computes the approximation of Watson and Engle (1983).
The output arguments are the exact maximum likelihood values (std, corrm, varm and Im) and the
quasi-maximum likelihood values (stdg, corrmg, varmg).

Example
Consider the model:

z_t = (1 - .7B)(1 - .5B^12) a_t ;  V[a_t] = .1

The following code obtains the corresponding THD format, simulates 200 samples, computes the
maximum likelihood estimates using numerical derivatives and displays the results using approximate,
analytical and robust standard errors:
[theta, din, lab] = arma2thd([], [], [-.7], [-.5], .1, 12);
z=simmod(theta,din,250); z=z(51:250,1);
[thopt, it, lval, g, h] = e4min('lffast', theta, '', din, z);
prtest(thopt, din, lab, z, it, lval, g, h);
[std, stdg, corrm, corrmg, varm, varmg, Im] = imodg(thopt, din, z);
prtest(thopt, din, lab, z, it, lval, g, h, std, corrm);
prtest(thopt, din, lab, z, it, lval, g, h, stdg, corrmg);

See Also
imod, imiss, igarch

References
Watson, M.W. and R.F. Engle (1983). Alternative Algorithms for the Estimation of Dynamic
Factor, MIMIC and Varying Coefficient Regression Models, Journal of Econometrics, 23, 3,
385-400.
Ljung, L. and P.E. Caines (1979). Asymptotic Normality of Prediction Error Estimators for
Approximate System Models, Stochastics, 3, 29-46.
White, H. (1982). Maximum Likelihood Estimation of Misspecified Models, Econometrica, 50, 1, 1-25.

lagser

Purpose
Generates lags and leads for a set of time series.

Synopsis
[yl, ys] = lagser(y, ll)

Description
The input arguments are y, an n×k data matrix, and ll, a 1×l vector containing the lags (positive
numbers) and leads (negative numbers) to be applied to all the series. This function returns yl, which
contains the lagged and/or led variables, and optionally ys, an nl×k data matrix (with nl = n - maxlag - maxlead) which contains the original variables, but resized to be conformable with yl.
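
Example
A minimal usage sketch (the simulated data and the lag pattern are illustrative):

```matlab
% Two simulated series of 100 observations
y = randn(100,2);
% Request the first and fourth lags (positive) and one lead (negative)
[yl, ys] = lagser(y, [1 4 -1]);
% yl contains the lagged and led variables; ys contains the original
% series, resized to be conformable with yl
```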

See Also
transdif

lffast, lfmiss, lfmod

Purpose
Compute the exact log-likelihood function for a model in THD form.

Synopsis
[l, innov, ssvect] = lffast(theta, din, z)
[l, innov, ssvect] = lfmiss(theta, din, z)
[l, innov, ssvect] = lfmod(theta, din, z)

Description
The operation of these functions is as follows: they receive as input argument a model in THD format,
obtain the corresponding SS formulation and then compute the value of the exact log-likelihood
function.
The function lfmod computes the log-likelihood function for any of the supported formulations
except models with GARCH errors, which require lfgarch. When the endogenous variables sample
includes missing data, lfmiss should be used instead of lfmod. The missing values should be
marked with NaN. The algorithms implemented in these functions are described in Terceiro (1990).
The log-likelihood can also be computed using lffast, which is faster than lfmod, see Casals,
Sotoca and Jerez (1999).
The input arguments are a THD format specification (theta-din) and a data matrix (z). The
number of rows of z is the number of observations. The first columns of z correspond to the
endogenous variables and the rest to the exogenous.
The output arguments are:
1) l, which is a scalar that contains the value of the log-likelihood function in theta,
2) innov, which is the N×m matrix of one-step-ahead forecast errors:

innov_t = z_t - H x_{t|t-1} - D u_t

3) and ssvect, which is the N×n matrix of estimated state values. Its t-th row contains the filtered
estimate of the state vector, of size n, at time t, conditional on the information available up to t-1:

x_{t+1|t} = Φ x_{t|t-1} + Γ u_t + K_t innov_t

where K_t is the Kalman filter gain.


In many applications, the names of these functions should be passed to e4min as an input argument
to compute maximum likelihood estimates of the parameters in theta. The values of the log-likelihood are also used for other purposes, such as hypothesis testing by means of likelihood-ratio
statistics.

Example
Consider the model:

z_t = (1 - .7B)(1 - .5B^12) a_t ;  V[a_t] = .1

First, we need to obtain the corresponding THD format and simulate a sample:
[theta, din] = arma2thd([], [], [-.7], [-.5], .1, 12);
z=simmod(theta,din,250); z=z(51:250);

The following code evaluates the log-likelihood using the true values of the parameters:
l = lffast(theta, din, z)
l = lfmod(theta, din, z)
l = lfmiss(theta, din, z)

Note that the three functions return the same values. Now, the following calls to e4min compute the
maximum likelihood estimates using the faster and slower options:
[thopt, iter, lnew, gnew] = e4min('lffast', theta, '', din, z);
[thopt, iter, lnew, gnew] = e4min('lfmod', theta, 'gmod', din, z);

Finally, we generate two missing values, compute the log-likelihood and obtain the maximum likelihood estimates:
z(30)=NaN; z(90)=NaN;
[thopt, iter, lnew, gnew] = e4min('lfmiss', theta, '', din, z);

See Also
lfgarch, sete4opt

References
Casals, J. and S. Sotoca (1997). Exact Initial Conditions for Maximum Likelihood Estimation of
State Space Models with Stochastic Inputs, Economics Letters, 57, 261-267.
Casals, J., S. Sotoca and M. Jerez (1999). A Fast and Stable Method to Compute the Likelihood of
Time Invariant State-Space Models, Economics Letters, 65, 329-337.
Terceiro, J. (1990). Estimation of Dynamic Econometric Models with Errors in Variables. Berlin:
Springer-Verlag.

lfgarch

Purpose
Computes the log-likelihood function of a model with GARCH errors.

Synopsis
[l, innov, hominnov, ssvect] = lfgarch(theta, din, z)

Description
The use of lfgarch is identical to that of lfmod. The only difference is that there is an additional
output argument, hominnov, which stores the sequence of standardized residuals.

Example
Consider the following model with GARCH(1,1) errors, in conventional notation:

y_t = ε_t ;  ε_t ~ iid(0, .1) ;  ε_t | Σ_t ~ iid(0, h_t) ;  h_t = .01 + .15 ε²_{t-1} + .75 h_{t-1}

which, in the ARMA representation supported by E4, becomes:

y_t = ε_t , such that:  ε²_t = .1 + ((1 - .75B)/(1 - .9B)) v_t

The following code defines the model structure, simulates 400 observations, evaluates the log-likelihood using the true values of the parameters and, finally, computes the maximum likelihood
estimates:
% Model for the mean
[t1, d1, lab1] = arma2thd([], [], [], [], [.1], 1);
% Model for the conditional variance
[t2, d2, lab2] = arma2thd([-.9], [], [-.75], [], [.1], 1);
% Full model
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
y=simgarch(theta, din, 500); y=y(101:500,1);
[l] = lfgarch(theta, din, y)
[thopt, it, lval, g, h] = e4min('lfgarch', theta, '', din, y);
prtest(thopt, din, lab, y, it, lval, g, h);

Note that the estimates for the parameters in the conventional GARCH representation can be easily
computed with the following commands:


omega=thopt(1)*(1+thopt(2))
alpha=thopt(3)-thopt(2)
beta=-thopt(3)

See Also
lffast, lfmod, lfmiss, sete4opt

midents

Purpose
Computes and displays the multiple autocorrelation and partial autoregression functions for a set of
time series.

Synopsis
[macf, mparf, Qus] = midents(y, lag, tit)

Description
The input arguments are: a) y, an n×m matrix which contains m series of n observations each; b) lag,
the maximum lag for computing the values of the autocorrelation functions (its default value is n/4);
and c) tit, a matrix of characters which contains a descriptive title for each series. The last two
parameters are optional.
The output arguments, macf and mparf, contain the simple autocorrelation and partial
autoregression matrices. The argument Qus, contains the matrix of Ljung-Box Q statistics computed
using the first lag values of macf.
The function also prints out the values of these functions and their representation in '+' '.' '-' format,
where '+' indicates a significant positive value, '-' a significant negative value and '.' indicates a
non-significant value. The significance of these coefficients is tested using the asymptotic standard
deviation 2/√n. The values in macf are also displayed in cross-correlation function form.

Example
The following code generates a 100×2 matrix of gaussian white noise and displays ten lags of the
multiple autocorrelation and partial autoregression functions:
y=randn(100,2);
midents(y,10);

See Also
descser, plotsers, plotqqs, rmedser, uidents

nest2thd

Purpose
Converts a stacked model into a nested model in THD format.

Synopsis
[theta, din, label] = nest2thd(t, d, nestwat, l)

Description
The input arguments are: 1) t-d-l, the THD formulation of the stacked model; and 2) nestwat, a
logical indicator: if it takes the value 1, the models are nested in inputs and, if it takes the value 0,
they are nested in errors.

Example
Given the transfer function:

y_t = ((.3 + .6B)/(1 - .5B)) u_{1t} + ((1 - .8B)/(1 - .6B)) ε_t ;  σ²_ε = 1

where u_{1t} is such that:

(1 - .7B) u_{1t} = a_t ;  σ²_a = .3

The endogeneization of the exogenous variable requires the following code:

% Defines the transfer function
w1 = [.3 .6]; d1 = [-.5];
fr = [-.6];
ar = [-.8];
v = [1.0];
[t1,d1,l1]=tf2thd(fr,[],ar,[],v,1,[w1],[d1]);
% Defines the input model
[t2,d2,l2]=arma2thd(-.7,[],[],[],.3,1);
% Stacks the models and translates the stacked model to the final nested
% formulation
[theta, din, lab] = stackthd(t1, d1, t2, d2, l1, l2);
[theta, din, lab] = nest2thd(theta, din, 1, lab);
prtmod(theta,din,lab);

See Also
arma2thd, comp2thd, garc2thd, prtmod, ss2thd, stackthd, str2thd, tf2thd

plotqqs

Purpose
Plots the quantile graphs for a set of time series.

Synopsis
[nq, yq] = plotqqs(y, lab)

Description
The function plotqqs displays the quantile graph under normality for a set of time series. Along
with the histogram, this is a rough tool for assessing the normality of a series. In the graph, the
theoretical quantiles under normality (a straight line with unit slope) are displayed along with the
empirical quantiles obtained from the standardized series. The OLS regression of empirical on
theoretical quantiles is also shown.
The input arguments are: y, an n×m matrix which contains m series of n observations each, and lab,
a matrix of characters whose rows contain an optional descriptive title for each series.
This function returns the theoretical quantiles in nq and the empirical quantiles in yq.

Example
The following code generates 100 samples of gaussian white noise and displays the quantile plots:
y=randn(100,1);
plotqqs(y);

See Also
descser, histsers, plotsers, rmedser

plotsers

Purpose
Displays a plot of centered and standardized time series versus time.

Synopsis
ystd = plotsers(y, mode, lab)

Description
The input arguments of plotsers are:
1) y, an n×m matrix which contains m series of n observations each,
2) mode, an optional parameter that selects the type of display. If mode = 0, each series is displayed
in a different graph (default value); if mode = 1, all the series (up to seven) are represented in a
single graph; last, if mode = -1, each series is displayed in a separate graph, but all of them have
the same axes.
3) lab, a matrix of characters whose rows contain an optional descriptive title for each series.
This function returns the centered and standardized series in ystd.
The resulting plot includes ±2 bands. If a stationary and homoscedastic series is gaussian,
approximately 95% of its values should lie between these bands.

Example
The following code generates and plots 100 samples of gaussian white noise:
plotsers(randn(100,1));

See Also
histsers, rmedser, plotqqs, uidents, midents

prtest

Purpose
Displays the estimation results.

Synopsis
prtest(thopt, din, lab, y, it, lval, g, h, std, corrm, t)

Description
The input parameters are provided by e4min and optionally by imod, imiss, igarch or imodg,
and are the following:
1) thopt-din, the THD format specification of the model. The vector thopt is an output argument
of e4min.
2) lab, label matrix that documents the parameters in thopt.
3) y, an n×m matrix which contains the m series of n observations each that have been used for model
estimation.
4) it, number of iterations. This parameter is an output argument of e4min.
5) lval, value of the log-likelihood function in thopt. This parameter is an output argument of
e4min.
6) g, gradient of the objective function in thopt. This parameter is an output argument of e4min.
7) h, hessian of the objective function in thopt. This parameter is an output argument of e4min.
8) std, vector of analytical standard deviations in thopt. This parameter is optional, and should be
computed using imod, imiss, igarch or imodg.
9) corrm, matrix of analytical correlations between the estimates in thopt. This parameter is an
output argument of imod, imiss, igarch or imodg.

10) t, elapsed computing time. This value should be computed by the user using the MATLAB
functions tic and toc.


The parameters std, corrm and t are optional. If std and corrm are not specified, or specified with
empty matrices, [], standard errors and correlations between estimates are computed using
numerical approximations.
This function returns no output arguments.
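
Example
A typical call chain is sketched below (the AR(1) model and its parameter values are illustrative); tic and toc provide the optional elapsed-time argument, and the empty matrices request numerical standard errors and correlations:

```matlab
% Define and simulate an AR(1) model, then estimate and print the results
[theta, din, lab] = arma2thd([-.7], [], [], [], [.1], 1);
z = simmod(theta, din, 250); z = z(51:250,1);
tic;
[thopt, it, lval, g, h] = e4min('lffast', theta, '', din, z);
t = toc;
prtest(thopt, din, lab, z, it, lval, g, h, [], [], t);
```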

See Also
e4min, imod, imiss, igarch, imodg, prtmod

prtmod

Purpose
Displays information about a model in THD format.

Synopsis
prtmod(theta, din, lab)

Description
The input argument is a model in THD format (theta-din) and, optionally, a label matrix (lab) to
document the parameters in theta. This function returns no output arguments.
The function prtmod is used mainly to check if the definition of a model is correct.
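
Example
A minimal sketch (the AR(1) model with a constant is illustrative):

```matlab
% Define an AR(1) model with constant term .8 and error variance .1
[theta, din, lab] = arma2thd([-.5], [], [], [], [.1], 1, [.8], 1);
% Display the model definition to check it is correct
prtmod(theta, din, lab);
```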

See Also
prtest

residual

Purpose
Computes the residuals and smoothed error estimates of a model, as well as the corresponding
covariance matrices.

Synopsis
[z1, vT, wT, vz1, vvT, vwT] = residual(theta, din, z, stand)

Description
This function is used mainly for model validation.
The input arguments are a THD format specification (theta-din) and a data matrix (z). The
optional parameter stand selects between standardized (stand=1) or ordinary values (stand=0
or argument omitted).
The output arguments are the following:
1) z1, a matrix of residuals (standardized if stand=1) computed as:

z1_t = z_t - H x_{t|t-1} - D u_t

which can be interpreted as one-step-ahead forecast errors.
2) vT, a matrix of smoothed residuals (standardized if stand=1) computed as:

vT_t = z_t - H x_{t|N} - D u_t

This argument is returned empty in the case of GARCH models.
3) wT, a matrix of smoothed state errors (standardized if stand=1). This argument is returned
empty in the case of GARCH models.
4) vz1, a matrix which stacks the covariance matrices of z1.
5) vvT, a matrix which stacks the covariance matrices of vT. This argument is returned empty in the
case of GARCH models.
6) vwT, a matrix which stacks the covariance matrices of wT. This argument is returned empty in the
case of GARCH models.


Most empirical analyses use the innovations z1 for model validation, through testing whether they could be
a sample realization of a zero-mean homoscedastic white noise process. In structural time series
models some authors, see Harvey and Koopman (1992), propose the use of the smoothed errors, also
known in this literature as auxiliary residuals, to detect outliers and structural changes in the
unobservable components.

Example
Consider the model:

z_t = (1 - .7B)(1 - .5B^12) a_t ;  V[a_t] = .1

The following code obtains the corresponding THD format, simulates 200 samples, computes the
maximum likelihood estimates and prints the residuals:
[theta, din, lab] = arma2thd([], [], [-.7], [-.5], .1, 12);
z=simmod(theta,din,250); z=z(51:250,1);
[thopt, it, lval, g, h] = e4min('lffast', theta, '', din, z);
z1 = residual(thopt, din, z);

The resulting series z1 can be then analyzed using other functions, such as descser or uidents, to
validate the model.

References
Harvey, A.C. and Koopman, S.J. (1992). Diagnostic Checking of Unobserved-Components Time
Series Models, Journal of Business and Economic Statistics, vol. 10, 4, 377-389.

See Also
descser, uidents, midents, lffast, lfmod, lfmiss, fismod, fismiss, sete4opt

rmedser

Purpose
Displays a scaled plot of sample means versus sample standard deviations for a set of time series.

Synopsis
[med, std] = rmedser(y, len, lab)

Description
Computes and displays a standardized XY plot of sample means (on the X axis) versus sample
standard deviations (on the Y axis) for a set of time series.
The configuration of this plot helps to select an adequate value for the λ parameter of the Box-Cox
transformation. For example, a linear relationship between the mean and the standard deviation with
positive slope indicates that the series requires a logarithmic transformation (λ=0). On the other hand,
a random scattering of the data points indicates that the series does not require a transformation
(λ=1). A nonlinear relationship indicates that the series requires a transformation with -1 < λ < 1.
The input arguments are:
1) y, matrix whose columns correspond to the series to be represented. All of them should have the
same number of observations.
2) len, number of observations to be used in computing sample means and standard deviations. For
seasonal series, an adequate choice is any integer multiple of the seasonal period.
3) lab, matrix of characters whose rows contain an optional descriptive title for each series.
The parameters len and lab are optional.
The output arguments are med, a matrix of sample means; and std, a matrix of sample standard
deviations.

Example
The following code generates and plots 100 samples of lognormal white noise:

y=exp(randn(100,1));
rmedser(y,10,'log-normal sample');

Note that the sample standard deviation increases linearly with the sample mean. The same plot
computed for the log-transformed series should not show any clear relationship.
rmedser(log(y),10,'log-transformed data');

See Also
transdif, histsers, plotsers, uidents, midents, plotqqs

sete4opt

Purpose
Allows the user to modify the toolbox options.

Synopsis
opt = sete4opt(o1,v1, o2,v2, o3,v3, o4,v4, o5,v5, o6,v6, ...
o7,v7, o8,v8, o9,v9, o10,v10)

Description
The sete4opt function manages the toolbox options by modifying the internal vector E4OPTION. It
allows three different calls:
1) sete4opt, without any argument, restores the default options.
2) sete4opt('show') shows current options. If the function is called with this argument, no other
argument should be included.
3) sete4opt(option, value, ...) is the most usual call, where:
- The argument option stands for the name of the option to be modified, and value stands for
the new choice.
- option must be a character string, enclosed by quotes. It is enough to indicate the first three
letters.
- value may be a character string, enclosed by quotes, or a numeric value. If it is a character
string, it is enough to indicate its first three letters.
- A single call may contain several option-value pairs, up to a maximum of ten.
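
For example (the option values shown are illustrative):

```matlab
sete4opt('show');                               % display the current options
sete4opt('algorithm', 'bfgs', 'maxiter', 100);  % modify two options in one call
sete4opt('alg', 'bfg', 'tol', 1e-4);            % the first three letters are enough
sete4opt;                                       % restore the default options
```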
The different options and values are summarized in the following table.

Option        Description                                        Possible values

Functions that control the estimation process:

'filter'      Filter used in the evaluation of the               'kalman', 'chandrasekhar'
              likelihood function
'scale'       Scales matrices when computing their               'no', 'yes'
              Cholesky decomposition during filtering
'econd'       Algorithm for computing the initial value          'iu', 'au', 'ml', 'zero', 'auto'
              of the state vector
'vcond'       Algorithm for computing the initial state          'lyapunov', 'zero', 'idejong'
              vector covariance matrix
'var'         Selects between estimation of the covariance       'variance', 'factor'
              matrix or estimation of the Cholesky factor
              of the covariance matrix

Functions that control the behaviour of e4min:

'algorithm'   Optimization algorithm                             'bfgs', 'newton'
'step'        Maximum step length during optimization            0.1
'tolerance'   Stop criteria tolerance                            1.0e-5
'maxiter'     Maximum number of iterations                       75
'verbose'     Display output at each iteration                   'yes', 'no'

For each option, the first value listed is the default; for the numeric options, the value shown is the
default and other reasonable values are admissible.

See Also
e4init

simgarch, simmod

Purpose
Simulate the endogenous variables of a model in THD format.

Synopsis
y = simgarch(theta, din, N, u)
y = simmod(theta, din, N, u)

Description
The input arguments of these functions are a model in THD format (theta-din), the number of
observations to be generated (N) and the exogenous variable data matrix (u). If u=[], the model
has no exogenous variables. The output argument is y, an N×m matrix which contains the
realization of the endogenous variables.
The function simgarch is used to simulate models with GARCH errors. The rest of the formulations
supported by E4 can be simulated by simmod.
These functions operate as follows: the model received as input argument is converted to the
equivalent SS representation. Using this formulation and a white noise realization obtained with the
MATLAB function randn, a realization of the endogenous variables is computed. As a general
practice, it is advisable to omit the first observations of the sample.

Example
To obtain a realization of 200 observations of the model:

y_{1t} = .9 + .3 y_{1,t-1} + a_{1t}
y_{2t} = .7 + .4 y_{1,t-1} + a_{2t} - .8 a_{2,t-4}

V[(a_{1t}, a_{2t})'] = [ 1  .9 ; .9  1 ]
the following code can be used:


[theta, din, lab] = arma2thd([-.3 NaN; -.4 NaN], [], [], ...
    [NaN NaN; NaN -.8], [1 .9; .9 1], 4, [.9;.7], 1);
% Generate the exogenous (constant) variable
u = ones(250,1);
% Compute the simulated sample and omit the first 50 observations
y = simmod(theta, din, 250, u);
y = y(51:250,:);

See Also
arma2thd, str2thd, ss2thd, garc2thd, tf2thd

ss_dv, garch_dv

Purpose
Computes the derivatives of the SS matrices of a model with respect to the i-th parameter.

Synopsis
[dPhi, dGam, dE, dH, dD, dC, dQ, dS, dR] = ss_dv(theta, din, i)
[dPhi, dGam, dE, dH, dD, dC, dQ, dPhig, dGamg, dEg, dHg, dDg] = ...
garch_dv(theta, din, i)

Description
These functions return the partial derivatives of the SS matrices of any model in THD form with
respect to the i-th parameter of theta. The function ss_dv is used for SS models and garch_dv is
used for models with GARCH errors. The derivatives provided by these functions are used internally
to compute analytic gradients and information matrices. They can also be useful to simplify the
coding of user functions, see Chapter 7.
The input arguments are a THD model definition (theta-din) and the position in theta of the
parameter for which the derivatives are computed (i). The output arguments preceded by the letter d
are derivatives of the corresponding SS matrices.
In garch_dv the output arguments dPhi, dGam, dE, dH, dD, dC and dQ are the derivatives of the SS
model for the mean, and the output arguments dPhig, dGamg, dEg, dHg and dDg are the derivatives
of the SS model for the variance.
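
Example
A minimal sketch (the ARMA(1,1) model and its parameter values are illustrative):

```matlab
% Define an ARMA(1,1) model in THD format
[theta, din] = arma2thd([-.5], [], [-.3], [], [.1], 1);
% Derivatives of the SS matrices with respect to the first parameter
[dPhi, dGam, dE, dH, dD, dC, dQ, dS, dR] = ss_dv(theta, din, 1);
```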

See Also
ss_dvp, garc_dvp

ss_dvp, garc_dvp

Purpose
Computes the derivatives of the SS matrices of a model in the direction of any vector.

Synopsis
[dPhi, dGam, dE, dH, dD, dC, dQ, dS, dR] = ss_dvp(theta, din, p)
[dPhi, dGam, dE, dH, dD, dC, dQ, dPhig, dGamg, dEg, dHg, dDg] = ...
garc_dvp(theta, din, p)

Description
These functions return the partial derivatives of the SS matrices of any model in THD format in the
direction of a vector p. The function ss_dvp is used for SS models and garc_dvp is used for
models with GARCH errors.
The input arguments are a THD model definition (theta-din) and a vector chosen by the user (p).
The output arguments preceded by the letter d are derivatives of the corresponding SS matrices.
In garch_dvp the output arguments dPhi, dGam, dE, dH, dD, dC and dQ are the derivatives of the
SS model for the mean, and the output arguments dPhig, dGamg, dEg, dHg and dDg are the
derivatives of the SS model for the variance.
These functions are used mainly to simplify the coding of user functions, see Chapter 7.
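
Example
A minimal sketch (the model and the direction vector are illustrative):

```matlab
% Define an ARMA(1,1) model in THD format
[theta, din] = arma2thd([-.5], [], [-.3], [], [.1], 1);
% Direction vector: one entry per parameter in theta
p = ones(size(theta,1), 1);
% Directional derivatives of the SS matrices
[dPhi, dGam, dE, dH, dD, dC, dQ, dS, dR] = ss_dvp(theta, din, p);
```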

See Also
ss_dv, garch_dv

ss2thd

Purpose
Converts a SS model to THD format.

Synopsis
[theta, din, lab] = ss2thd(Phi, Gam, E, H, D, C, Q, S, R)

Description
The function ss2thd obtains the THD format representation of any model in the form:

x_{t+1} = Φ x_t + Γ u_t + E w_t
z_t = H x_t + D u_t + C v_t

where:
x_t is an (n×1) vector of state variables,
u_t is an (r×1) vector of exogenous variables,
z_t is an (m×1) vector of observable variables,
w_t and v_t are white noise processes such that: E[w_t] = 0, E[v_t] = 0 and

E[ (w_{t1}; v_{t1}) (w_{t2}', v_{t2}') ] = [ Q  S ; S'  R ] δ_{t1,t2}

being Q and R positive definite matrices.


The input arguments are the parameter matrices Phi (Φ), Gam (Γ), E (E), H (H), D (D), C (C), Q (Q),
S (S) and R (R). If any of the elements in these matrices, except in the covariances, are zero, they
should be specified with NaN.
The output arguments are the vectors and matrices that define a model in THD format.
The user should also take into account that:


1) If Q and R are defined as column vectors, they are considered diagonal matrices.
2) The identity Q = S = R occurs in many formulations. To specify it, call the function without the
last two parameters, S and R.
3) For deterministic models with observation errors, one may define Q = [] and S = [], which
indicates that no error exists in the state equation.
4) To formulate models without error in the observation equation, one should define R = [] and
S = [].
5) If the state and observation errors are independent, define S = [].
6) If the matrices Gam and/or D are null ([]), the variables in u t do not affect the state and/or
observation equation.
7) If the matrices E and/or C are null, [], they are replaced internally by the identity matrix.
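
Example
The following sketch formulates a random walk plus noise model, x_{t+1} = x_t + w_t, z_t = x_t + v_t, with independent state and observation errors (the variances are illustrative):

```matlab
% SS matrices of a random walk plus noise model
Phi = [1]; Gam = []; E = [1]; H = [1]; D = []; C = [1];
Q = [.1]; S = []; R = [.2];   % S = [] declares independent errors
[theta, din, lab] = ss2thd(Phi, Gam, E, H, D, C, Q, S, R);
prtmod(theta, din, lab);
```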

See Also
arma2thd, str2thd, garc2thd, tf2thd, comp2thd

stackthd

Purpose
Stacks two models in THD format.

Synopsis
[theta, din, label] = stackthd(t1, d1, t2, d2, l1, l2)

Description
The input arguments are: 1) t1, d1, l1, the THD formulation of the first model; and 2) t2, d2,
l2, the THD representation of the second model. The function returns the stacked model defined in
THD format, where theta = [t1; t2], din = [d1; d2] and label = [l1; l2].

Example
The model:

y_t = .8 + .3 y_{t-1} - .4 y_{t-2} + a_t ;  V[a_t] = .1
y*_t = y_t + v_t ;  V[v_t] = .2

can be stacked in THD format with the following code:


[tha, da, laba] = arma2thd([-.3 .4], [], [], [], [.1], 1, [.8], 1);
[thc, dc, labc] = arma2thd([], [], [], [], [.2], 1);
[ts, ds, ls]= stackthd(tha, da, thc, dc, laba, labc);

See Also
arma2thd, comp2thd, nest2thd, ss2thd, str2thd, tf2thd, garc2thd, prtmod

str2thd

3XUSRVH
Converts a structural econometric model to THD format.

6\QRSVLV
[theta,din,lab] = str2thd([FR0 ... FRp],[FS0 ... FSps], ...
[AR0 ... ARq],[AS0 ... ASqs],v,s,[G0 ... Gg],r)

Description
The function str2thd obtains the THD format representation of any model in the form:

   FR(B) FS(B^S) y_t = G(B) u_t + AR(B) AS(B^S) e_t

where S denotes the length of the seasonal period, B is the backshift operator, such that for any
sequence x_t: B^k x_t = x_{t-k}, y_t is an (m×1) vector of endogenous variables, u_t is an (r×1)
vector of exogenous variables, e_t is an (m×1) vector of white noise errors, and:

   FR(B)   = FR0 + FR1 B + ... + FRp B^p
   FS(B^S) = FS0 + FS1 B^S + ... + FSps B^(ps*S)
   G(B)    = G0 + G1 B + ... + Gg B^g
   AR(B)   = AR0 + AR1 B + ... + ARq B^q
   AS(B^S) = AS0 + AS1 B^S + ... + ASqs B^(qs*S)

The input arguments of str2thd are:


1) The matrices of the autoregressive and moving average factors, [FR0...FRp],[AR0...ARq].
2) The matrices of seasonal autoregressive and moving average factors
[FS0...FSps],[AS0...ASqs].

3) The covariance matrix of e_t, v. If this matrix is defined as a vector, the disturbances are assumed
to be independent. In order not to impose this constraint, it is necessary to define at least the lower
triangle of the matrix. This matrix cannot contain NaN. To impose independence between two
errors, the user can set the corresponding covariance to zero and, afterwards, impose a fixed-parameter constraint on this value, see Chapter 5.

4) The parameter s indicates the seasonal period (e.g., s=1 for nonseasonal data, s=4 for quarterly
data, s=12 for monthly data).
5) The parameter matrix [G0 ... Gg] and the number of exogenous variables, r. In this function,
the number of exogenous variables cannot be 0.
If any of the matrices (except v) is null, it should be specified using an empty matrix, []. If any of the
elements in these matrices, except in v, are null, they should be specified with NaN.
The output arguments are the vectors and matrices that define a model in THD format.

Example
Consider the structural model:

   [1 -.3; 0 1] y_t = ([.9 0; 0 .7] + [0 0; 0 -.4] B) u_t + ([1 0; 0 1] + [0 0; .2 -.8] B) e_t

   V[e_t] = [1 0; 0 .8]

where y_t = [y1t; y2t], u_t = [u1t; u2t] and e_t = [e1t; e2t].
The following code defines the model matrices, converts them to THD format and displays the model
structure:
FR0 = [1 -.3; NaN 1];
AR1 = [NaN NaN; .2 -.8];
G0 = [.9 NaN ; NaN .7 ];
G1 = [NaN NaN ; NaN -.4];
v = [1 .8];
[theta, din, lab] = str2thd(FR0,[],AR1,[],v,1,[G0 G1],2);
prtmod(theta, din, lab);

Note that the constant term has been included by means of an exogenous variable.

See Also
arma2thd, ss2thd, garc2thd, tf2thd, comp2thd, nest2thd, prtmod


tf2thd

Purpose
Converts a transfer function model to THD format.

Synopsis
[theta, din, lab] = tf2thd([fr1 ... frp], [fs1 ... fsps], ...
[ar1 ... arq],[as1 ... asqs],v,s,[w1; ...; wr],[d1; ...; dr])

Description
The function tf2thd obtains the THD format representation of any model in the form:

   y_t = [ω1(B)/δ1(B)] u1t + ... + [ωr(B)/δr(B)] urt + [θ(B) Θ(B^S) / (φ(B) Φ(B^S))] e_t

where:
   y_t is the value of the endogenous variable at time t,
   u_t = [u1t, ..., urt]' is an (r×1) vector of exogenous variables,
   e_t is a white noise error, and

   ωi(B) = ωi0 + ωi1 B + ωi2 B^2 + ... + ωi,ni B^ni ;   i = 1, 2, ..., r
   δi(B) = 1 + δi1 B + ... + δi,ndi B^ndi ;             i = 1, 2, ..., r
   φ(B)   = 1 + φ1 B + ... + φp B^p
   Φ(B^S) = 1 + Φ1 B^S + ... + ΦP B^(P*S)
   θ(B)   = 1 + θ1 B + ... + θq B^q
   Θ(B^S) = 1 + Θ1 B^S + ... + ΘQ B^(Q*S)

The input arguments are:


1) The parameters of the regular and seasonal AR factors of the noise model, [fr1...frp],
[fs1...fsps].
2) The parameters of the regular and seasonal MA factors of the noise model, [ar1...arq],
[as1...asqs].

3) The variance of e_t, v.
4) The scalar s, which indicates the length of the seasonal period (e.g., s=1 for nonseasonal data,
s=4 for quarterly data, s=12 for monthly data).

5) The coefficients of the polynomials ωi(B) and δi(B), which are specified in the rows of
[w1; ...; wr] and [d1; ...; dr], respectively.
The matrices fr, fs, ar and as are row vectors. All the matrices, except [w1; ...; wr], can be
empty, [], and may include the value NaN to mark parameters with a null value.
All the arguments are required. If exogenous variables are not included in the model, a VARMA
representation should be used instead.
The output arguments are the vectors and matrices that define a model in THD format.

Example
Consider the transfer function:

   y_t = .4 + .9 u1,t-2 + [.5/(1 - .3B)] u2t + N_t

   N_t = (1 - .7B)(1 - .8B^12) a_t ,   V[a_t] = .1

where the constant term .4 enters as an additional exogenous variable equal to one.
The following code defines and displays its structure:


[theta, din, lab] = tf2thd([], [], [-.7], [-.8], [.1], 12, ...
[.4 NaN NaN; NaN NaN .9; .5 NaN NaN], [NaN; NaN; -.3]);
prtmod(theta, din, lab);

See Also
arma2thd, ss2thd, str2thd, garc2thd, prtmod


thd2arma, thd2str, thd2tf

Purpose
Convert a simple model in THD format to the corresponding standard formulation.

Synopsis
[F, A, V, G] = thd2arma(theta, din)
[F, A, V, G] = thd2str(theta, din)
[F, A, V, W, D] = thd2tf(theta, din)

Description
These functions convert a simple model in THD format to the standard formulation of a VARMAX
model, a structural econometric model or a transfer function. Hence, they are the reciprocals of
arma2thd, str2thd and tf2thd, respectively.

Example
The model:

   [1 -.3; 0 1] y_t = ([.9 0; 0 .7] + [0 0; 0 -.4] B) u_t + ([1 0; 0 1] + [0 0; .2 -.8] B) a_t

   V[a_t] = [1 0; 0 .8]

with y_t = [y1t; y2t], u_t = [u1t; u2t] and a_t = [a1t; a2t],
can be converted to THD format using the str2thd function:


[theta, din] = str2thd([1 -.3; NaN 1], [], [NaN NaN; .2 -.8], [], ...
[1; .8], 1, [.9 NaN NaN NaN; NaN .7 NaN -.4], 2);

and the matrices in the standard representation are recovered with the command:
[F, A, V, G] = thd2str(theta, din)

The use of thd2arma is completely analogous. As for thd2tf, consider the transfer function:


   y_t = .4 + .9 u1,t-2 + [.5/(1 - .3B)] u2t + N_t

   (1 - B)(1 - B^12) N_t = (1 - .7B)(1 - .8B^12) a_t ,   V[a_t] = .1

which can be translated to THD format by the command:


[theta, din] = tf2thd([-1],[-1],[-.7],[-.8],[.1], 12,...
[.4 NaN NaN; NaN NaN .9; .5 NaN NaN], [NaN; NaN; -.3]);

and then the model polynomials are recovered using thd2tf:


[F, A, V, W, D] = thd2tf(theta,din)

See Also
arma2thd, str2thd, tf2thd


thd2ss

Purpose
Converts any model in THD format to the corresponding SS representation.

Synopsis
[Phi, Gam, E, H, D, C, Q, S, R] = thd2ss(theta, din)

Description
The function thd2ss is the reciprocal of ss2thd. It receives a model in THD format and returns the
matrices of its SS formulation:

   x_{t+1} = Φ x_t + Γ u_t + E w_t ;    E[w_t] = 0
   z_t     = H x_t + D u_t + C v_t ;    E[v_t] = 0

where:

   E[ (w_t1; v_t1) (w_t2; v_t2)' ] = [Q S; S' R] δ_{t1,t2} ,
   with δ_{t1,t2} = 1 if t1 = t2, and 0 if t1 ≠ t2.

The input argument is a THD model definition (theta-din). The output arguments are the
parameter matrices Phi (Φ), Gam (Γ), E (E), H (H), D (D), C (C), Q (Q), S (S) and R (R).
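A brief sketch of the round trip (the AR(1) parameter values are arbitrary, and the exact state-space realization returned depends on the toolbox internals):

```matlab
% Sketch, assuming the E4 toolbox is on the MATLAB path.
% Define an AR(1) model, (1 - .5B) y_t = a_t with V[a_t] = .1, in THD
% format and recover the matrices of its SS representation.
[theta, din] = arma2thd([-.5], [], [], [], [.1], 1);
[Phi, Gam, E, H, D, C, Q, S, R] = thd2ss(theta, din);
```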

See Also
ss2thd


tomod, touser

Purpose
Disables or enables the user model flag in a THD model specification.

Synopsis
din = tomod(din)
din = touser(din, userf, userfg)

Description
The function tomod disables the user model flag in a THD model specification, while touser
activates the user model indicator in din and adds the user function. The input argument userf is the
name of the user function, see Chapter 7, and userfg is an optional parameter for building derivatives.
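A minimal usage sketch (here myusermod is a hypothetical user function name; its required interface is described in Chapter 7):

```matlab
% Sketch, assuming the E4 toolbox is on the MATLAB path and that a
% user function myusermod.m (hypothetical name) follows Chapter 7.
din = touser(din, 'myusermod');   % enable the user model flag
% ... work with the user-defined model ...
din = tomod(din);                 % restore the standard model flag
```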

See Also
arma2thd, str2thd, garc2thd, ss2thd, tf2thd


transdif

Purpose
Applies stationarity inducing transformations to a set of time series.

Synopsis
z = transdif(y, lambda, d, ds, s)

Description
Computes the Box-Cox (1964) transformation and the regular and seasonal differences of a time
series.
The input arguments are: a) y, a matrix whose columns correspond to the different series to be
transformed; b) lambda, the parameter of the Box-Cox transformation; c) d, the order of regular
differencing; d) ds, an (S×1) vector containing the orders of seasonal differencing (default value ds=0);
and e) s, an (S×1) vector containing the lengths of the seasonal periods (default value s=1). The last
two parameters are optional and can be omitted if seasonal differences are not required.
The output argument is the differenced and transformed series z, such that:

   z_t = ∇^d ∇_{s1}^{ds1} ∇_{s2}^{ds2} ... ∇_{sS}^{dsS} y_t^(λ)

where:

   ∇^d = (1 - B)^d
   ∇_{si}^{dsi} = (1 - B^si)^dsi ,   i = 1, 2, ..., S

and:

   y_t^(λ) = [ (y_t + m)^λ - 1 ] / λ   if λ ≠ 0
   y_t^(λ) = ln( y_t + m )             if λ = 0

The parameter m is zero if all the values of y_t are positive, and equal to |min(y)| + 10^-5 in other
cases.
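As a usage sketch (the series below is arbitrary), a log transformation followed by one regular and one seasonal difference of a monthly series, z_t = (1 - B)(1 - B^12) ln y_t, would be obtained with:

```matlab
% Sketch, assuming the E4 toolbox is on the MATLAB path.
y = cumsum(randn(120,1)) + 100;   % arbitrary monthly test series
z = transdif(y, 0, 1, 1, 12);     % lambda=0 (log), d=1, ds=1, s=12
```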

References
Box, G. E. P. and D. R. Cox (1964). An Analysis of Transformations, Journal of the Royal
Statistical Society, B, 26, 211-243.


uidents
Purpose
Displays the univariate simple and partial autocorrelation functions for a set of time series.

Synopsis
[acf, pacf, qs] = uidents(y, lag, tit)

Description
The input arguments are: a) y, an (n×m) matrix which contains m series of n observations each; b) lag,
the maximum lag for computing the values of the autocorrelation functions; its default value is n/4;
and c) tit, an optional matrix of characters which contains a descriptive title for each
series. The last two parameters are optional.
The output arguments are the matrices acf and pacf, whose columns contain the sample
autocorrelation function and partial autocorrelation function of each series, and qs, which is a (1×m)
vector containing the values of the Ljung-Box Q statistic for each series, computed with the first lag
values of the autocorrelation function.

Example
The following code generates a (100×2) matrix of Gaussian white noise and displays ten lags of the
corresponding autocorrelation functions:
y=randn(100,2);
uidents(y,10);

See Also
midents, histsers, plotsers, rmedser, plotqqs


9 References
Anderson, B. D. O. and J. B. Moore (1979). Optimal Filtering. Englewood Cliffs (N.J.): Prentice
Hall.
Bollerslev, T. (1986). Generalized Autoregressive Conditional Heteroscedasticity, Journal of
Econometrics, 31, 307-327.
Bollerslev, T., R.F. Engle and D.B. Nelson (1994). ARCH Models, in R.F. Engle and D.L.
McFadden (editors), Handbook of Econometrics, vol. IV. Amsterdam: North-Holland.
Box, G. E. P. and D. R. Cox (1964). An Analysis of Transformations, Journal of the Royal
Statistical Society, B, 26, 211-243.
Box, G.E.P., G. M. Jenkins and G.C. Reinsel (1994). Time Series Analysis, Forecasting and
Control. Englewood Cliffs (N. J.): Prentice-Hall.
Casals, J. (1997). Métodos de Subespacios en Econometría. PhD Thesis. Madrid: Universidad
Complutense.
Casals, J. and S. Sotoca (1997). Exact Initial Conditions for Maximum Likelihood Estimation of
State Space Models with Stochastic Inputs, Economics Letters, 57, 261-267.
Casals, J., S. Sotoca and M. Jerez (1999). A Fast and Stable Method to Compute the Likelihood of
Time Invariant State-Space Models, Economics Letters, 65, 3, 329-337.
Casals, J., M. Jerez and S. Sotoca (2000). Exact Smoothing for Stationary and Nonstationary Time
Series, International Journal of Forecasting, 16, 59-69.
Chatfield, C. and D.L. Prothero (1973). Box-Jenkins Seasonal Forecasting: Problems in a Case
Study, Journal of the Royal Statistical Society, A, 136, 295-336.
De Jong, P. (1989), Smoothing and Interpolation with the State-Space Model, Journal of the
American Statistical Association, 84, 408, 1085-1088.
De Jong, P. and S. Chu-Chun-Lin (1994). Stationary and Non-Stationary State Space Models,
Journal of Time Series Analysis, 15, 2, 151-166.

Dennis, J.E. and R.B. Schnabel (1983). Numerical Methods for Unconstrained Optimization and
Nonlinear Equations. Englewood Cliffs (N. J.): Prentice-Hall.
Dickey, D.A. and W.A. Fuller (1981). Likelihood Ratio Statistics for Autoregressive Time Series
with a Unit Root, Econometrica, 49, 1057-1072.
Engle, R.F. (1982). Autoregressive Conditional Heteroskedasticity with Estimates of the Variance
of U.K. Inflation, Econometrica, 50, 987-1008.
Engle, R.F. (1984). Wald, Likelihood and Lagrange Multiplier Tests in Econometrics, in Z.
Griliches and M.D. Intriligator (editors), Handbook of Econometrics, vol. II. Amsterdam:
North-Holland.
Engle, R.F. and D. Kraft (1983). Multiperiod Forecast Error Variances of Inflation Estimated from
ARCH models in A. Zellner (editor), Applied Time Series Analysis of Economic Data.
Washington D.C.: Bureau of the Census.
García-Ferrer, A., J. del Hoyo, A. Novales and P. C. Young (1996). Recursive Identification,
Estimation and Forecasting of Nonstationary Economic Time Series with Applications to
GNP International Data in D.A. Berry, K.M. Chaloner and J.K. Geweke (editors),
Bayesian Analysis in Statistics and Econometrics: Essays in Honor of Arnold Zellner.
New York: John Wiley.
Girshick, M.A. and Haavelmo, T. (1947). Statistical Analysis of the Demand for Food: Examples
of Simultaneous Estimation of Structural Equations, Econometrica, 15, 79-110.
Grace, A. and MATLAB (1993). Optimization Toolbox. Natick, Mass.: The MathWorks Inc.
Greene, W.H. (1996). Econometric Analysis. New York: Macmillan Publishing Company.
Hamilton, J.D. (1994). Time Series Analysis. Princeton, N.J.: Princeton University Press.
Harvey, A.C. (1989). Forecasting, Structural Time Series Models and the Kalman Filter.
Cambridge: Cambridge University Press.
Harvey, A.C. and Koopman, S.J. (1992). Diagnostic Checking of Unobserved-Components Time
Series Models, Journal of Business and Economic Statistics, vol. 10, 4, 377-389.

&KDS  3DJ 

Harvey, A.C. and N. Shephard (1993). Structural Time Series Models, In G.S. Maddala, C.R.
Rao, and H.D. Vinod (editors), Handbook of Statistics, vol. 11. Amsterdam: North-Holland.
Jenkins, G.M. and A.S. Alavi (1981). Some Aspects of Modelling and Forecasting Multivariate
Time Series, Journal of Time Series Analysis, 2, 1, 1-47.
Johansen, S. (1988). Statistical Analysis of Cointegration Vectors, Journal of Economic
Dynamics and Control, 12, 231-254.
Johansen, S. (1991). Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian
Vector Autoregressive Models, Econometrica, 59, 1551-1580.
Kmenta, J. (1997). Elements of Econometrics. Ann Arbor: The University of Michigan Press.
Ljung, L. and P.E. Caines (1979). Asymptotic Normality of Prediction Error Estimators for
Approximate System Models, Stochastics, 3, 29-46.
MATLAB (1992). MATLAB: Reference Guide. Natick, Mass.: The MathWorks Inc.
MATLAB (1992). MATLAB: External Interface Guide. Natick, Mass.: The MathWorks Inc.
MATLAB (1996). MATLAB Compiler: User's Guide. Natick (Mass.): The MathWorks Inc.
McCullough, B.D. and H.D. Vinod (1999). The Numerical Reliability of Econometric Software,
Journal of Economic Literature, XXXVII, 633-665.
McLeod, G. (1982). Box Jenkins in Practice. Lancaster: Gwilym Jenkins & Partners Ltd.
Newbold, P., C. Agiakloglou and J. Miller (1994). Adventures with ARIMA software,
International Journal of Forecasting, 10, 573-581.
Pankratz, A. (1991). Forecasting with Dynamic Regression Models. New York: John Wiley &
Sons.
Sotoca, S. (1994). Aplicación del Filtro de Chandrasekhar a la Estimación por Máxima
Verosimilitud Exacta de Modelos Dinámicos, Estadística Española, 36, 136, 259-285.
Swamy, P.A.V.B. and G.S. Tavlas (1995). Random Coefficients Models: Theory and
Applications, Journal of Economic Surveys, 9, 2, 165-196.

&KDS  3DJ 

Terceiro, J. and P. Gómez (1985). Theoretical and Empirical Restrictions in Time Series
Analysis, Proceedings of the 5th World Congress of the Econometric Society, MIT,
Massachusetts.
Terceiro, J. (1990). Estimation of Dynamic Econometric Models with Errors in Variables. Berlin:
Springer-Verlag.
Terceiro, J. (1999). Comments on Kalman Filtering Methods for Computing Information Matrices
for Time-Invariant Periodic and Generally Time-Varying VARMA Models and Samples,
Computers & Mathematics with Applications (forthcoming).
Van Overschee, P. and B. De Moor (1996). Subspace Identification for Linear Systems: Theory,
Implementation, Applications. Dordrecht: Kluwer Academic Publishers.
Viberg, M. (1995). Subspace-based methods for the identification of linear time-invariant
systems, Automatica, 31, 12, 1835-1851.
Watson, M.W. and R.F. Engle (1983). Alternative Algorithms for the Estimation of Dynamic
Factor, MIMIC and Varying Coefficient Regression Models, Journal of Econometrics, 23,
3, 385-400.
Wells, C. (1996). The Kalman Filter in Finance. Dordrecht: Kluwer Academic Publishers.
White, H. (1982). Maximum Likelihood Estimation of Misspecified Models, Econometrica,
50,1,1-25.


Appendix A: Error and warning messages


Error messages
1. THETA and DIN do not fit
2. i inconsistent with THETA (out of range)
3. Incorrect number of arguments
4. Badly conditioned covariance matrix
5. Incorrect model specification
6. Only one series allowed
7. Should be more than 1 observation
8. Model %1d inconsistent
9. Endogenous variables model should be simple
10. Model not identified
11. Inconsistent input arguments
12. Inconsistent error model
13. User function should be passed as argument in user models
14. Incorrect model
15. Inconsistent system matrix dimension
16. Impossible to compute with missing data
17. Invalid number of lags
18. File not found: %s
19. The equation has no solution
20. SETE4OPT. Unrecognized option %s
21. SETE4OPT. Unrecognized value %s
22. SETE4OPT. Invalid value for %s
23. Run E4INIT before using E4
24. Use ARMA2THD for ARMA models
25. Initial conditions are meaningless
26. Non-stationary system. Initial conditions not compatible with Chandrasekhar
27. E4MIN. No decision variables; check second column of THETA
28. E4LNSRCH. THETA vector is meaningless
29. Multivariate time-varying parameters models are not supported
30. The sample size should be an integer multiple of the seasonal period
31. SETE4OPT. If vcond=De Jong, filter must be Kalman
32. Argument should be scalar
33. E4MIN. Objective function not found
34. For this type of model ARMA2THD or STR2THD should be used
35. Not enough data for using e4preest()

Warning messages
1. Should be one title per series
2. Invalid number of lags
3. Invalid %s option
4. PLOTSERS. A maximum of seven series can be represented in mode 2
5. RMEDSERS. Invalid group length
6. LFMODINI. Roots within the circle of radius 1
7. Approximate computation of information matrix
8. Information matrix sd+ o d-. Pseudo-inverse computed
9. E4MIN. Surpassed the maximum number of iterations
11. E4MIN. Hessian reinitialized
13. E4LNSRCH. Precision problem
14. CHOLP. Matrix not square
15. Kalman filter will be used
16. E4MIN. Analytic gradient function not found. Numeric approximation used


Appendix B: Structure of E4OPTION


This Appendix describes the structure of the internal vector, E4OPTION, created by the command
e4init, see Chapter 1. E4OPTION is a (1×51) numeric vector, which stores the general options that

control the behaviour of E4. The first ten positions, which can be modified using sete4opt, are
defined as follows:

1) E4OPTION(1,1) indicates the filter to be used in estimation. 1 = Kalman, 2 = Chandrasekhar.

2) E4OPTION(1,2) indicates if the matrices are to be scaled when computing their Cholesky
decomposition during filtering. 1 = Scale, 0 = Do not scale.

3) E4OPTION(1,3) indicates the algorithm for computing the initial state vector expectation.
1 = Maximum likelihood, 2 = initializes to zero, 3 = uses the first value of the exogenous
variables, 4 = uses the exogenous variables average.

4) E4OPTION(1,4) indicates the algorithm for computing the initial state vector covariance.
1 = Solution of the algebraic Lyapunov equation, 2 = zero, 4 = Inverse of De Jong, see De
Jong and Chu-Chun-Lin (1994).

5) E4OPTION(1,5) indicates whether to use the covariance matrices or their Cholesky factors
as parameters in model estimation. 1 = Covariance matrices, 2 = Cholesky factors.

6) E4OPTION(1,6) stores the optimization algorithm to use. 1 = BFGS, 2 = Newton.

7) E4OPTION(1,7) stores the maximum step length to be used by the optimizer.

8) E4OPTION(1,8) stores the stop criteria tolerance.

9) E4OPTION(1,9) stores the maximum number of iterations for the optimization algorithm.

10) E4OPTION(1,10) stores an option to display or omit output at each iteration of e4min.
1 = Yes, 0 = No.

The values stored in E4OPTION(1,11:51) are not user-modifiable through sete4opt, as they
store numeric tolerances for internal E4 functions.
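As an illustration, the options in the first ten positions are changed through sete4opt; the option name filter appears in the messages of Appendix A, while the remaining strings shown here are assumptions:

```matlab
% Sketch, assuming the E4 toolbox is on the MATLAB path. The option
% and value strings below are assumptions based on the list above.
e4init;                          % create E4OPTION with default values
sete4opt('filter', 'kalman');    % E4OPTION(1,1): use the Kalman filter
sete4opt('scale', 'yes');        % E4OPTION(1,2): scale in Cholesky steps
```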
