Assignment 5


Forecasting WTI Oil price using Amex Oil
Index in a Regression Model

ECON 0707 Economic Forecasting Assignment 5

11/25/2013

Kevin Elie Tanujaya (2011517204)
Li Richard Ruiqi (2011541762)
Yuanqi Gao (2011575878)


Introduction

This paper's main objective is to examine and forecast the crude oil price, using WTI (Texas light sweet
variant) as the benchmark oil price. This is done with a Regression model that uses the Amex Oil Index (XOI) as
the explanatory variable. The explanatory variable was chosen for three reasons. Firstly, stocks of firms
in oil-related industries represent the demand side of oil, and their performance will affect the demand for oil
and, in turn, the oil price. Thus the Amex Oil Index, a price-weighted index of 13 prominent firms in the oil
industry, should be a good proxy. Secondly, it is conjectured that the stock market may be more efficient in
absorbing real-time market information, so changes in the Amex Oil Index may be a leading indicator of changes
in oil prices. Lastly, this paper aims to replicate the results found in the research paper by Chen (2013)¹. It follows
that the choice of variables, time periods and regression approach will be identical to Chen's paper. The
independently obtained results will be compared with the results found in Chen's paper. This paper will also
employ tools for out-of-sample error analysis and visualization to gain a better understanding of how the
Regression model proposed in Chen's paper can outperform the conventional Random Walk model for h=1 and 3.
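The forecast variable $y_{t+h}$ is the percentage change of the WTI oil price between period t and t+h. Similarly,
$x_{t+h}$ is the percentage change of the XOI Index between period t and t+h. With $P_t$ denoting the WTI oil
price at time t, the equation is shown in (1):
$$y_{t+h} = \frac{P_{t+h} - P_t}{P_t} \qquad (1)$$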
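The Regression model and the point estimate for the oil price at t+h are thus:
$$y_{t+h} = \alpha + \beta x_t + \varepsilon_{t+h}, \qquad \hat{P}_{t+h,Reg} = P_t \left(1 + \hat{y}_{t+h}\right) \qquad (2)$$
where $P_t$ is the WTI oil price at time t and $\hat{y}_{t+h} = \hat{\alpha} + \hat{\beta} x_t$ is the fitted percentage change.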
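As a minimal sketch of definitions (1) and (2), the Python below computes h-period percentage changes (as fractions, not multiplied by 100) and converts a forecast percentage change back into a price level. The function names, the toy price list, and the coefficient values passed in are illustrative assumptions and are not taken from this paper.

```python
def pct_change(prices, h):
    """h-period percentage change (P[t+h] - P[t]) / P[t], as a fraction."""
    return [(prices[t + h] - prices[t]) / prices[t]
            for t in range(len(prices) - h)]


def point_forecast(p_t, alpha_hat, beta_hat, x_t):
    """Price forecast P_t * (1 + y_hat), with y_hat = alpha_hat + beta_hat * x_t."""
    y_hat = alpha_hat + beta_hat * x_t
    return p_t * (1.0 + y_hat)


# Toy inputs (hypothetical, for illustration only).
toy_prices = [20.0, 21.0, 19.95, 20.5]
changes = pct_change(toy_prices, 1)
forecast = point_forecast(p_t=100.0, alpha_hat=0.01, beta_hat=0.2, x_t=0.05)
```

With these toy inputs the fitted change is 0.01 + 0.2 * 0.05 = 0.02, so a current price of 100 maps to a forecast of 102.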
The forecast errors from time period R to T from the Regression model will then be compared with the forecast
errors obtained from the Random Walk (no-change) model. The forecast errors are defined in (3):
$$e_{t+h,Reg} = P_{t+h} - \hat{P}_{t+h,Reg}, \qquad e_{t+h,RW} = P_{t+h} - P_t \qquad (3)$$

¹ Chen, Shiu-Sheng (2013): Forecasting Crude Oil Price Movements with Oil-Sensitive Stocks.

The possibility of forecasting oil price changes from the previous period's XOI Index price change would be of
interest to market participants looking for arbitrage opportunities. Any findings may provide quick and
convenient trading rules, as the previous period's price change data for the XOI Index is readily available.
Performing a simple regression with past data on oil price changes is also very accessible. A positive price change
forecast for the oil price obtained from the Regression model could be interpreted as a signal to buy, and vice versa.
The discussion will begin by plotting the raw data series (WTI oil price and XOI Index) and presenting their summary
statistics. A recursive regression scheme will be performed on the data series for $t = R, R+1, \ldots, T-h$, where
h takes the values h = 1, 3, 6, 9, 12 (1 month, 3 months, 6 months, 9 months, and 1 year ahead), which is
consistent with Chen's paper. The in-sample fit for sample size $T-h$ will be summarized and analyzed. Time series
graphs and scatter plots of $e_{t+h,Reg}$ and $e_{t+h,RW}$ will be presented to visualize the forecast errors.
The RMSPE, Theil U statistic, and MSPE ratio of the out-of-sample forecast errors will also be calculated for a
numerical comparison of the performance of the two models. A heteroskedasticity-consistent t-test on the difference
between the two squared-error terms will be performed to confirm whether the difference is statistically significant.
Directional accuracy will be calculated to check whether the forecast is better than flipping a coin
(p > 0.5) direction-wise. In total, there will be P forecasts and P forecast errors, where $P = T - h - R + 1$.
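The recursive scheme described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the regressor is taken to be the h-period percentage change of the XOI Index ending at the forecast origin, the helper names are hypothetical, and the toy series are made up.

```python
def ols_fit(x, y):
    """Simple OLS of y on a constant and x; returns (alpha_hat, beta_hat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return my - beta * mx, beta


def recursive_forecast_errors(oil, xoi, h, R):
    """Expanding-window forecasts of oil[t+h] made at t = R, ..., T-h (1-based).

    Assumption: the regressor is the h-period percentage change of the XOI
    index ending at the forecast origin t.
    Returns two lists of price-level errors: (regression, random walk).
    """
    T = len(oil)
    e_reg, e_rw = [], []
    for t in range(R, T - h + 1):       # 1-based forecast origin
        i = t - 1                       # 0-based index of period t
        # Training pairs (x_s, y_{s+h}) whose target is observed by time t.
        xs, ys = [], []
        for s in range(h, i - h + 1):
            xs.append((xoi[s] - xoi[s - h]) / xoi[s - h])
            ys.append((oil[s + h] - oil[s]) / oil[s])
        alpha, beta = ols_fit(xs, ys)
        x_t = (xoi[i] - xoi[i - h]) / xoi[i - h]
        p_hat = oil[i] * (1.0 + alpha + beta * x_t)
        e_reg.append(oil[i + h] - p_hat)
        e_rw.append(oil[i + h] - oil[i])  # no-change forecast is oil[i]
    return e_reg, e_rw


# Toy series (hypothetical): 30 periods, h = 1, R = 10, so P = 30 - 1 - 10 + 1 = 20.
toy_oil = [20 + 0.5 * k + (k % 3) for k in range(30)]
toy_xoi = [100 + 2 * k + (k % 4) for k in range(30)]
e_reg, e_rw = recursive_forecast_errors(toy_oil, toy_xoi, h=1, R=10)
```

The estimation window grows by one observation at each forecast origin, so every forecast uses only data available at the time it is made.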
Data Collection and Analysis
The period selected is 1984:M10 to 2012:M8. Each raw data sample collected is the daily last price of
WTI Oil, obtained from the HKU Bloomberg Terminal, and of the XOI Index, obtained from Yahoo Finance (see Table 1).
The conversion to monthly observations is done by taking one daily observation for each month. The definition,
ticker, sample period and frequency of the data taken are summarized in the following table:

Definition       Ticker           Sample Period                   Frequency   Source
WTI Oil Price    USCRWTIC Index   1st Oct 1984 to 31st Aug 2012   Daily       Bloomberg Terminal
Amex Oil Index   ^XOI             1st Oct 1984 to 31st Aug 2012   Daily       Yahoo Finance²

Table 1 Summary of the definition, ticker, sample period and frequency of the data taken
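The daily-to-monthly conversion can be sketched as below. The text does not state which day of the month was kept, so this sketch assumes, purely for illustration, the last available trading day of each month; the function name and the sample prices are hypothetical.

```python
from datetime import date


def to_monthly_last(daily):
    """Keep the last available observation in each calendar month.

    `daily` is a list of (date, price) pairs in any order.
    Assumption: "last available day" is a choice made for this illustration.
    """
    monthly = {}
    for day, price in sorted(daily):
        monthly[(day.year, day.month)] = price  # later days overwrite earlier ones
    return monthly


# Made-up prices, not actual WTI data.
sample = [
    (date(1984, 10, 1), 29.5),
    (date(1984, 10, 31), 28.4),
    (date(1984, 11, 1), 28.3),
    (date(1984, 11, 30), 27.3),
]
monthly = to_monthly_last(sample)
```

Keying the result by (year, month) keeps the sketch independent of how many trading days each month contains.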

There are a total of 335 monthly observations (T = 335), and the first R observations (R = 76) are used to fit the
Regression model. The choice of R is made to be consistent with Chen's paper, which sets the initial estimation
window at 1984:M10 to 1990:M12. After performing the necessary transformation by taking the percentage
change in the oil price and the XOI Index, the variables of interest $y_t$ and $x_t$ are obtained. The time series
plots of $y_t$ and $x_t$ can be observed in Figure 1 and Figure 2, and their respective summary statistics in Table 2.

Figure 1- Time plot of yt or WTI Oil Price for time period 1984:M10 to 2012:M8

Figure 2- Time plot of xt or XOI Index Price for time period 1984:M10 to 2012:M8

² XOI Index data is only available up to 1988 in the Bloomberg Terminal.

          Mean    Median   Standard    Autocorrelations
                           deviation   Lag=1   Lag=2   Lag=3   Lag=4   Lag=5
WTI Oil   37.89   25.31    27.87       0.980   0.956   0.929   0.901   0.870
XOI       556.7   440.8    396.7       0.988   0.976   0.964   0.954   0.941

Table 2 Summary statistics and autocorrelation values
Three things are apparent from the time series plots. Firstly, there is a clear time trend: in 1984 the oil
price stays around $20, while towards the end of 2012 it fluctuates around $100. A similar time trend is also
apparent in the XOI Index price. Secondly, volatility appears to increase with the level of the oil price and
the XOI Index price. Thirdly, volatility appears to increase in 2007-2008, which can be attributed to the
financial turmoil during that period. The summary statistics are tabulated in Table 2 for reference.
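Statistics of the kind reported in Table 2 can be computed with a short routine such as the one below. It is a sketch only: the autocorrelation uses the common full-sample mean and variance estimator, which may differ slightly from the software used for Table 2, and the trending toy series is hypothetical.

```python
import statistics


def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag (full-sample mean and variance)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean)
                for t in range(lag, n))
    return numer / denom


def summarize(series, max_lag=5):
    """Mean, median, sample standard deviation and autocorrelations 1..max_lag."""
    return {
        "mean": statistics.mean(series),
        "median": statistics.median(series),
        "std": statistics.stdev(series),
        "acf": [autocorrelation(series, k) for k in range(1, max_lag + 1)],
    }


trending = [float(k) for k in range(1, 101)]  # toy upward-trending series
stats = summarize(trending)
```

A steadily trending series produces autocorrelations close to one that decay slowly with the lag, which is the pattern Table 2 shows for both price levels.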


In sample fit analysis
        Intercept ($\hat{\alpha}$)   Slope ($\hat{\beta}$)   Std. Error ($\hat{\beta}$)   p-value ($\hat{\beta}$)
h=1     0.0074                        0.155                   0.097                        0.110
h=3     0.0284                       -0.0377                  0.182                        0.836
h=6     0.0568                       -0.143                   0.251                        0.569
h=9     0.0851                       -0.239                   0.292                        0.414
h=12    0.121                        -0.248                   0.344                        0.471

Table 3 Summary of coefficient estimates with their corresponding p values

For the in-sample analysis, the largest available sample is used, i.e. the regression estimated at time
$t = T-h$. The coefficient estimates and p-values are summarized in Table 3. The null hypothesis that
$\beta = 0$, i.e. that a change in the XOI Index has no effect on the WTI Oil price, is tested. The p-values
suggest that the effect of the percentage change of the XOI Index on the percentage change of the WTI Oil price is
statistically insignificant, as the null hypothesis cannot be rejected at any conventional level (0.01, 0.05, and 0.1).
However, when h=1 the p-value of 0.110 only marginally fails to reject the null hypothesis at the 10% level, so some
predictability of the WTI Oil price at this horizon cannot be ruled out. Note that the intercepts are positive and
can be interpreted as the time trend of the oil price. When h=1, the intercept is 0.0074, which means that the
expected one-month increase in the WTI Oil price is 0.74% when the XOI Index is unchanged. Similar trends are seen
for the other values of h. The coefficient estimate is positive only for h=1, which agrees with the initial
conjecture that positive performance in oil-related firms can lead to an increase in demand for oil. For other
values of h, the coefficient estimates are negative, which does not follow the initial conjecture. However, as the
p-values are very large, the evidence suggests that the null hypothesis cannot be rejected.
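As a worked check of the p-values in Table 3, the sketch below recomputes each two-sided p-value from the slope estimate and its standard error. It reads the second and third numeric columns of Table 3 as the slope and its standard error, and it uses a normal approximation to the t distribution, which is an assumption that is reasonable here only because the sample has several hundred observations.

```python
import math


def two_sided_p_normal(estimate, std_error):
    """Two-sided p-value for H0: coefficient = 0, using a normal approximation."""
    t_stat = estimate / std_error
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


# (slope estimate, standard error) per horizon h, copied from Table 3.
table3 = {
    1: (0.155, 0.097),
    3: (-0.0377, 0.182),
    6: (-0.143, 0.251),
    9: (-0.239, 0.292),
    12: (-0.248, 0.344),
}
p_values = {h: two_sided_p_normal(b, se) for h, (b, se) in table3.items()}
```

For h=1 the ratio 0.155 / 0.097 is roughly 1.6, which gives a p-value of about 0.11, in line with the tabulated value and just above the 10% threshold.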
An important thing to note is that the coefficient values obtained here differ substantially from the results in
Chen's paper, where the coefficient estimate is $\hat{\beta} = 0.43$ with a p-value close to zero.


Out of sample forecast accuracy comparison

Figure 3 Time series plots of forecast errors under Regression based and Random Walk model for h=1,3,6,9,12


Figure 4- Scatter plot diagram of forecast errors under Regression based and Random Walk model for h=1,3,6,9,12

The forecast errors are visualized by plotting time series graphs for the P observations. From Figure 3, it
appears that when h=1 and 3, there is only a marginal difference between the Regression and Random
Walk models. When h=3, 6, or 12, the red line tends to be below the blue line. This suggests that the Regression
model tends to produce forecasts that are higher than those produced by the Random Walk model. Thus, its
forecast error will be smaller in magnitude when the forecast errors are positive, and larger in magnitude when
they are negative. The variance of the nominal forecast errors tends to increase with time. This is a potential
drawback of using nominal forecast errors in the calculation as opposed to percentage forecast errors.
In Figure 4, scatter plots are used to visualize how the Regression model performs for negative and positive
values of the Random Walk forecast errors. When h=1, the Regression model appears to outperform the
Random Walk model, especially for larger error values. This is seen in the points that lie above the 45°
line (y=x) in the first quadrant but below the 45° line in the third quadrant. This finding is more
apparent for extreme error values; for smaller error values, the two forecast errors are relatively similar. For
values of h other than 1, the points tend to lie above the 45° line. The forecast error of the Regression
model is thus lower than that of the Random Walk model for positive forecast errors, but larger in magnitude
for negative forecast errors.
RMSPE, Theil U ratio, and MSPE ratio statistics of models for each value of h
                      h=1     h=3     h=6     h=12    h=24
RMSPE(Reg)            4.64    10.22   14.95   17.27   18.62
RMSPE(RW)             4.94    10.03   14.90   16.64   17.74
Theil U               0.938   1.019   1.003   1.038   1.050
MSPE Ratio            0.879   1.038   1.007   1.077   1.102
MSPE Ratio (Chen)*    0.78    0.95    1.02    1.04    1.04

Table 4 Summary of RMSPE, Theil U and MSPE ratio values for Regression and Random Walk model comparison


In order to evaluate the out-of-sample performance of the forecasts, three statistics, the RMSPE, the Theil U ratio,
and the MSPE ratio, are calculated from the forecast errors of the Random Walk and Regression models. The Theil U
statistic is defined in (4):
$$U = \frac{RMSPE(Reg)}{RMSPE(RW)} \qquad (4)$$
Note that the MSPE ratio is simply the square of the Theil U statistic ($U^2 = \text{MSPE ratio}$). If the aim is to
minimize squared loss, the model with the lower RMSPE is preferred. From Table 4, both the Theil U and MSPE ratio
statistics suggest that the Regression model outperforms the Random Walk model for h=1. However, for other values
of h, the Random Walk model performs as well as or better than the Regression model. The results from
Chen's paper are included in Table 4 for comparison. Note that the independent analysis produces an MSPE ratio
that is higher than Chen's result at every horizon except h=6. However, the conclusion is similar enough for
the evidence to stand for h=1: at this horizon, the Regression model can outperform the Random Walk model.
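The relationship between the three statistics can be illustrated with a few lines of Python. The function names are hypothetical, and the cross-check uses only the rounded h=1 RMSPE values printed in Table 4, so small rounding differences from the tabulated ratios are expected.

```python
import math


def rmspe(errors):
    """Root mean squared prediction error of a list of forecast errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def theil_u(errors_reg, errors_rw):
    """RMSPE(Regression) / RMSPE(Random Walk); below 1 favours the regression."""
    return rmspe(errors_reg) / rmspe(errors_rw)


# Cross-check with the rounded h=1 RMSPE values reported in Table 4.
u_h1 = 4.64 / 4.94
mspe_ratio_h1 = u_h1 ** 2

# Toy error lists (hypothetical): halving every error halves the RMSPE.
u_toy = theil_u([3.0, 4.0], [6.0, 8.0])
```

Both ratios carry the same information; the MSPE ratio simply magnifies departures from one because it is the square of Theil U.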
In order to determine whether the difference in squared forecast errors behind Table 4 is statistically
significant, a heteroskedasticity-consistent test of the constant term is conducted. The results are
summarized in Table 5.
Constant Term Test Results
The variable $d_{t+h}$ is defined as the difference between the squared errors of the two forecast models for a
total of P forecast errors, as seen in (5):
$$d_{t+h} = e_{t+h,RW}^2 - e_{t+h,Reg}^2 = c + u_{t+h} \qquad (5)$$
The null hypothesis is $c \le 0$, i.e. that the Random Walk model performs at least as well as the Regression model.
Table 5 summarizes the coefficient estimates as well as the p-values for all values of h. For h=1, the
estimate is positive. Since the p-value is below 0.001, the null hypothesis is rejected at all conventional levels
of significance. This evidence supports the finding that the Regression model can outperform the Random
Walk model, as the difference between the squared forecast errors is significantly greater than zero. For other
values of h, the null hypothesis cannot be rejected, and the coefficient estimate is less than zero. This suggests
that the Random Walk model still prevails for h > 1.
        Estimate   t-value   p-value
h=1       2.945     4.236     0.000
h=3      -3.821    -0.936     0.825
h=6      -1.49     -0.176     0.570
h=12    -21.44     -1.05      0.852
h=24    -32.09     -1.143     0.873
Table 5 Summary of t-test coefficient estimates and their respective p-values for h=1,3,6,12,24
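A constant-term test of this kind can be sketched as follows. The sketch assumes the difference is defined as the squared Random Walk error minus the squared Regression error, uses the White (HC0) variance estimator, and approximates the one-sided p-value with the normal distribution; the function name and toy errors are hypothetical, and the exact estimator used for Table 5 is not stated in the text.

```python
import math


def constant_term_test(e_rw, e_reg):
    """Regress d = e_rw^2 - e_reg^2 on a constant, with a White (HC0) std. error.

    Returns (estimate, t_stat, one_sided_p) for H0: c <= 0 against H1: c > 0,
    using a normal approximation for the p-value.
    """
    d = [a * a - b * b for a, b in zip(e_rw, e_reg)]
    n = len(d)
    c_hat = sum(d) / n  # OLS on a constant is the sample mean
    # HC0 variance for a constant-only regression: sum(resid^2) / n^2.
    se = math.sqrt(sum((v - c_hat) ** 2 for v in d)) / n
    t_stat = c_hat / se
    p_upper = 0.5 * math.erfc(t_stat / math.sqrt(2.0))  # P(Z > t_stat)
    return c_hat, t_stat, p_upper


# Toy forecast errors (hypothetical, not from the paper).
toy_rw = [2.0, -3.0, 1.0, 4.0]
toy_reg = [1.0, -2.0, 2.0, 3.0]
c_hat, t_stat, p_upper = constant_term_test(toy_rw, toy_reg)
```

A positive estimate with a small upper-tail p-value indicates that the Random Walk model's squared errors are larger on average, which is the pattern Table 5 reports only for h=1.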
Table 6 summarizes the correlation and directional accuracy of the Regression model forecasts, as well as the
success ratios found by Chen. If the correlation is positive, the Regression model can be used for prediction; if
it is negative, the prediction is not to be trusted. From Table 6, the correlation values are all
negative except when h=1, where the correlation is satisfactory at 0.613. The directional accuracy at h=1
is 0.665, which is comparable to the result of 0.63 in Chen's paper. Since this is higher than 0.5 (better than a
coin flip), it suggests that the sign of the forecast estimate can be used as a buy or sell signal for WTI oil.
                                   h=1     h=3      h=6      h=9      h=12
Reg Model   Correlation            0.613   -0.161   -0.016   -0.195   -0.187
            Directional Accuracy   0.665    0.581    0.600    0.607    0.614
Success ratio Chen*                0.63     0.57     0.54     0.53     0.53

Table 6 Summary of correlation and directional accuracy of the actual change in WTI oil price against the forecast change under the
Regression model
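Directional accuracy and correlation can be computed as in the sketch below. It assumes directional accuracy is the share of periods in which the forecast change and the actual change have the same sign, with a zero change counted as a miss; the function names and toy values are hypothetical.

```python
import math


def directional_accuracy(forecast_changes, actual_changes):
    """Share of periods in which forecast and actual changes have the same sign."""
    hits = sum(1 for f, a in zip(forecast_changes, actual_changes) if f * a > 0)
    return hits / len(actual_changes)


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


forecast = [0.02, -0.01, 0.03, -0.02, 0.01]  # toy values, not from the paper
actual = [0.05, -0.03, -0.01, -0.04, 0.02]
da = directional_accuracy(forecast, actual)
rho = correlation(forecast, actual)
```

In the toy data four of the five signs agree, giving a directional accuracy of 0.8; values above 0.5 are what the coin-flip comparison in the text refers to.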
Conclusion
In conclusion, the predictability of WTI Oil prices using the Regression model with the XOI Index is promising
when h=1, but not for h=3, 6, 9, and 12. The in-sample fit at h=1 is borderline, with a p-value of 0.11 that
narrowly misses significance at the 10% level. The oil price appears to have a positive one-month time trend, as
seen from the intercept value. Visualization of the out-of-sample forecast errors shows that the Regression model
offers a marginal improvement over the Random Walk model, especially for extreme values of the forecast error. The
RMSPE, Theil U, and MSPE ratio statistics also suggest that the Regression model outperforms the Random
Walk model at h=1. The statistical test establishes that the difference in squared errors between the Random Walk
and Regression models is significantly greater than zero at this horizon. The correlation of the Regression
model's forecast estimate with the actual change is positive, which supports the estimate. The directional accuracy
of 0.665 is well above 0.5, which suggests that the sign of the forecast estimate can be used as a signal for
buying or selling WTI oil.


Appendix

Figure 5- Comparison of $1 invested in Oil using buy and hold strategy and using Regression model forecast estimate as signal (On/Off)

Figure 6- Comparison of $1 invested in Oil using buy and hold strategy and using Regression model forecast estimate as signal (Long/Short)
For additional research purposes, this paper tried to take advantage of the favorable directional accuracy by
using the forecast as a trading signal. Two variations are derived. The first is to buy and hold Oil for one
month when the forecast estimate is positive, or to hold cash (return = 0) for one month when the forecast
estimate is negative. The second is to buy and hold Oil for one month when the forecast estimate is positive, or
to short Oil for one month when the forecast estimate is negative. Figure 5 and Figure 6 show the value of $1
invested in each strategy compared with buying Oil at the beginning and holding it for the entire period. It
appears that the first strategy does not produce higher returns than simply buying and holding for the entire
period, although it does provide a lower variance of portfolio return. This may be because the funds are kept as
cash in some months, which means return = 0 and reduces the variance. The second strategy, shorting Oil when the
forecast estimate is negative, produced a much lower return than simply buying and holding Oil. This may be due
to the upward time trend of Oil over the period. Thus, shorting Oil will be a negative expected return strategy
and will be detrimental to investors.
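The two signal strategies and the buy-and-hold benchmark can be sketched as below. The sketch assumes no transaction costs, zero interest on cash, a short position that earns exactly minus the oil return, and a forecast of exactly zero treated like a negative one; all names and numbers are hypothetical and do not reproduce Figures 5 and 6.

```python
def cumulative_value(returns):
    """Value of $1 compounded over a list of simple one-period returns."""
    value, path = 1.0, []
    for r in returns:
        value *= 1.0 + r
        path.append(value)
    return path


def strategy_returns(forecasts, actual_returns, short_when_negative):
    """Per-period returns for the two signal strategies.

    Positive forecast: hold oil and earn the actual return.
    Otherwise: hold cash (return 0) or short oil (earn minus the actual return).
    """
    out = []
    for f, r in zip(forecasts, actual_returns):
        if f > 0:
            out.append(r)
        else:
            out.append(-r if short_when_negative else 0.0)
    return out


forecasts = [0.01, -0.02, 0.03]  # toy signals, not from the paper
actual = [0.10, 0.05, -0.10]     # toy monthly oil returns
buy_hold = cumulative_value(actual)
on_off = cumulative_value(strategy_returns(forecasts, actual, False))
long_short = cumulative_value(strategy_returns(forecasts, actual, True))
```

In the toy data the one wrong negative signal costs the long/short variant more than the on/off variant, because a short position loses money in a rising month while cash merely misses the gain.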
