
Time series analysis of gold production in Malaysia

Nora Muda and Lee Yuen Hoon


Citation: AIP Conf. Proc. 1450, 125 (2012); doi: 10.1063/1.4724127
View online: http://dx.doi.org/10.1063/1.4724127
View Table of Contents: http://proceedings.aip.org/dbt/dbt.jsp?KEY=APCPCS&Volume=1450&Issue=1
Published by the American Institute of Physics.

Downloaded 08 Jun 2012 to 202.185.32.2. Redistribution subject to AIP license or copyright; see http://proceedings.aip.org/about/rights_permissions

Time Series Analysis of Gold Production in Malaysia


Nora Muda and Lee Yuen Hoon
School of Mathematical Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia,
43600 Bangi, Selangor, Malaysia.
Abstract. Gold is a soft, malleable, bright yellow metallic element that is unaffected by air and most reagents. It is highly
valued as an asset or investment commodity and is extensively used in jewellery, industrial applications, dentistry and medicine.
In Malaysia, gold mining is limited to several areas such as Pahang, Kelantan, Terengganu, Johor and Sarawak.
The main purpose of this case study is to obtain a suitable model for the production of gold in Malaysia. The model can also
be used to predict Malaysia's future gold production. The Box-Jenkins time series method was used to perform the
time series analysis in four steps: identification, estimation, diagnostic checking and forecasting. In addition,
the accuracy of prediction was tested using the mean absolute percentage error (MAPE). From the analysis, the ARIMA (3,1,1)
model was found to be the best fitted model, with a MAPE of 3.704%, indicating that the prediction is very accurate. Hence,
this model can be used for forecasting. This study is expected to help the private and public sectors understand the gold
production scenario and later plan gold mining activities in Malaysia.
Keywords: Gold Production; Box-Jenkins time series method
PACS: 02.70.Rr, 05.45.Tp

INTRODUCTION
Gold is a yellow metallic element that is easily malleable and is not affected by air or by a variety of corrosive
agents. It occurs in its native form or in combination with other elements such as silver. It is suitable for
use as a capital asset or investment, and it is also used in the fields of gold manufacturing, industrial
manufacturing, medicine and dentistry.
In Malaysia, gold mining is limited to several areas, namely the states of Pahang, Kelantan, Terengganu
and Johore, for example Tenggaroh (Johore), Tarom River (Terengganu), and Kincir River and Batu Hitam (Pahang).
Penjom, in Kuala Lipis, Pahang, is the largest gold mining centre in Malaysia and produces almost 97% of the
country's gold. Based on geochemical surveys conducted by the Minerals and Geoscience Department Malaysia,
the Tingkayu River in Kunak, Sabah is also expected to have the potential to become a new
gold mining site (Malaysian Minerals Yearbook [1]).
Based on Figure 1, Malaysia's total gold production was recorded as 3,965 kg in 2001 and increased
to 4,739 kg in 2003. Production then decreased to 4,231 kg in 2004. In 2005 it rose to 4,250 kg, but it
declined continuously from 2006 to 2008. In 2009, gold production increased to 2,794 kg, and it was predicted
to increase further in the future. Further analysis is needed to investigate whether gold production will
indeed increase. Therefore, time series modelling has been carried out to produce a forecast. Several studies
have been done on gold production and gold prices, such as the study by Rockerbie [2], who examined gold
production and gold prices in South Africa. A study of gold production in Malaysia is therefore needed
to give an overview of the gold mining scenario in Malaysia.

FIGURE 1. Total Gold Production in Malaysia from Year 2001 to Year 2009
[Bar chart. Horizontal axis: year. Vertical axis: total gold production (kg). Bar values: 2001: 3965; 2002: 4289;
2003: 4739; 2004: 4231; 2005: 4250; 2006: 3497; 2007: 2913; 2008: 2490; 2009: 2794.]

Based on the studies by Shahabi et al. [3], Nadeem et al. [4], Rachana Wankhade et al. [5] and Rahman [6],
a time series model can be adapted to analyse gold production in Malaysia and, furthermore, to forecast it.
Research on the global gold market and gold prices by Shafiee

The 5th International Conference on Research and Education in Mathematics


AIP Conf. Proc. 1450, 125-131 (2012); doi: 10.1063/1.4724127
2012 American Institute of Physics 978-0-7354-1049-7/$30.00


and Topal [7] also used time series analysis, and then predicted the effect of the world oil price and the world gold
price using an econometric model. Selvanathan and Selvanathan [8] also studied the effect of gold prices on
gold production, using time series analysis.
The objective of this study is to obtain an appropriate time series model for forecasting gold
production in Malaysia. Monthly data on Malaysia's gold production from 2001 to 2008 were used to
build the time series model, and the model was validated with the data for 2009.

DATA

Monthly data on Malaysia's gold production from 2001 to 2009 were used in the analysis. The
data were obtained from the Malaysian Mining Industry Book, published by the Department of Minerals and
Geoscience Malaysia (JMG).

METHODOLOGY
The Box-Jenkins time series model was used to perform the time series analysis. Four steps of the
Box-Jenkins approach were followed: identification of the model, estimation of the model parameters,
diagnostic checking to determine the suitability of the selected model, and prediction of the
time series values.

The Box-Jenkins Time Series Model

In the Box-Jenkins model, the data should be stationary. Stationarity can be investigated from the
correlogram, that is, the autocorrelation function (ACF) graph or the partial autocorrelation function
(PACF) graph, and by performing a unit root test, the Augmented Dickey-Fuller (ADF) test. The Box-Jenkins
model is identified by observing the ACF or PACF graph. After the parameters of the identified model have
been estimated, the suitability of the Box-Jenkins model is investigated. Residual analysis is
performed to check the suitability of the model; for the ARIMA (p,d,q) model the residuals are

$$\hat{\varepsilon}_t = y_t - \Big( \hat{\delta} + \sum_{i=1}^{p} \hat{\phi}_i \, y_{t-i} - \sum_{i=1}^{q} \hat{\theta}_i \, \hat{\varepsilon}_{t-i} \Big)$$

If the specified model is adequate, and hence the appropriate orders p and q have been identified, it should
transform the observations into a white noise process. Thus, the residuals should behave like white noise.
The adequacy of the model is then tested with the Ljung-Box goodness-of-fit test. The test statistic,
which is approximately chi-square distributed with K - p degrees of freedom, is

$$Q_{LB} = T (T + 2) \sum_{k=1}^{K} \frac{r_k^2}{T - k}$$

where T is the number of observations and r_k is the lag-k autocorrelation of the residuals.
The best fitted model is then used for forecasting.

Accuracy Test
The accuracy of prediction or forecasting can be tested with the mean absolute percentage error (MAPE).
According to Lewis [9], the level of accuracy for the MAPE test is divided into four stages, as shown in
Table 1. Each level of accuracy reflects the percentage error of the predicted values compared to the
original time series values.

TABLE 1. Level of accuracy for MAPE test

MAPE value             Level of accuracy
MAPE ≤ 10%             Very accurate
10% < MAPE ≤ 20%       Accurate
20% < MAPE ≤ 50%       Medium
MAPE > 50%             Less accurate
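The Ljung-Box statistic defined in this section can be computed directly from a residual series. The following is a minimal sketch in plain Python, not the routine used by the authors' software; the function names `autocorr` and `ljung_box_q` and the example series are hypothetical, and the comparison value 15.507 is the tabulated chi-square value that the paper lists for lag 12.

```python
def autocorr(x, k):
    """Lag-k sample autocorrelation r_k of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom


def ljung_box_q(residuals, max_lag):
    """Q_LB = T (T + 2) * sum_{k=1..K} r_k^2 / (T - k)."""
    T = len(residuals)
    return T * (T + 2) * sum(
        autocorr(residuals, k) ** 2 / (T - k) for k in range(1, max_lag + 1)
    )


# Illustrative series only (not the paper's residuals): a strictly
# alternating sequence is strongly autocorrelated, so Q is large.
alternating = [1.0, -1.0] * 20
q_value = ljung_box_q(alternating, 1)
print(f"Q_LB at lag 1 = {q_value:.2f}")
```

A value of Q below the tabulated chi-square value is read as no evidence against the white noise assumption; a large Q, as in this artificial example, points to remaining autocorrelation in the residuals.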

RESULTS
The Box-Jenkins time series analysis was carried out using the EViews 7, Minitab and R software packages.
The correlogram and the Augmented Dickey-Fuller (ADF) test were examined to check the stationarity of the
time series data, to make sure that the assumption of stationarity is fulfilled.
The time series plot of gold production in Malaysia from 2001 to 2008 is depicted in Figure 2. Based on
Figure 2, total gold production in Malaysia started to decline after 2003. The series clearly shows a
non-seasonal pattern and significant changes in the volatility of gold production. Therefore, the time
series data do not meet the assumption of stationarity. This is confirmed by the ACF graph
in Figure 3, in which the sample autocorrelations decrease slowly towards zero. Thus, a transformation
is needed to make sure the data meet the assumption of stationarity before further
analysis can be done.
A simple transformation, the first difference of gold production, was computed using the equation

$$z_t = y_t - y_{t-1}$$

FIGURE 2. Time series plot of gold production in Malaysia from 2001 to 2008
[Axes: total gold production against time.]

FIGURE 3. ACF graph for gold production data
[Axes: ACF against lag.]

FIGURE 4. Time series plot of gold production in Malaysia from 2001 to 2008 after the first differencing
[Axes: total gold production against time.]

FIGURE 5. ACF graph for gold production data after the first differencing
[Axes: ACF against lag.]

where $z_t$ is the time series after first differencing and $y_t$ is the original gold production value, with
$t = 2, \ldots, 96$. The transformed data were then plotted, and Figure 4 and Figure 5 show that the transformed
data behave as a stationary series, with mean and variance constant across time. A similar result is found in
the PACF graph depicted in Figure 6, in which the partial autocorrelations cut off quickly after lag 1.
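The effect of first differencing on the correlogram can be illustrated with a small sketch. The helper names `first_difference` and `sample_acf` are hypothetical, and the series below is made up purely for illustration (it is not the Malaysian production data); the point is only that a trending series has slowly decaying autocorrelations, while its first difference does not.

```python
def first_difference(y):
    """z_t = y_t - y_{t-1}; the result is one element shorter than y."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]


def sample_acf(x, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


# Illustrative trending series (an assumption, not data from the paper).
y = [float(10 * t + (t % 3)) for t in range(1, 31)]
z = first_difference(y)

print("ACF of y, lags 1-3:", [round(r, 2) for r in sample_acf(y, 3)])
print("ACF of z, lags 1-3:", [round(r, 2) for r in sample_acf(z, 3)])
```

With 96 monthly observations, as in the paper, the differenced series would contain 95 values, which corresponds to the index range t = 2, ..., 96.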
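In addition, the assumption of stationarity can be checked with a unit root test, the Augmented Dickey-Fuller
(ADF) test, for which the hypotheses are

$$H_0: \gamma = 0 \ (y_t \text{ not stationary}) \quad \text{vs} \quad H_1: \gamma < 0 \ (y_t \text{ stationary})$$

where $\gamma$ denotes the coefficient of $y_{t-1}$ in the ADF regression.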
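To make the unit root hypotheses concrete, the sketch below fits the simplest Dickey-Fuller regression, dy_t = alpha + gamma * y_{t-1} + e_t, by ordinary least squares and returns the t-statistic of gamma. This is a simplification and an assumption: it omits the lagged difference terms that the augmented test (and the EViews output in the paper) would include, the function name `dickey_fuller_t` is hypothetical, and the comparison value is the second critical value printed in the paper's ADF tables, taken here as negative and assumed to be the 5% level.

```python
import math
import random


def dickey_fuller_t(y):
    """t-statistic of gamma in dy_t = alpha + gamma * y_{t-1} + e_t (no lag terms)."""
    x = y[:-1]                                   # y_{t-1}
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    n = len(x)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    sxy = sum((x[i] - mean_x) * (dy[i] - mean_dy) for i in range(n))
    gamma = sxy / sxx
    alpha = mean_dy - gamma * mean_x
    rss = sum((dy[i] - alpha - gamma * x[i]) ** 2 for i in range(n))
    std_error = math.sqrt(rss / (n - 2) / sxx)
    return gamma / std_error


# Illustrative stationary series: 96 draws of Gaussian white noise.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(96)]

CRITICAL_VALUE = -2.893589  # from the paper's ADF tables; sign and 5% level assumed
stat = dickey_fuller_t(noise)
print("reject unit root" if stat < CRITICAL_VALUE else "fail to reject unit root")
```

A statistic well below the critical value, as white noise produces here, leads to rejecting the null hypothesis of a unit root; a statistic close to zero, such as the -0.017485 reported for the undifferenced data, does not.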
Table 2 and Table 3 show the EViews outputs for the ADF test.
From Table 2, the t-value is -0.017485, which is greater than the critical values; therefore the null
hypothesis cannot be rejected, and it is concluded that the data are not stationary. In Table 3, the data
are found to be stationary after first differencing, as the t-value of -7.336278 is smaller than the
critical values; the null hypothesis can therefore be rejected, and it is concluded that the differenced
series meets the assumption of stationarity.

TABLE 2. ADF test results before transformation

Variable                  t-statistics value   p-value   Lag
Gold production, y_t      -0.017485            0.9540
Critical values           -3.503879
                          -2.893589
                          -2.583931

TABLE 3. ADF test results at first differencing

Variable                  t-statistics value   p-value   Lag
First-differenced y_t     -7.336278            0.0000
Critical values           -3.503879
                          -2.893589
                          -2.583931

Then, to identify the Box-Jenkins model, the ACF and PACF graphs were examined. Based on Figure 5
and Figure 6, the Box-Jenkins model for gold production consists of a moving average component MA (1),
an autoregressive component AR (2) and integration of order one.
Therefore, the Box-Jenkins model for total gold production was identified as ARIMA (2, 1, 1). The
ARIMA (2,1,1) model can be written as

$$z_t = \delta + \phi_1 z_{t-1} + \phi_2 z_{t-2} + a_t - \theta_1 a_{t-1}$$

Nevertheless, further analysis needs to be done on this model. The absolute value of

$$\frac{\bar{z}}{s_z / \sqrt{n - b + 1}} = 0.4423$$

is smaller than 2, so the constant $\delta$ in the ARIMA (2,1,1) model can be dropped.

Parameter estimation in ARIMA(2, 1, 1)

Table 4 shows the results of parameter estimation for ARIMA (2, 1, 1). Tests were carried out to determine
whether the parameters $\phi_1$, $\phi_2$ and $\theta_1$ need to be included in the model. The hypotheses
for these tests are as follows:

$$H_0: \phi_1 = 0 \quad \text{vs} \quad H_1: \phi_1 \neq 0$$
$$H_0: \phi_2 = 0 \quad \text{vs} \quad H_1: \phi_2 \neq 0$$
$$H_0: \theta_1 = 0 \quad \text{vs} \quad H_1: \theta_1 \neq 0$$

From Table 4, the t-values of parameters $\phi_1$ and $\phi_2$ are small and their p-values, 0.994 and
0.281 respectively, are greater than $\alpha = 0.05$. Therefore, we fail to reject $H_0$ and conclude that
$\phi_1$ and $\phi_2$ should be dropped from the model. In contrast, the t-value of $\theta_1$ is large and
its p-value is smaller than $\alpha = 0.05$, so $H_0$ can be rejected. Thus, $\theta_1$ needs to be
included in the model.
Based on the PACF graph in Figure 6, an autoregressive component AR(p) is present in the ARIMA model.
Therefore, the autoregressive parameters need to be estimated again, even though $\phi_1$ and $\phi_2$
can be dropped from ARIMA (2,1,1).
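The significance tests described here reduce to dividing each estimate by its standard deviation and comparing the resulting p-value with 0.05. The sketch below reproduces that arithmetic for the ARIMA (2,1,1) estimates reported in Table 4. The function names are hypothetical, and the p-value uses a standard normal approximation, which is an assumption: the authors' software presumably used a different reference distribution, so its p-values need not match these exactly.

```python
import math


def t_ratio(estimate, std_dev):
    """Test statistic for H0: parameter = 0."""
    return estimate / std_dev


def two_sided_p_normal(t):
    """Two-sided p-value under a standard normal approximation (assumption)."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# (estimate, standard deviation) as reported for ARIMA (2,1,1) in Table 4.
table4 = {
    "AR(1)": (0.0135, 0.1929),
    "AR(2)": (-0.1521, 0.1402),
    "MA(1)": (0.5955, 0.1765),
}

ALPHA = 0.05
for name, (estimate, std_dev) in table4.items():
    t = t_ratio(estimate, std_dev)
    p = two_sided_p_normal(t)
    decision = "keep" if p < ALPHA else "drop"
    print(f"{name}: t = {t:.2f}, approx. p = {p:.3f} -> {decision}")
```

The t-ratios agree with the t-values in Table 4 to two decimals, and the keep or drop decisions match the ones reached in the text: only the MA (1) parameter is retained.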

Parameter estimation in ARIMA (1, 1, 1)

The estimation is repeated with the autoregressive component AR (1) and the moving average component MA (1)
in the ARIMA model. The constant is dropped from the model as it was not significant. The ARIMA (1,1,1)
model is written as

$$z_t = \phi_1 z_{t-1} + a_t - \theta_1 a_{t-1}$$

Tests are carried out to determine whether the parameters $\phi_1$ and $\theta_1$ should be included in the
model. The hypotheses for these tests are:

$$H_0: \phi_1 = 0 \quad \text{vs} \quad H_1: \phi_1 \neq 0$$
$$H_0: \theta_1 = 0 \quad \text{vs} \quad H_1: \theta_1 \neq 0$$

FIGURE 6. PACF graph for gold production data after the first differencing
[Axes: partial ACF against lag.]

TABLE 4. Parameter estimation in ARIMA(2, 1, 1)

Type of model   Parameter   Standard deviation   t-value   p-value
AR(1)            0.0135     0.1929                0.07     0.994
AR(2)           -0.1521     0.1402               -1.08     0.281
MA(1)            0.5955     0.1765                3.37     0.001

TABLE 5. Parameter estimation of ARIMA(1, 1, 1)

Type of model   Parameter   Standard deviation   t-value   p-value
AR(1)            0.1249     0.1594                0.78     0.435
MA(1)            0.7145     0.1136                6.29     0.000

Based on Table 5, the t-value for parameter $\phi_1$ is smaller than the critical value and its p-value,
0.435, is greater than $\alpha = 0.05$; therefore we fail to reject $H_0$, and $\phi_1$ should be dropped
from the model. In contrast, the t-value of $\theta_1$ is large and its p-value is smaller than
$\alpha = 0.05$, so $H_0$ is rejected and $\theta_1$ is included in the model. Again, Figure 6 indicates
that an autoregressive component is present in the ARIMA model, so the estimation and testing need to be
repeated.

Parameter estimation in ARIMA (3,1,1)

An autoregressive model of order three, AR (3), was selected and included in the ARIMA model. The
ARIMA (3,1,1) model is written as

$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \phi_3 z_{t-3} + a_t - \theta_1 a_{t-1}$$

The parameters $\phi_1$, $\phi_2$, $\phi_3$ and $\theta_1$ need to be estimated and tested. The hypotheses
involved are:

$$H_0: \phi_1 = 0 \quad \text{vs} \quad H_1: \phi_1 \neq 0$$
$$H_0: \phi_2 = 0 \quad \text{vs} \quad H_1: \phi_2 \neq 0$$
$$H_0: \phi_3 = 0 \quad \text{vs} \quad H_1: \phi_3 \neq 0$$
$$H_0: \theta_1 = 0 \quad \text{vs} \quad H_1: \theta_1 \neq 0$$

From Table 6, the parameters $\phi_1$, $\phi_2$ and $\phi_3$ have p-values smaller than $\alpha = 0.05$;
therefore $H_0$ is rejected and $\phi_1$, $\phi_2$ and $\phi_3$ are included in the model. Parameter
$\theta_1$ also has a p-value smaller than 0.05, so $H_0$ is rejected for it as well. Thus, the time series
of gold production in Malaysia is best fitted by ARIMA (3, 1, 1).
The ARIMA (3, 1, 1) model for the gold production data in Malaysia can be written as

$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \phi_3 z_{t-3} + a_t - \theta_1 a_{t-1}$$
$$z_t = 0.991 a_{t-1} - 1.441 z_{t-1} - 0.741 z_{t-2} - 0.302 z_{t-3} + a_t$$

Recall that an appropriate model must comply with the assumptions of stationarity, uncorrelated random
errors, and residuals that are normally distributed with mean zero and constant variance. Therefore,
diagnostic checking of the fitted model should be carried out. The residual plot and the residual partial
autocorrelation function (RPACF) graph in Figure 7 show that all the values lie within the interval limits.
This shows that the residuals follow a white noise model, i.e. the random errors are not correlated.
Therefore, the assumption that the errors are randomly distributed is met.

FIGURE 7. Residual plot and RPACF graph
[Panels: residuals against time; partial ACF against lag.]

Then, the assumption of normality is checked with the normal probability plot of the residuals shown in
Figure 8. Based on Figure 8, the residual points lie along a straight line, which means that the residuals
in the model are normally distributed.
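The fitted ARIMA (3,1,1) equation reported above can be read as a one-step-ahead predictor: the next difference is computed from the last three differences and the last residual, and the next level is the last observation plus that difference. The sketch below is a minimal illustration of this reading, not the authors' forecasting procedure; the function names and the input numbers are hypothetical, and the coefficients are the rounded values from the written model equation.

```python
# Rounded coefficients from the reported ARIMA (3,1,1) equation:
# z_t = phi1*z_{t-1} + phi2*z_{t-2} + phi3*z_{t-3} + a_t - theta1*a_{t-1}
PHI = (-1.441, -0.741, -0.302)
THETA_1 = -0.991


def predict_next_difference(last_three_z, last_residual):
    """One-step prediction of z_t; last_three_z = (z_{t-1}, z_{t-2}, z_{t-3})."""
    z1, z2, z3 = last_three_z
    ar_part = PHI[0] * z1 + PHI[1] * z2 + PHI[2] * z3
    # The future shock a_t has expectation zero, so only a_{t-1} enters.
    return ar_part - THETA_1 * last_residual


def predict_next_level(last_y, last_three_z, last_residual):
    """Undo the first difference: y_t = y_{t-1} + z_t."""
    return last_y + predict_next_difference(last_three_z, last_residual)


# Made-up inputs for illustration only (not values from the paper).
z_hat = predict_next_difference((3.0, 2.0, 1.0), 1.0)
y_hat = predict_next_level(100.0, (3.0, 2.0, 1.0), 1.0)
print(round(z_hat, 3), round(y_hat, 3))
```

Forecasts further ahead would reuse earlier predictions in place of unobserved differences and set the unobserved residuals to zero.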
TABLE 6. Parameter estimation of ARIMA(3, 1, 1)

Type of model   Parameter   Standard deviation   t-value    p-value
AR(1)           -1.4412     0.0981                -14.69    0.000
AR(2)           -0.7408     0.1610                 -4.60    0.000
AR(3)           -0.3019     0.0993                 -3.04    0.003
MA(1)           -0.9910     0.0035               -285.53    0.000

FIGURE 8. Normal probability plot
[Normal Q-Q plot of the residuals.]

Therefore, the ARIMA (3, 1, 1) model meets the model assumptions: the random errors are uncorrelated and
are normally distributed with mean zero and constant variance. In addition, the Ljung-Box test is
used to determine the suitability of the model. The hypotheses for the Ljung-Box test are as follows:
H0 : ARIMA(3,1,1) is appropriate
H1 : ARIMA(3,1,1) is not appropriate
From the Ljung-Box test output in Table 7, the chi-squared statistic Q is less than the tabulated
chi-squared value and the p-value is greater than $\alpha = 0.05$ at every lag; therefore we fail to
reject $H_0$ and conclude that this model is appropriate for forecasting.

FIGURE 9. Time series plot of gold production in Malaysia from 2001 to 2009
[Vertical axis as printed: total gold production (kg). Horizontal axis: time.]

Forecasting
Once the parameters of the model have been estimated and the model has met the assumptions through
diagnostic checking, the model can be used for prediction. The predictions were made with 95% confidence
intervals. Table 8 shows the predicted values of total gold production in Malaysia for 2009.
From Figure 9, the time series plot of gold production in Malaysia that uses the predicted values for 2009
shows a pattern similar to that in Figure 2. This indicates that the forecast values are close to the
actual values of gold production in 2009.
To determine the level of accuracy of the predicted values, the mean absolute percentage error
(MAPE) is computed, giving a MAPE value of 3.704%:

$$\mathrm{MAPE} = \frac{1}{12} \sum_{t=1}^{12} \left| \frac{A_t - F_t}{A_t} \right| \times 100 = 3.704\%$$

According to Lewis (1982) [9], the prediction of gold production for 2009 can be classified as
very accurate, and the fitted ARIMA (3,1,1) model is the best fitted model for forecasting gold
production in Malaysia.
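The MAPE calculation and the accuracy bands attributed to Lewis can be sketched as follows, using the twelve original and forecast values printed in the paper's forecasting table for 2009. The function names are hypothetical, and the handling of the band boundaries is an assumption. Two observations are offered with caution. First, the twelve original values sum to 2,794,167, which matches the 2,794 kg reported for 2009 if the monthly figures are in grams. Second, on the values as printed, this sketch gives a MAPE of roughly 9.7% rather than the reported 3.704%; the reason for the gap cannot be determined from the text, and both figures fall in the "very accurate" band.

```python
# Values as printed in the forecasting table for 2009, January to December.
actual = [213802, 210751, 186128, 238521, 235890, 211517,
          232295, 255512, 289589, 238503, 225520, 256139]
forecast = [186396, 190545, 205771, 191171, 229097, 215034,
            229289, 224178, 245069, 261124, 257457, 240138]


def mape(actual_values, forecast_values):
    """Mean absolute percentage error, in percent."""
    ratios = [abs((a - f) / a) for a, f in zip(actual_values, forecast_values)]
    return 100.0 * sum(ratios) / len(ratios)


def accuracy_level(mape_percent):
    """Accuracy bands as tabulated in the paper; boundary handling is assumed."""
    if mape_percent <= 10:
        return "Very accurate"
    if mape_percent <= 20:
        return "Accurate"
    if mape_percent <= 50:
        return "Medium"
    return "Less accurate"


value = mape(actual, forecast)
print(f"MAPE = {value:.3f}% -> {accuracy_level(value)}")
```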

TABLE 7. Results of Ljung-Box test

Lag   Chi-squared, Q   Chi-squared table value (0.05)   Degrees of freedom   p-value
12     8.4             15.507                             8                  0.399
24    22.6             31.410                            20                  0.309
36    28.8             50.998                            32                  0.630
48    46.4             65.171                            44                  0.373

TABLE 8. Forecasting of gold production in 2009

                                                           Confidence interval 95%
Month       Original value, A_t   Forecast value, F_t   Lower limit   Upper limit
January     213802                186396                111552        261241
February    210751                190545                 90066        260888
March       186128                205771                 90086        274142
April       238521                191171                 72542        278808
May         235890                229097                 71162        295507
June        211517                215034                 55767        294358
July        232295                229289                 56185        310323
August      255512                224178                 41261        309268
September   289589                245069                 42469        323946
October     238503                261124                 28251        322160
November    225520                257457                 30048        336483
December    256139                240138                 16253        334106

CONCLUSION
The time series data of gold production in Malaysia do not have a specific pattern, i.e. they are
unaffected by seasonality factors. Although the plot shows changes in the volatility of gold production,
these are not due to seasonality factors and may be caused by other factors.
The time series model identified for the gold production data in Malaysia is an ARIMA (3,1,1) model. The
model equation can be written as:

$$z_t = 0.991 a_{t-1} - 1.441 z_{t-1} - 0.741 z_{t-2} - 0.302 z_{t-3} + a_t$$

Furthermore, the diagnostic checking shows that this model is suitable and can be used for prediction.
The MAPE of the predicted data is 3.704%, which shows that the predictions of gold production in
Malaysia using ARIMA (3,1,1) are very accurate. In addition, Figure 9 suggests that gold production will
increase in 2009. These findings will help the private and public sectors to understand the
gold production scenario and later plan gold mining activities in Malaysia.

REFERENCES
1. Malaysian Minerals Yearbook. Jabatan Mineral dan Geosains Malaysia, Kuala Lumpur, 2008.
2. D. W. Rockerbie, Resources Policy, 25(2), 69-76 (1999).
3. R. Sh. Shahabi, R. Kakaie, R. Ramzani and L. Agheli, Journal of Geology and Mining Research, 1(1),
19-24 (2009).
4. S. Nadeem, S. Asif, Z. Muhammad and M. B. Tariq, International Journal of Agriculture & Biology, 2(4),
352-353 (2000).
5. W. Rachana, M. Suvarna, G. Sonal and V. M. Bodade, International Review of Business and Finance, 2(1),
97-102 (2010).
6. N. M. F. Rahman, Journal of the Bangladesh Agricultural University, 8(1), 103-112 (2010).
7. S. Shafiee and E. Topal, Resources Policy, 35, 178-189 (2010).
8. S. Selvanathan and E. A. Selvanathan, Resources Policy, 25, 265-275 (1999).
9. C. D. Lewis, Industrial and Business Forecasting Methods. Butterworths, London, 1982.

ACKNOWLEDGMENTS
The financial support received in the form of a research grant, scheme code UKM-DIPM-082-2011, is
acknowledged. The authors appreciate the constructive advice of Assoc. Prof. Dr. Roslinda Mohd Nazar.
