Angeliki Papana (1), Dimitris Folinas (2) and Anestis Fotiadis (3)
(1) University of Macedonia, Greece
(2) Department of Logistics, Alexander TEI of Thessaloniki, Greece
(3) I-Shou University, Taiwan
(1) angeliki.papana@gmail.com, (2) dfolinas@gmail.com, (3) anesfot@gmail.com
Abstract
In this study, we demonstrate the usefulness of time series forecasting methods
on very short time series. Specifically, we apply some of the basic time series
forecasting methods in order to predict the future consumption and purchase of
the drug RAPILYSIN LYPDINJ 2X1.16G/VIAL (RL). The available data are monthly
measurements of the consumption and purchase of the drug RL from the General
Hospital of Katerini and cover the period 2009-2011, i.e. three years. Tools
from univariate time series analysis and forecasting are introduced, discussed
and applied based on the type of the available data. Based on the accuracy of
the forecasts, the best-performing method is a simple seasonal exponential
smoothing model.
1. Introduction
A synchronized and responsive flow of products and services is the goal of
supply chain planning, while demand planning is the first step of supply chain
planning that determines the effectiveness of manufacturing and logistics
operations in the chain. A demand forecast is the prediction of the quantity of
a product or service that will be purchased. Demand forecasting is essential
for corporations and organizations such as hospitals in order to assess future
capacity requirements.
There are two approaches to determining a demand forecast, i.e. the
qualitative approach and the quantitative one. Qualitative methods are usually
used in ambiguous situations and when little data exists, and they rely on the
intuition and experience of experts. Quantitative methods of demand
forecasting involve many techniques that incorporate the information from
past or current data, e.g. regression methods, extrapolation methods, neural
networks and data mining techniques. The statistical methods tend to be
superior in general, although there are occasions when model-based methods
are not practical. The best demand forecast may be determined using a multi-
2nd INTERNATIONAL CONFERENCE ON SUPPLY CHAINS
[Figure 1: diagram of the influential and limiting factors for the monthly
drug orders. Labels: Doctors; Hospital President; Patient Allergies or
Characteristics; Hospital Scientific Committee; Ministerial Decisions;
Legislations; Financial Problems; Limited IT Knowledge; Limited Managerial
Knowledge]
Limiting factors that affect the final decision on the monthly orders of
drugs include legislative decisions that determine which supplier is
preferred through the procurement system. In parallel, the Ministry may
impose limiting factors of its own. The first online auctions for active
substances recently took place in Greece. As a result, the entire drug supply
system is changing, as most managers will start ordering the active
substances of the drugs rather than specific named drugs. The economic
problems that Greece is currently facing clearly influence all these
decisions. The continued need to reduce costs leads to the supply of cheaper
drugs of dubious quality. Over the last two years there has been a sustained
effort to introduce electronic data processing in all pharmacies and to
provide statistics to the health ministry, but the reluctance of older
pharmacy employees and their fear of technology hinder the electronic
operation of the pharmacy. Figure 1 summarizes the influential and limiting
factors for the final decision of the hospital concerning the monthly orders
of drugs.
In this study, we will introduce time series methods, which are suitable
for short term predictions. These methods search for patterns in the time
series and extrapolate these patterns into the future. Time-series forecasting
is a form of extrapolation in that it involves fitting a model to a set of data and
then using that model outside the range of data to which it has been fitted.
Forecasting of time series is a very difficult task as it is hard to recognize the
underlying patterns and relationships due to noise and random and
unexpected changes.
e.g. irregular growth. The time series decomposition adjusts the seasonality
by multiplying the normal forecast by a seasonal factor. Another method of
short-term forecasting is the use of a Z-chart. It is assumed that the basic
principles that govern the data do not change, or change along an anticipated
course, and that any underlying trends at present will continue. More complex
nonlinear methods have also been developed for time series prediction;
however, these methods usually require larger data sets, as they have more
free parameters to estimate.
The minimum time series length one needs to make ‘good’ forecasts
using a statistical model depends on the type of the model and the amount of
random variation in the data. From a purely statistical point of view, it is
always necessary to have more observations than parameters. The minimum
requirements apply when the amount of random variation in the data is very
small. Real data often contain a lot of random variation and therefore sample
size requirements increase accordingly. Therefore, the number of available
data affects the choice of the corresponding forecasting method.
In order to be able to decide which forecast method is the best for each
data set, one should know and understand the different types of methods and
recognize the different components in the data. However, one should always
validate the forecasts. In order to check the accuracy of the forecasts and
whether they are unbiased and efficient, we need to measure the prediction
error, i.e. the difference between the actual time series and the forecasts.
For this purpose, many statistical measures have been developed, such as the
mean squared error, root mean squared error, cumulative forecast error, mean
absolute percent error, etc. In addition, we display the original and
forecasted values of each method for the three years in order to see their
performance.
Figure 2: The time plots of (a) the consumption and (b) the purchase of the drug RL
[Plots omitted. Panels: (a) consumption, (b) purchase (×10^4); x-axis: months, 0 to 35]
An important issue of time series analysis is the stationarity of the data.
A stationary time series is one whose statistical properties such as mean,
variance, autocorrelation, etc. are constant over time. Most statistical
forecasting methods are based on the assumption that the time series are
stationary or can be made stationary using mathematical transformations. In
order to test the stationarity of the data, we applied the Augmented
Dickey-Fuller test (Dickey & Fuller, 1979), which rejected the unit-root null
hypothesis in favour of the alternative, i.e. it suggested that both time
series are stationary. Therefore, we do not need to transform the original
time series.
Let us denote as $\{x_t\}$, $t = 1, \dots, N$, the observed time series. The
sample autocovariance coefficient at lag $k = 0, 1, 2, \dots$ is

$$c_k = \frac{1}{N} \sum_{t=1}^{N-k} (x_t - \bar{x})(x_{t+k} - \bar{x})$$
values exist, for lag 6 and lag 8, while for the purchase of the drug $r_k$ is
significantly different from zero only for lag 3. From the two correlograms we
can conclude that the data are stationary and present no trend. The two time
series may be random, as only two and one significant $r_k$ values exist,
respectively; however, the time series may also be characterized by seasonal
fluctuations, in which case the correlogram exhibits oscillations at the same
frequency. If the series are truly random, then only an occasional
autocorrelation should be larger than two standard errors in magnitude. The
interpretation of a correlogram is a difficult task, especially when N is so
small.
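The correlogram check described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the usual definition $r_k = c_k / c_0$ and the common approximation of two standard errors by $2/\sqrt{N}$; the function names and the toy series are hypothetical.

```python
import math

def sample_autocorrelation(x, max_lag):
    """Return [r_0, ..., r_max_lag], assuming r_k = c_k / c_0."""
    n = len(x)
    mean = sum(x) / n

    def autocovariance(k):
        # c_k = (1/N) * sum_{t=1}^{N-k} (x_t - mean) * (x_{t+k} - mean)
        return sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / n

    c0 = autocovariance(0)
    return [autocovariance(k) / c0 for k in range(max_lag + 1)]

def significant_lags(x, max_lag):
    """Lags whose |r_k| exceeds the approximate bound 2 / sqrt(N)."""
    bound = 2.0 / math.sqrt(len(x))
    r = sample_autocorrelation(x, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(r[k]) > bound]

# Hypothetical alternating series (not the hospital data)
alternating = [1, -1] * 10
r = sample_autocorrelation(alternating, 2)
flagged = significant_lags(alternating, 1)
```

With only 36 observations the bound $2/\sqrt{36} \approx 0.33$ is wide, which is one reason the correlogram is hard to interpret for such short series.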
Figure 3: The correlogram of (a) the consumption and (b) the purchase of the drug RL
[Plots omitted. Panels: (a) and (b); y-axis: sample autocorrelation; x-axis: lags, 0 to 10]
In order to decide whether there is a cyclic component in our data, we
use seasonal subseries plots (Cleveland, 1993), a tool for detecting
seasonality in a time series. Such a plot is only useful if the period of
the seasonality is already known. Since our data are monthly, the period is
taken to be 12. The seasonal subseries plots of the consumption and
the purchase of the drug are displayed in Figure 4. From the plots, no
apparent seasonality is observed for the two variables.
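The grouping behind a seasonal subseries plot can be sketched as follows. The sketch assumes the series starts in January and covers whole years; the helper names and the toy data are hypothetical, and the plotting step itself is left out.

```python
def seasonal_subseries(x, period=12):
    """Split x into one subseries per season (e.g. all Januaries, all Februaries)."""
    if len(x) < period:
        raise ValueError("need at least one full period of data")
    return [x[m::period] for m in range(period)]

def subseries_means(x, period=12):
    """Mean of each seasonal subseries; similar means suggest weak seasonality."""
    return [sum(group) / len(group) for group in seasonal_subseries(x, period)]

# Hypothetical two-year monthly series (not the hospital data)
toy = list(range(24))
groups = seasonal_subseries(toy)
means = subseries_means(toy)
```

In the plot, each group is drawn as its own short line around its mean, so a seasonal effect shows up as clearly different levels across months.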
Figure 4: The seasonal subseries plots of (a) the consumption and (b) the purchase
of the drug RL
[Plots omitted. Panels: (a) consumption, (b) purchase (×10^4); x-axis: months, Jan to Dec]
$$\mathrm{MSE} = \frac{1}{N} \sum_{t=1}^{N} e_t^2 .$$
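As a small illustration, the MSE and the other error measures named earlier in the text can be computed as below. The error is taken as actual minus forecast, and the MAPE is written in its usual percentage form; both conventions, the function names and the numbers are assumptions for illustration only.

```python
import math

def forecast_errors(actual, forecast):
    """e_t = actual_t - forecast_t for each period."""
    return [a - f for a, f in zip(actual, forecast)]

def mse(actual, forecast):
    errors = forecast_errors(actual, forecast)
    return sum(e * e for e in errors) / len(errors)

def rmse(actual, forecast):
    return math.sqrt(mse(actual, forecast))

def cumulative_forecast_error(actual, forecast):
    # Values far from zero indicate a biased forecast
    return sum(forecast_errors(actual, forecast))

def mape(actual, forecast):
    # Undefined when an actual value is zero
    errors = forecast_errors(actual, forecast)
    return 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / len(errors)

# Hypothetical values (not the hospital data)
actual = [100, 200, 300]
forecast = [110, 190, 300]
```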
The first time series forecasting method introduced here is the simple
moving average. This method is suitable for data that present no trend,
seasonality or cyclic components. The forecasts are estimated as the mean of
the K previous values of the time series
$$F_t = \frac{1}{K} \sum_{i=t-K}^{t-1} X_i .$$
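A minimal sketch of the simple moving average forecast, together with the in-sample MSE used to compare window lengths K. The function names and the toy series are hypothetical, not the authors' implementation.

```python
def moving_average_forecasts(x, k):
    """Forecast for each period as the mean of the k previous values.

    The first forecast is for x[k]; the last one is the
    one-step-ahead forecast beyond the end of the series.
    """
    return [sum(x[t - k:t]) / k for t in range(k, len(x) + 1)]

def in_sample_mse(x, k):
    """MSE of the forecasts over the periods where an actual value exists."""
    forecasts = moving_average_forecasts(x, k)[:-1]
    actual = x[k:]
    return sum((a - f) ** 2 for a, f in zip(actual, forecasts)) / len(actual)

# Hypothetical series (not the hospital data)
toy = [2, 4, 6, 8, 10]
forecasts = moving_average_forecasts(toy, 3)
error = in_sample_mse(toy, 3)
```

Repeating `in_sample_mse` for several values of K gives the kind of comparison that a table of MSE per window length reports.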
Table 1: MSE from the simple moving average method for the consumption and the
purchase of the drug RL for K=3, 4, 5, respectively
Figure 5: Plots of (a) the consumption and (b) the purchase of the drug RL and their
fitted values from simple moving average method for K=3
[Plots omitted. Panels: (a) consumption and forecast, (b) purchase and forecast (×10^4); x-axis: month, 0 to 35]
The simple exponential smoothing method takes into account all
previous observations but gives greater weight to more recent observations.
This method is again suitable for data with no trend or seasonality. The
forecast is estimated from the equation
$$F_{t+1} = \alpha X_t + \alpha (1-\alpha) X_{t-1} + \alpha (1-\alpha)^2 X_{t-2} + \dots + \alpha (1-\alpha)^m X_{t-m} + (1-\alpha)^{m+1} F_{t-m},$$
[Plots omitted: observed values and forecasts from the simple exponential
smoothing method, panels (a) and (b); x-axis: months, 0 to 35]
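The weighted sum given for simple exponential smoothing is equivalent to the one-step recursion $F_{t+1} = \alpha X_t + (1-\alpha) F_t$, which is easier to compute. The sketch below uses that recursion and, as an assumption, initializes the first forecast with the first observation; the function name and toy values are hypothetical.

```python
def simple_exponential_smoothing(x, alpha, initial=None):
    """Return forecasts F_1, ..., F_{N+1} for a series x of length N.

    forecasts[t] is the forecast for x[t]; the final element is the
    one-step-ahead forecast beyond the end of the series.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    forecast = x[0] if initial is None else initial  # assumed initialization
    forecasts = [forecast]
    for value in x:
        # F_{t+1} = alpha * X_t + (1 - alpha) * F_t
        forecast = alpha * value + (1.0 - alpha) * forecast
        forecasts.append(forecast)
    return forecasts

# Hypothetical series (not the hospital data)
forecasts = simple_exponential_smoothing([10, 20, 30], alpha=0.5)
```

Expanding the recursion for the last forecast gives 0.5·30 + 0.25·20 + 0.125·10 + 0.125·10 = 22.5, which matches the weighted-sum form term by term.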
The Random Walk model, $Y_t = Y_{t-1} + \varepsilon_t$, predicts that the value
at time t will be equal to the last period's value plus a stochastic
(non-systematic) component $\varepsilon_t$ that is white noise, i.e.
independent and identically distributed with mean zero and variance
$\sigma^2$. The forecasting model suggested is $Y_t - Y_{t-1} = \varepsilon_t$
or, when a drift is allowed, $Y_t - Y_{t-1} = \alpha$, where $\alpha$ is the
mean of the first differences, i.e. the average change from one period to the
next. Rearranging this equation gives $Y_t = Y_{t-1} + \alpha$. In other words,
we predict that this period's value will equal last period's value plus a
constant representing the average change between periods. The random walk
model thus assumes that, from one period to the next, the original time series
merely takes a random "step" away from its last recorded position. When
$\alpha = 0$, the ‘best’ forecast of the next value is the most recent value.
This is a very simple method, but it is often quite sensible and has been
widely applied to economic data, even though one may expect that more
complicated methods will generally be superior. The means of the first
differences of the consumption and of the purchase of RL are 79.1003 and 0,
respectively. Therefore, the two forecast models are
$Y_t = Y_{t-1} + 79.1003$ and $Y_t = Y_{t-1}$, respectively. The MSE values for
the two data sets are $1.2992 \times 10^7$ and $1.6029 \times 10^8$,
respectively. The original and fitted values from the random walk model are
displayed in Figure 7.
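The random walk forecast with a constant average change can be sketched as follows. The function names and the toy series are hypothetical; the drift is estimated, as in the text, by the mean of the first differences.

```python
def random_walk_with_drift(y):
    """Estimate the drift and return the in-sample fitted values for y[1:]."""
    diffs = [b - a for a, b in zip(y, y[1:])]
    drift = sum(diffs) / len(diffs)  # mean of the first differences
    fitted = [previous + drift for previous in y[:-1]]
    return drift, fitted

def forecast_ahead(y, drift, steps):
    """h-step-ahead forecasts: last observed value plus h times the drift."""
    return [y[-1] + h * drift for h in range(1, steps + 1)]

# Hypothetical series (not the hospital data)
toy = [10, 12, 11, 16]
drift, fitted = random_walk_with_drift(toy)
future = forecast_ahead(toy, drift, 2)
```

Because the first differences telescope, their mean equals (last value minus first value) divided by N - 1, so a drift of exactly 0 means the series ends at the level where it started.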
Figure 7: Plots of the (a) consumption and (b) purchase of the drug RL and their
fitted values from the random walk model, respectively
[Plots omitted. Panels: (a) consumption and forecast, (b) purchase and forecast (×10^4); x-axis: months, 0 to 35]
The next method is to fit an Auto-Regressive model of order p, denoted as
AR(p), to the data. The general form of the AR(p) model is
$X_t = \varphi_0 + \varphi_1 X_{t-1} + \varphi_2 X_{t-2} + \dots + \varphi_p X_{t-p} + Z_t$.
Thus the value at time t depends linearly on the last p values and the model
looks like a regression model. The order of the AR model is selected using the
partial autocorrelation function or the Akaike information criterion (Akaike,
1974). We implement here the simplest example of an AR(p) process, i.e. the
AR(1) model $X_t = \varphi_0 + \varphi_1 X_{t-1} + Z_t$. The AR model is fitted
by least squares regression, i.e. the parameter values for each data set are
those that minimize the sum of squared errors. The estimated coefficients of
the AR(1) models are $\varphi_0 = 368.202$, $\varphi_1 = 0.171$ for the
consumption and $\varphi_0 = 1103.91$, $\varphi_1 = 0.168$ for the purchase.
The MSE values for the two data sets are $5.7821 \times 10^6$ and
$6.1048 \times 10^7$, respectively. The original and fitted values from the
AR(1) model are displayed in Figure 8.
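A least squares fit of the AR(1) model amounts to a simple linear regression of each value on the previous one. The sketch below illustrates this with closed-form regression formulas; it is not the authors' code, and the function names and toy series are hypothetical.

```python
def fit_ar1(x):
    """Least squares estimates (phi0, phi1) for X_t = phi0 + phi1 * X_{t-1} + Z_t."""
    previous, current = x[:-1], x[1:]
    n = len(previous)
    mean_prev = sum(previous) / n
    mean_curr = sum(current) / n
    sxx = sum((p - mean_prev) ** 2 for p in previous)
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(previous, current))
    phi1 = sxy / sxx
    phi0 = mean_curr - phi1 * mean_prev
    return phi0, phi1

def ar1_fitted_values(x, phi0, phi1):
    """In-sample one-step fitted values for x[1:]."""
    return [phi0 + phi1 * p for p in x[:-1]]

# Hypothetical noise-free series generated by X_t = 2 + 0.5 * X_{t-1}
toy = [0.0, 2.0, 3.0, 3.5, 3.75]
phi0, phi1 = fit_ar1(toy)
fitted = ar1_fitted_values(toy, phi0, phi1)
```

On real data the fitted values are compared with `x[1:]` to obtain the MSE, since the first observation has no predecessor.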
Figure 8: Plots of the (a) consumption and (b) purchase of the drug RL and their
fitted values from the AR(1) model, respectively
[Plots omitted. Panels: (a) consumption and forecast, (b) purchase and forecast (×10^4); x-axis: month, 0 to 35]
Finally, the best model to fit the data is found to be the simple
seasonal exponential smoothing model. This model is appropriate for series
with no trend and a seasonal effect that is constant over time. As the data are
monthly, the number of periods in a seasonal interval is s = 12. Simple
seasonal exponential smoothing has two smoothing parameters, a for the level
L(t) and δ for the seasonal component S(t):
$$L(t) = a \, (X(t) - S(t-s)) + (1-a) \, L(t-1)$$
$$S(t) = \delta \, (X(t) - L(t)) + (1-\delta) \, S(t-s)$$
$$\hat{X}_t(k) = L(t) + S(t+k-s)$$
seasonal exponential smoothing model, and the forecasted values for the next
six months.
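The level and season recursions of the simple seasonal exponential smoothing model can be sketched as below. The initialization (level set to the mean of the first season, seasonal terms set to the deviations from it) is an assumption made for this illustration, as are the function name, the parameter values and the toy series; statistical packages may initialize and optimize differently.

```python
def simple_seasonal_smoothing(x, s, a, delta, horizon):
    """Forecast the next `horizon` values (horizon <= s) of series x.

    L(t) = a * (X(t) - S(t-s)) + (1-a) * L(t-1)
    S(t) = delta * (X(t) - L(t)) + (1-delta) * S(t-s)
    forecast k steps ahead = L(t) + S(t+k-s)
    """
    if len(x) < s or not 1 <= horizon <= s:
        raise ValueError("need at least one full season and 1 <= horizon <= s")
    # Assumed initialization from the first season
    level = sum(x[:s]) / s
    season = [value - level for value in x[:s]]  # season[i] holds S(i)
    for t in range(s, len(x)):
        level = a * (x[t] - season[t - s]) + (1.0 - a) * level
        season.append(delta * (x[t] - level) + (1.0 - delta) * season[t - s])
    last = len(x) - 1
    return [level + season[last + k - s] for k in range(1, horizon + 1)]

# Hypothetical perfectly periodic series with period 4 (not the hospital data)
toy = [1, 2, 3, 4] * 3
forecasts = simple_seasonal_smoothing(toy, s=4, a=0.3, delta=0.2, horizon=4)
```

For the monthly data of the study the call would use s = 12 and horizon = 6; in practice a and δ are chosen to minimize the in-sample MSE.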
Figure 9: Plots of the (a) consumption and (b) purchase of the drug RL, their fitted
values from the simple seasonal exponential smoothing model and their forecasts for
the next six months, respectively.
[Plots omitted. Panels: (a) and (b)]
4. Conclusions
This work concentrates on finding ‘best’ point forecasts using MSE. Although
the available data are so few, we could still find a model that seems to be
able to reproduce the oscillations of the original data. More advanced forecast
methods cannot be used when the available data are so few. However, simple
methods have proved to be better than more advanced ones in some cases. In
order to evaluate and compare the forecast methods, the easiest way is to
compare only the accuracy of the methods, based on the fitted values of each
method/model.
In practice, different statistical measures for forecast accuracy may
give different results. Therefore, it is important to check which method each
statistical measure suggests and whether there is a ‘significant’ difference
among the methods. Moreover, only point forecasts are considered here. In
practice, we often need to produce interval forecasts, in order
to better assess future uncertainty.
References
Akaike H. A new look at the statistical model identification. IEEE Transactions on
Automatic Control 19 (6), 716–723, 1974.
Brown R.G. Exponential Smoothing for Predicting Demand. Cambridge,
Massachusetts: Arthur D. Little Inc., p. 15, 1956.
Box G.E.P. and G. Jenkins. Time Series Analysis: Forecasting and Control. Holden-
Day, 1976.
Cleveland W.S. Visualizing Data, Hobart Press, 1993.
Dickey D.A. and W.A. Fuller. Distribution of the estimators for autoregressive time
series with a unit root. Journal of the American Statistical Association 74, 427–
431, 1979.
Huang C. and H. Yang. A Time Series Approach to Short Term Load Forecasting
through Evolutionary Programming Structures. Proceedings of the International
Conference on Energy Management and Power Delivery (EMPD'95), Vol. 2,
583-588, 1995.