
Research Report

Stock Market Prediction using an Artificial Neural Network


Tue Hoang Vo, The University of the South
Linda Bright Lankewicz, The University of the South
August 2014 (seventh draft August 12, 2014)
Abstract
[FILL IN LAST, ALONG WITH KEY WORDS]
1. Introduction
[NEED SOME INTRODUCTORY REMARKS CONCERNING THE USE OF DATA MINING FOR
STOCK MARKET PREDICTION]
A. Goal of our Research

We proposed to modify the model outlined in Data Mining with R by Luís Torgo¹ to achieve better results
for predicting the performance of the stock market from historical time series data. To improve
performance, we tested a variety of technical indicators, trading policies, and time frames. Specifically,
this research paper addresses four objectives.
1. Improve Torgo's neural network classifier for predicting whether to buy, sell, or hold stock based
   upon performance.

2. From a list of technical indicators suggested by the book and by researchers, use feature selection
   techniques (such as correlation and random forest) to choose the input variables that maximize the
   return produced by the neural network model.

3. Compare two time frames of training and testing data and determine which time frame produces
   better results.

4. Using Torgo's two trading policies, determine which trading policy produces better results.

B. Assumptions

We assume that future performance can be predicted from historical performance during a window of
time using pattern-recognition techniques. A second assumption is that the choices of trading policy,
of training and testing period, and of technical indicators are independent of one another, so that
each choice has no effect upon the performance of the others.
2. Artificial Neural Network

An artificial neural network is an algorithm that, based upon training data, produces weighted connections
between layers of functions to predict an outcome. The design contains a number of layers and a number of
nodes for each layer. Each time a set of training data enters the network, the algorithm records the predicted
target variable and the real target variable, compares the two, and uses a procedure called backpropagation
to send the error information back through the network and readjust the weights of the nodes in the layers
accordingly.
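To make the weight-adjustment step concrete, the following is a minimal sketch of backpropagation for a network with one hidden layer, written in Python rather than the R used in this study. The sigmoid activation, squared-error loss, learning rate, number of hidden nodes, and toy data are all illustrative assumptions and are not the configuration used in our experiments.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train(data, hidden=2, epochs=2000, lr=0.5, seed=0):
    """Train a one-hidden-layer network with online backpropagation.

    Returns (predict, losses), where losses[i] is the mean squared error
    over `data` after i passes through the training set.
    """
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # Each weight row ends with a bias weight (fed by a constant input of 1.0).
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]

    def forward(x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hid]
        y = sigmoid(sum(w * v for w, v in zip(w_out, h + [1.0])))
        return h, y

    def mse():
        return sum((forward(x)[1] - t) ** 2 for x, t in data) / len(data)

    losses = [mse()]
    for _ in range(epochs):
        for x, t in data:
            h, y = forward(x)
            # Error term at the output: predicted minus real target,
            # scaled by the slope of the sigmoid.
            d_out = (y - t) * y * (1.0 - y)
            # Error terms at the hidden nodes: the output error sent
            # back through the (not yet updated) output weights.
            d_hid = [d_out * w_out[j] * h[j] * (1.0 - h[j]) for j in range(hidden)]
            # Readjust the weights against the gradient.
            for j, v in enumerate(h + [1.0]):
                w_out[j] -= lr * d_out * v
            for j in range(hidden):
                for i, v in enumerate(x + [1.0]):
                    w_hid[j][i] -= lr * d_hid[j] * v
        losses.append(mse())

    return (lambda x: forward(x)[1]), losses


# Toy data (logical OR), purely illustrative.
toy = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
predict, losses = train(toy)
print(round(losses[0], 3), "->", round(losses[-1], 3))
```

The printed pair shows the training error before and after the weight updates; a falling error indicates that the backward pass is moving the weights in a useful direction.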
3. Torgo's Model

For this research, Torgo's model for prediction of future returns was used, with the goal of improving its
performance. A tendency indicator is used to summarize the level of change of price over a period of k
days. Over the period, a set of values is collected for the percentage variation of the closing price relative to
the average prices of the following k days.² The tendency indicator is the sum of the variations that are
greater than a threshold, p. Torgo states that "The value of this indicator should be related to the confidence
we have that the target margin p will be attainable in the next k days."³

¹ Luís Torgo, Data Mining with R, 2011, CRC Press.

A neural network architecture is constructed from historical data, using the technical indicators as inputs
and the tendency indicator as the target. The neural network outputs a list of
predictions of the tendency indicator for a testing period. These predicted tendency indicators are then translated
into trading signals (buy, hold, or sell) based upon thresholds.
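The tendency indicator and its translation into signals can be sketched as follows. This is an illustrative Python version, not the R code used in the study. It assumes that the average price of a day is the mean of its high, low, and close, and that a variation counts when its absolute value exceeds p (so that large falls contribute negatively); the default values of k, p, and the two signal thresholds are placeholders rather than the values selected in our experiments.

```python
def tendency_indicator(high, low, close, k=10, p=0.025):
    """Return the tendency indicator for every day that has k following days."""
    # Assumption: average daily price = (high + low + close) / 3.
    avg = [(h + l + c) / 3.0 for h, l, c in zip(high, low, close)]
    result = []
    for i in range(len(close) - k):
        # Variation of each of the next k average prices relative to today's close.
        variations = [(avg[i + j] - close[i]) / close[i] for j in range(1, k + 1)]
        # Assumption: keep variations whose magnitude exceeds the margin p.
        result.append(sum(v for v in variations if abs(v) > p))
    return result


def to_signal(t, buy_threshold=0.1, sell_threshold=-0.1):
    """Translate one (predicted) tendency value into a trading signal."""
    if t > buy_threshold:
        return "buy"
    if t < sell_threshold:
        return "sell"
    return "hold"


# Toy prices with high = low = close, so the average price equals the close.
prices = [100.0, 110.0, 120.0, 120.0, 120.0]
t_values = tendency_indicator(prices, prices, prices, k=2, p=0.025)
signals = [to_signal(t) for t in t_values]
print(signals)
```

In the model itself the thresholds are applied to the values predicted by the neural network rather than to values computed from known future prices as in this toy example.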
For a stock market trading simulation, Torgo used two trading strategies for the predicted signals (buy, hold,
or sell) and reported final returns as the percentage gained or lost relative to the original amount
invested. These returns reflect the effectiveness of the model. Our research used these same
strategies. The first strategy allows only one open position at a time, whereas the second
strategy allows the acquisition of a new position whenever there are sufficient funds for it. Also, the first strategy
limits the time spent waiting for the target profit, while the second strategy does not limit the wait time. See
Torgo for additional details of the trading strategies.⁴
4. Stock Market Indicators

Building upon Torgo's model⁵ for predicting stock market returns, we used the set of indicators from the R
programming language package TTR.⁶ Following Torgo's example, a single value is used for each
indicator. Information about these indicators may be found in the TTR package documentation.⁷
ATR                  Average True Range, a moving average of the high, low, and close to measure
                     volatility
SMI                  Stochastic Momentum Index, relative value of the close as compared to the
                     midpoint of the high and low
ADX                  Directional Movement Index developed by Welles Wilder
Aroon                Seeks to identify starting trends by measuring highs and lows during a period
                     of time
Bollinger Bands      Compares volatility and price levels over a period of time
Chaikin Volatility   A measure of the rate of change of the trading range
CLV                  Close Location Value, a comparison of the closing value to the trading range
EMV                  Arms' Ease of Movement Value, considers the effect of days when trading
                     moves easily
MACD oscillator      A price oscillator developed by Appel, compares moving averages
MFI                  Money Flow Index, ratio of positive and negative money flow over time
SAR                  Parabolic Stop-and-Reverse, calculates the point when a long position is
                     closed and a short position opened
Volatility           Uses various volatility indicators
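As an illustration of how such indicators are derived from daily high, low, and close prices, the sketch below computes the Close Location Value and a simplified Average True Range in Python. The study itself used the TTR package in R; the simple moving average used here for the ATR is an assumption made for brevity (Wilder's original smoothing differs), and the window length n is a placeholder.

```python
def clv(high, low, close):
    """Close Location Value: where the close sits within the day's range (-1 to 1)."""
    return [((c - l) - (h - c)) / (h - l) if h != l else 0.0
            for h, l, c in zip(high, low, close)]


def true_range(high, low, close):
    """True range: the day's range, extended to cover any gap from the previous close."""
    tr = [high[0] - low[0]]  # no previous close on the first day
    for i in range(1, len(close)):
        tr.append(max(high[i] - low[i],
                      abs(high[i] - close[i - 1]),
                      abs(low[i] - close[i - 1])))
    return tr


def atr_simple(high, low, close, n=14):
    """Simplified ATR: plain n-day moving average of the true range (an assumption)."""
    tr = true_range(high, low, close)
    return [sum(tr[i - n + 1:i + 1]) / n for i in range(n - 1, len(tr))]


# Toy prices, purely illustrative.
high = [10, 12, 11]
low = [8, 9, 9]
close = [9, 12, 9]
print(clv(high, low, close))
print(atr_simple(high, low, close, n=3))
```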

For our research, five additional indicators that are used by investors were considered. (include footnote)
GPI    Gold Price Index
HPI    Housing Price Index
PCE    Personal Consumption Expenditure
COI    Crude Oil Index
EFR    Effective Federal Return

² Torgo, pp. 99-110.
³ Torgo, p. 99.
⁴ Torgo, pp. 130-132.
⁵ Torgo, L. Data Mining with R, 2011, CRC Press, p. 113.
⁶ Ulrich, J. TTR: Technical Trading Rules. R package version 0.20-1. 2009.
⁷ cran.r-project.org/web/packages/TTR/TTR.pdf

5. Feature Selection

Our research compared three feature selection techniques to select the variables used in the artificial neural
network: random forest, correlation, and information gain. Random forest was proposed by Torgo in his
financial prediction model as a method to rank features according to their importance and select the
highest-ranking variables. A random forest constructs a multitude of decision trees, each built from a random
sample of the data and features, and combines their predictions. Random forest also provides a ranking of features based
upon the percentage increase in the error of the forest if each feature is removed in turn.
Correlation measures the level of dependence between each pair of variables. In
this research, we measured the level of dependence among the features. For each group of
closely correlated features, only one feature from the group is used, in order to reduce noise and prevent
over-representation in the neural network.
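The correlation-based filter can be sketched as below. This is a hedged illustration in Python, not the procedure used in the study: it assumes Pearson correlation, a greedy pass that keeps the first feature of each correlated group, and a hypothetical cut-off of 0.9. It also assumes that no feature is constant, since a constant feature has no defined correlation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(features, threshold=0.9):
    """Keep a feature only if it is not closely correlated with one already kept."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy features: "b" is almost a multiple of "a", "c" is only weakly related.
features = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.5],
    "c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(drop_correlated(features))
```

Because the pass is greedy, the result depends on the order in which features are listed; ordering them by some measure of importance first is one possible design choice.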
Information gain ranks each variable by how much knowing its value reduces the entropy of the target, and
it requires discrete data. To apply the information gain ranking algorithm, we discretized the continuous data using partition
clustering and ran information gain on the newly created discrete data.
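A minimal Python sketch of the information gain calculation follows. Note that it swaps in equal-width binning for the partition clustering used in the study, purely to keep the example short; the number of bins and the toy data are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in the entropy of `labels` from knowing the discrete `feature`."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def equal_width_bins(values, bins=3):
    """Discretize continuous values into equal-width bins (stand-in for clustering)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid dividing by zero for constant input
    return [min(int((v - lo) / width), bins - 1) for v in values]


# Toy example: one informative and one uninformative feature.
labels = ["buy", "buy", "sell", "sell"]
informative = equal_width_bins([1.0, 2.0, 9.0, 10.0], bins=2)
uninformative = [0, 1, 0, 1]
print(information_gain(informative, labels), information_gain(uninformative, labels))
```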
In addition, we ran a test using all variables, with no feature selection technique.
6. Experiments

A. Data Pre-Processing

The analyzed data was a collection of the S&P 500's historical prices from 1970 to 2009. Torgo's website
provides the data in two formats. The data did not require additional treatment.
As described above, new indicators from Quandl.com were considered. Only the Gold Price Index was used in the final
model. The Housing Price Index was not used because of its lack of historical data from 1970 to 1974. PCE,
COI, and EFR were removed because they did not have daily values (only monthly, quarterly, or yearly
values).
For neural network input, we normalized all data before processing.
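The normalization step can be illustrated as follows. The sketch assumes z-score scaling with the mean and standard deviation estimated from the training period only and then reused for the testing period; the exact normalization applied in the study is not specified here, so this is one plausible choice among several.

```python
import statistics


def fit_scaler(train_values):
    """Estimate the mean and standard deviation from the training data only."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values)
    return mean, (std or 1.0)  # guard against a constant column


def apply_scaler(values, scaler):
    """Scale values using previously fitted training statistics."""
    mean, std = scaler
    return [(v - mean) / std for v in values]


# Toy values, purely illustrative.
train_values = [1.0, 2.0, 3.0]
test_values = [2.0, 4.0]
scaler = fit_scaler(train_values)
train_scaled = apply_scaler(train_values, scaler)
test_scaled = apply_scaler(test_values, scaler)
print(train_scaled, test_scaled)
```

Fitting the statistics on the training period alone avoids leaking information from the testing period into the inputs of the network.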
B. Training and Test Data and Selecting the Time Window

This research selected two time periods for training and testing data. The first set was composed of training
data from 1970-1999 and testing data from 2000-2009. The second set was composed of training data from
1980-1999 and testing data from 2000-2009. The two time periods were selected to compare and
contrast their performance. Namely, we sought to determine whether more data produces better results, or
whether more recent data produces better results.
C. Use of the Feature Selection Methods

Our research used four approaches to feature selection. The first approach was to select all of our
indicators as features in the final model. The second approach was borrowed from Torgo's book, in which
variables are selected according to their ranking by random forest. The third approach was the use of correlation to
reduce over-representation of highly correlated variables. The final approach was to use information gain
to rank variables and select the best performers.
We briefly considered a combination method that uses both random forest and correlation, but the results
were not promising.

D. Selecting the Architecture for the Neural Network

In building the architecture for a neural network, several parameters needed to be determined. We sought to
maximize the performance of the ANN by selecting optimal values for a variety of parameters: the number of
nodes in the hidden layer, the decay rate of the neural network, and the maximum number of iterations of the
neural network. In addition, to implement Torgo's model, we needed to select optimal values for the target
margin p, the number of days k of the tendency indicator, and the thresholds for translating the output into buy,
hold, or sell orders.
The precision/recall metric is designed to evaluate the performance of a model. Precision is the proportion
of event signals produced by the model that are correct. Recall is the proportion of events occurring that
were identified as such by the model. (see Torgo's example)
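The two definitions can be written out as a short Python sketch. It assumes that buy and sell signals are the events of interest and that hold is the non-event, and it includes the combined score P*R/(P+R) that is reported alongside precision and recall in the results tables; the toy signals are illustrative only.

```python
def precision_recall(predicted, actual, events=("buy", "sell")):
    """Precision and recall of event signals (hold is treated as the non-event)."""
    signalled = sum(1 for p in predicted if p in events)
    occurred = sum(1 for a in actual if a in events)
    correct = sum(1 for p, a in zip(predicted, actual) if p in events and p == a)
    precision = correct / signalled if signalled else 0.0
    recall = correct / occurred if occurred else 0.0
    return precision, recall


def combined_score(p, r):
    """Combined score P*R/(P+R), with P and R given as fractions."""
    return p * r / (p + r) if (p + r) else 0.0


# Toy signals, purely illustrative.
predicted = ["buy", "hold", "sell", "buy", "hold"]
actual = ["buy", "sell", "sell", "hold", "hold"]
p, r = precision_recall(predicted, actual)
print(p, r, combined_score(p, r))
```

With P = 0.503 and R = 0.47 the combined score evaluates to about 0.243, in line with the value of 0.2429 reported for Torgo's model in the results tables.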
In order to select the optimum value for each parameter, the neural network function was run on different
values of that parameter while keeping all other parameters constant. [MAKE IT A FOOTNOTE] A
detailed report can be found in the appendix. The precision and recall of the resulting ANN were
measured for each case. Finally, we plotted them against each other, looked at the trend, and selected the
parameter value with the best precision and recall.
Our assumption is that the parameters are independent of one another, so that the optimal
performance of the system is produced by combining the optimal values of all parameters. The
optimal values of all parameters define the architecture of the neural network used in the following
experiments.
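The one-parameter-at-a-time search described above can be sketched as follows. The scoring function, parameter names, and candidate values are hypothetical stand-ins: in the study the score came from the precision and recall of a trained network, whereas the toy function here simply has a known optimum.

```python
def one_at_a_time_search(score, defaults, candidates):
    """Vary each parameter in turn while all others stay at their default values."""
    best = dict(defaults)
    for name, values in candidates.items():
        trial_scores = {}
        for value in values:
            params = dict(defaults)  # every other parameter is held constant
            params[name] = value
            trial_scores[value] = score(params)
        best[name] = max(trial_scores, key=trial_scores.get)
    return best


# Hypothetical score with its optimum at size = 10 and decay = 0.01.
def toy_score(params):
    return -(params["size"] - 10) ** 2 - 100 * abs(params["decay"] - 0.01)


defaults = {"size": 5, "decay": 0.1}
candidates = {"size": [5, 10, 20], "decay": [0.001, 0.01, 0.1]}
best = one_at_a_time_search(toy_score, defaults, candidates)
print(best)
```

This procedure finds the joint optimum only when the parameters do not interact, which is exactly the independence assumption stated above; a full grid search would be needed to check that assumption.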
E. Experimental Runs

The experiments comprised a variety of tests designed to capture the effect of changing the training/testing period,
the method of feature selection, and the trading policy.
Each test was performed 10 times: the first 5 times with constant seed numbers (1100, 1200, 1300, 1400,
1500) and the last 5 times with randomly chosen seed numbers.
F. Metric for Comparing Results

Four metrics are used to compare results: the average return of the 5 tests, the count of positive and
negative returns in those tests, precision, and recall.
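The return-based metrics amount to a simple summary over the returns of repeated runs, sketched below in Python. The return values are made-up placeholders and not results from our experiments.

```python
def summarize_returns(returns):
    """Average return and counts of positive and negative returns over a set of runs."""
    return {
        "average": sum(returns) / len(returns),
        "positive": sum(1 for r in returns if r > 0),
        "negative": sum(1 for r in returns if r < 0),
    }


# Hypothetical percentage returns from five runs.
example_returns = [2.0, -1.0, 3.0, 0.5, -0.5]
summary = summarize_returns(example_returns)
print(summary)
```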
G. Experimental Results

The following 8 experiments were run:

1. Using Random Forest as feature selection for training/testing set 1.
2. Using Random Forest as feature selection for training/testing set 2.
3. Using Correlation Function as feature selection for training/testing set 1.
4. Using Correlation Function as feature selection for training/testing set 2.
5. Using Information Gain as feature selection for training/testing set 1.
6. Using Information Gain as feature selection for training/testing set 2.
7. Using all technical indicators for training/testing set 1.
8. Using all technical indicators for training/testing set 2.

Comparison between strategy 1 and 2

              Number of positive returns    Number of negative returns    Average return for one year
Strategy 1    77                            43                            1.572
Strategy 2    45                            75                            -1.565

*Note: There is no comparison of precision and recall metrics, because they are the same for the two
strategies.

Comparisons between different feature selection techniques for policy 1

Method                                  Number of   Number of   Average      Average     Average   Average P value
                                        positive    negative    return for   Precision   Recall    ( = P*R/(P+R) )
                                        returns     returns     one year
Torgo's model (A)                       13          -           1.582        50.3        47        0.2429
Random forest with new variables (B)    12          -           1.347        51.45       44.45     0.2384
All variables (without GP) (F)          11          -           1.192        51.45       47.2      0.2461
All variables (including new) (G)       16          -           2.342        51.75       47.35     0.2472
With correlation (H)                    11          -           1.535        50.55       45.7      0.24
With information gain                   14          -           1.434        50.15       43.7      0.2333


Comparison of time frame for policy 1

Time frame                                Number of   Number of   Average      Precision   Recall   P value
                                          positive    negative    return for                        ( = P*R/(P+R) )
                                          returns     returns     one year
Time frame 1 (Training time 1970-1999)    39          21          1.66         51.47       46.37    0.2438
Time frame 2 (Training time 1980-1999)    38          22          1.484        50.42       45.43    0.2388


7. Results

Comparison of parameters:
Assumption: These conclusions are drawn exclusively from the recorded data.

Comparison between strategies: Strategy 1 performs better than strategy 2 in all metrics. Strategy 1 appears
to be the more reliable option.
Comparison among different feature selection techniques:
The data indicate that including all potential variables in the model for policy 1 produces the best result in
all of the categories (an average return of 2.342% for a year, positive returns 80% of the time, and the highest
precision and recall). This also beats the risk-premium rate of 2.03%, lending some credibility to investing
money with this model. Among the feature selection techniques, different metrics indicate different
rankings: random forest ranks first in terms of average return, correlation ranks first in terms of average
precision, and information gain ranks first in terms of percentage of positive returns. However, in all metrics,
their performances do not deviate greatly from each other. We can conclude that their performances are
approximately similar across all metrics.
Comparison between two time frames:
Either result supports a rationale. If time frame 1 beats time frame 2, this may well
indicate that the amount of data matters more; otherwise, more recent data are better predictors
of the results. The data show that time frame 1 performs slightly better than time frame 2 in all metrics.
This suggests that the first conclusion is more likely, but the closeness of the results for the two
time frames also indicates that there is little difference in the resulting neural networks.
Overall conclusion:
Precision/recall metric: The precision and recall metrics are consistent but only average. The average
precision of the ANN is around 50%, and the average recall around 45%. While this is higher than a random choice of order
(probabilistically speaking, roughly 33% for each of the sell, hold, and buy orders), it is not high enough
for a model expected to beat the market.
Number of positive/negative returns: The result is promising, but not definitive. On the positive side, the
number of positive returns outweighs the number of negative ones in policy 1 (approximately 65% positive
and 35% negative). On the other hand, a 35% chance of a negative return is a substantial risk for an investment.
Average return: The average return for policy 1 is between approximately 1.5% and 1.6% for a year. The
average return is promising, but it is not a great performance compared to the annual risk-premium rate of
the USA (2.03% per year). (See "Investment choice in reality" and
http://people.stern.nyu.edu/adamodar/New_Home_Page/datafile/implpr.html)
Investment choice in reality: Should we put money into this prediction model? Assuming the opportunity
cost is zero, it makes sense to invest with an expectation of a 1.5% return over a year and a 35% risk of
a negative return. However, any investment carries an opportunity cost from not investing
elsewhere. In this case, if we instead choose to invest in a standard stock (the average of the
market), we can expect a risk-free return of 2.03 per cent a year. Thus, we are at a net loss of utility if
we put money into this prediction model.
If we choose the best of each of the parameters (using no feature selection, adding the new variables, and
using trading policy 1), however, we should expect to end up with a 2.342% additional return over a one-year
period. In addition, our chance of a positive return is 80% and of a negative return 20%. Comparing this
to the choice of a risk-free 2.03 per cent requires further knowledge of finance concerning the relationship between risk
and return; however, an investor who is less risk-averse is likely to be willing to bet on a
2.342% return over a one-year period with some degree of risk rather than a risk-free rate of 2.03%.
8. Future Research

This research proposes the use of an artificial neural network in stock market prediction. The result is not
perfect but promising; further research can be done in a few suggested areas. First, this research used only
a limited pool of variables; further research could consider a larger pool. Second, one problem with this
artificial neural network model is that it does not take into account the time-series nature of the data but
treats each observation in the same light. A more elaborate function that differentiates among observations
and gives them proportional levels of importance in the neural network may be worth exploring.
Third, learning techniques other than artificial neural networks (support vector machines, C4.5,
etc.) should be considered. Finally, a hybrid approach that incorporates the disciplines of
finance and economics could be considered for better results.
9. References
[LATER INSERT PAPERS]
