
Introduction

Each year, the field of computer science becomes more sophisticated as new types of technologies hit the market. Despite that, the problem of developing intelligent agents that precisely simulate human brain activity remains unsolved. One of the most prominent models of intelligent agents built in computer memory is the neural network (NN). Thus, in this report we will introduce the basics of NNs, along with a prediction pattern that can be successfully used in different types of "smart" applications. Specifically, a financial predictor based upon neural networks will be explored. It is surprising how well a correctly constructed artificial neural network (specifically, a feed-forward network) can predict values based on those specified at the input. This "forecasting" capability makes them a perfect tool for several types of applications:

Function interpolation and approximation

Prediction of trends in numerical data

Prediction of movements in financial markets

All of these examples are actually very similar, because in mathematical terms we are trying to define a prediction function F(X1, X2, ..., Xn) which, given the input data (the vector [X1, X2, ..., Xn]), is going to "guess" (interpolate) the output Y. The most exciting domain of prediction lies in the field of financial markets. An investment strategy based on computer intelligence sounds like a very promising and interesting field of study. Next, we will describe a relatively simple program that attempts to predict the S&P500, DOW, and NASDAQ Composite indexes, as well as the Prime Interest Rate, from input data that will be described shortly. It is a very basic program, intended to help understand how a neural network works.
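To make this concrete, such a predictor is simply a mapping from an input vector to one or more predicted outputs. A minimal C# sketch (the interface and its name are illustrative, not part of the application described later):

```csharp
// Hypothetical interface, for illustration only: a predictor maps an
// input vector [X1, X2, ..., Xn] to the interpolated output(s) Y.
public interface IPredictor
{
    double[] Predict(double[] input);
}
```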

Background
The data that will be fed to the neural network at the input represents historical data of the S&P500, DOW, NASDAQ Composite, and Prime Interest Rate. In general terms, these are leading indicators of stock market activity, which share a common fluctuation pattern.

S&P500
The S&P500 is a free-float capitalization-weighted index, published since 1957, of the prices of 500 large-cap common stocks actively traded in the United States. The stocks included in the S&P500 are those of large publicly held companies that trade on either of the two largest American stock exchanges: NYSE Euronext and NASDAQ OMX. The S&P500 is one of the most widely followed indexes of large-cap American stocks. It is considered a bellwether for the American economy and is included in the Index of Leading Indicators. S&P500 index fluctuations depend on a great many factors, so the full prediction problem is very complex. In this application, the input data is represented only by historical values of four important economic indicators. It is essential to mention that if we want a better predictor, we should feed the neural network with more indicators that are relevant to the overall interpolation.

As we can see in Figure 1, the value of the S&P500 has generally increased over time, with a significant decrease in the years 2000-2005.

Dow Jones Industrial Average


The Dow Jones Industrial Average (DJIA), also referred to as the Industrial Average, the Dow Jones, the Dow 30, or simply the Dow, is a stock market index, and one of several indices created by Wall Street Journal editor and Dow Jones & Company co-founder Charles Dow. It shows how 30 large, publicly owned companies based in the United States have traded during a standard trading session in the stock market. Along with the NASDAQ Composite, the S&P500 Index, and the Russell 2000 Index, the Dow is among the most closely watched benchmark indices tracking stock market activity. To calculate the DJIA, the sum of the prices of all 30 stocks is divided by a divisor, the Dow Divisor. The divisor is adjusted in case of stock splits, spinoffs, or similar structural changes, to ensure that such events do not in themselves alter the numerical value of the DJIA.
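The calculation itself is a one-liner. A minimal sketch (the divisor is passed in as a parameter because its real value changes over time):

```csharp
using System.Linq;

static class DjiaSketch
{
    // Price-weighted index: sum the 30 component prices, divide by the Dow Divisor.
    // The divisor is adjusted on splits and spinoffs so the index value is unaffected.
    static double Djia(double[] componentPrices, double dowDivisor)
        => componentPrices.Sum() / dowDivisor;
}
```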

NASDAQ Composite
The NASDAQ Composite is a stock market index of the common stocks and similar securities listed on the NASDAQ stock market, meaning that it has over 3,000 components. It is widely followed in the U.S. as an indicator of the performance of technology and growth companies. Since both U.S. and non-U.S. companies are listed on the NASDAQ stock market, the index is not exclusively a U.S. index.

Prime Interest Rate


The prime rate, or Prime Lending Rate, is a term applied in many countries to a reference interest rate used by banks. The term originally indicated the rate of interest at which banks lent to favored customers, i.e., those with good credit, though this is no longer always the case. Some variable interest rates may be expressed as a percentage above or below the prime rate. Generally, the prime interest rate is a significant determinant in financial markets. This is because monetary policy is aimed at influencing domestic interest rates, which drive currency rates relative to other currencies with different interest rates. Domestic interest rates also influence overall economic activity: lower interest rates typically stimulate borrowing, investment, and consumption, while higher interest rates tend to reduce borrowing and favor saving over consumption. Below, the Federal Funds Rate history graph is shown; this data will be used in the current application.

Artificial Neural Network


Neural networks have been used with computers since the 1950s. Through the years, many different models have been presented. The perceptron is one of the earliest neural networks; it was an attempt to understand human memory, learning, and cognitive processes. To construct a computer capable of "human-like thought", researchers used the only working model available to them: the human brain. However, the human brain as a whole is far too complex to model; rather, the individual cells that make up the human brain are studied. The following section introduces the schema of the most widely used artificial neural network.

Multilayer Perceptron
For the task of predicting the indexes, we will be using the so-called multilayer feed-forward network, which is a good fit for this type of application. In a feed-forward neural network, neurons are only connected forward: each layer of the neural network contains connections to the next layer, but there are no connections back. Typically, the network consists of a set of sensory units (source nodes) that constitute the input layer, one or more hidden layers of computation nodes, and an output layer of computation nodes. In common use, most neural networks have one hidden layer, and it is very rare for a neural network to have more than two. The input signal propagates through the network in a forward direction, on a layer-by-layer basis. These neural networks are commonly referred to as multilayer perceptrons (MLPs). Shown below is a simple MLP with 4 inputs, 1 output, and 1 hidden layer.
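Such a network can be assembled with the Encog framework, which this application uses. Below is a minimal sketch of the 4-input, 1-output topology just described; the hidden layer size of 6 is an arbitrary choice for illustration, and exact namespaces may vary between Encog versions:

```csharp
using Encog.Engine.Network.Activation;
using Encog.Neural.Networks;
using Encog.Neural.Networks.Layers;

// Feed-forward MLP: 4 inputs -> 1 hidden layer -> 1 output.
var network = new BasicNetwork();
network.AddLayer(new BasicLayer(null, true, 4));                     // input layer (no activation), with bias
network.AddLayer(new BasicLayer(new ActivationSigmoid(), true, 6));  // hidden layer; 6 is an arbitrary size
network.AddLayer(new BasicLayer(new ActivationSigmoid(), false, 1)); // output layer
network.Structure.FinalizeStructure();
network.Reset(); // initialize the weights with random values
```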

The input layer is the conduit through which the external environment presents a pattern to the neural network. Once a pattern is presented to the input layer, the output layer will produce another pattern. In essence, this is all the neural network does: it matches the input pattern to the one that best fits the training output. It is important to remember that the inputs to the neural network are floating-point numbers, represented as the C# double type (most of the time we will be limited to this type). The output layer of the neural network is what actually presents a pattern to the external environment (the result of the computation). The number of output neurons should be directly related to the type of work that the neural network is to perform. There are really two decisions that must be made regarding the hidden layers: how many hidden layers to have in the network, and how many neurons to place in each of them. Problems that require two hidden layers are rarely encountered, and there is currently no theoretical reason to use neural networks with more than two hidden layers; almost all current problems solved by neural networks are fine with just one hidden layer. Even though the hidden layers do not directly interact with the external environment, they have a tremendous influence on the final output, so we should carefully choose the number of neurons within them. Using too few neurons in the hidden layers results in so-called "under-fitting", which occurs when the hidden layers are not able to adequately detect the signals in a complicated data set. The opposite problem, "over-fitting", can occur when the neural network has so much information-processing capacity that the limited amount of information contained in the training set is not enough to train all of the neurons in the hidden layers. There are many rule-of-thumb methods for determining the correct number of neurons to use in the hidden layers; here are just a few of them (a small helper sketch follows the list):

The number of hidden neurons should be between the size of the input layer and the size of the output layer.

The number of hidden neurons should be 2/3 the size of the input layer plus the size of the output layer.

The number of hidden neurons should be less than twice the size of the input layer.
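These heuristics are easy to encode. A minimal sketch (the function names are mine, and the rules are only starting points, not hard constraints):

```csharp
using System;

static class HiddenLayerHeuristics
{
    // Rule 1: between the output layer size and the input layer size.
    static bool WithinRule1(int hidden, int inputs, int outputs)
        => hidden >= Math.Min(inputs, outputs) && hidden <= Math.Max(inputs, outputs);

    // Rule 2: two-thirds of the input layer size plus the output layer size.
    static int Rule2(int inputs, int outputs) => (2 * inputs) / 3 + outputs;

    // Rule 3: strictly less than twice the input layer size.
    static int Rule3UpperBound(int inputs) => 2 * inputs - 1;

    static void Main()
    {
        // For the 40-input, 4-output network used later in this article:
        Console.WriteLine(Rule2(40, 4));           // 30
        Console.WriteLine(Rule3UpperBound(40));    // 79
        Console.WriteLine(WithinRule1(41, 40, 4)); // False: 41 slightly exceeds rule 1
    }
}
```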

Multilayer perceptrons have been applied successfully to some difficult and diverse problems by training them in a supervised manner with a highly popular algorithm known as the error back propagation algorithm (described below). Please note that in our application we will be using the Resilient propagation algorithm, which is very similar to back propagation. The neural network itself will be composed of neurons (the main information-processing units, analogous to neurons in the human brain) of the same kind, placed within different layers. They all exhibit the same characteristics; hence, if we understand how one neuron is designed, we will have no problems understanding how the entire network works. Generally, the model of a neuron can be summarized in the following block diagram:

One can see that there are three basic elements in this neuronal model:

1. A set of synapses, or connecting links, each characterized by a weight or strength of its own: inputs X1, X2, ..., Xm with corresponding weights Wk1, Wk2, ..., Wkm (plus a bias term bk). As we will see further, the weights represent the "knowledge" that the neural network contains about specific training data; their values directly affect the output of the neural network.

2. An adder for summing the input signals, weighted by the respective synapses of the neuron: Vk = Σj(Wkj·Xj) + bk, where k = 1..r (r = number of neurons) and j = 1..m (m = number of input synapses). Simply speaking, each input signal Xj is multiplied by the weight Wkj and summed in the adder with all the other terms; the result of this summation, Vk, goes to the input of the activation function.

3. An activation function for limiting the output of a neuron: Yk = φ(Vk). The activation function plays an important role in the schema of a neuron: it generates the output according to the summed input signal calculated in the adder.

Summarized, the output signal of each neuron can be defined as: Yk = φ(Σj(Wkj·Xj) + bk). It is important to emphasize that if we want to use the Back Propagation learning algorithm for training, we should take care that our activation function is differentiable. This requirement comes from the fact that the method computes the gradient of the error function at each iteration step, so we must guarantee the continuity and differentiability of the error function. A commonly used non-linearity that satisfies this requirement is the sigmoid non-linearity defined by the logistic function φ(v) = 1/(1 + exp(-a·v)), where a is the slope parameter of the sigmoid function. By varying the parameter a, we obtain sigmoid functions of different slopes, as illustrated in the following figure (three different values of a):
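To make the neuron model concrete, here is a minimal sketch of a single neuron's forward computation (the variable names are mine; the slope parameter a defaults to 1, which is the form most libraries, including Encog, use):

```csharp
using System;

static class NeuronExample
{
    // Logistic sigmoid with slope parameter a.
    static double Sigmoid(double v, double a = 1.0)
        => 1.0 / (1.0 + Math.Exp(-a * v));

    // Yk = phi( sum_j(Wkj * Xj) + bk )
    static double NeuronOutput(double[] x, double[] w, double bias)
    {
        double v = bias; // start with the bias term
        for (int j = 0; j < x.Length; j++)
            v += w[j] * x[j]; // weighted sum in the adder
        return Sigmoid(v);    // limit the output via the activation function
    }

    static void Main()
    {
        double[] x = { 0.5, -1.0, 0.25 };
        double[] w = { 0.8, 0.2, -0.5 };
        Console.WriteLine(NeuronOutput(x, w, 0.1)); // ~0.54
    }
}
```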

Supervised Training
Training is the means by which the weights and threshold values of a neural network are adjusted to give desirable outputs, making the network tune its response to the values that best fit the training data. Propagation training is a form of supervised training, where the expected output is given to the training algorithm. Propagation training can be a very effective form of training for feed-forward, simple recurrent, and other types of neural networks. There are several forms of propagation training; we will analyze two of them.

Back Propagation Algorithm


The Back Propagation algorithm is by far one of the most commonly used learning algorithms. It is a supervised learning method and a generalization of the delta rule. It requires a teacher that knows, or can calculate, the desired output for any input in the training set.

Generally, it can be summarized in the following main steps:


1. Present a training sample to the neural network.

2. Compare the network's output to the desired output for that sample.

3. Calculate the error in each output neuron.

4. For each neuron, calculate what the output should have been, and a scaling factor indicating how much lower or higher the output must be adjusted to match the desired output. This is the local error (see the sketch after this list).

5. Adjust the weights of each neuron to lower the local error, and assign blame for the local error to neurons at the previous level, giving greater responsibility to neurons connected by stronger weights.

6. Repeat from step 3 on the neurons at the previous level, using each one's blame as its error.
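For an output neuron with the sigmoid activation described earlier and a squared-error measure, steps 3-5 take a simple closed form known as the delta rule. A minimal sketch (this is the textbook formula, not code taken from the application; hidden-layer neurons additionally receive the "blame" term from step 5):

```csharp
static class DeltaRuleSketch
{
    // Steps 3-5 for one output neuron with sigmoid activation:
    //   localError = (target - output) * output * (1 - output)  // sigmoid'(v) = y * (1 - y)
    //   w_j += learningRate * localError * x_j
    static void UpdateOutputWeights(double[] x, double[] w, ref double bias,
                                    double output, double target, double learningRate)
    {
        double localError = (target - output) * output * (1.0 - output);
        for (int j = 0; j < w.Length; j++)
            w[j] += learningRate * localError * x[j];
        bias += learningRate * localError; // bias acts as a weight whose input is always 1
    }
}
```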

In the figure below, one can visualize the process by which the neural network is trained to work as an XOR logical gate.

Generally, the XOR problem is considered the "Hello World" application in this field of science. The purpose is very straightforward: we will make our neural network "smart enough" to solve the XOR problem.


Truth table:

X1  X2  Result
0   0   0
0   1   1
1   0   1
1   1   0

The structure of the neural network is very simple: the input layer consists of 2 elements (an XOR gate needs 2 Boolean values as input parameters, so the input is of size 2). The hidden layer contains 3 neurons, and finally, the output layer has one neuron, which represents the result of the XOR operation. At its initial stage (Iteration 0), the weights between the neurons are assigned random values, so the network does not yet contain any valuable information. Once we start applying the Back Propagation algorithm (Iterations 1-59), the weights between the neurons are adjusted in a manner that decreases the error rate and generates the output we expect. By Iteration 59 we achieve an acceptable error rate; the training process ends, and we can proudly say that the network contains enough "knowledge" to solve the XOR problem. By visualizing the way the values change, we can observe that in the initial iterations they fluctuate dramatically at each step (mathematically speaking, the algorithm tries to find the steepest descent of the error function). Once the error value starts decreasing significantly (Iterations 30-59), the weights of the neural network are adjusted in a more granular fashion. The network was trained with the 4 input combinations of the XOR gate. Because of the 2D limitation, the figure itself shows only 1 training sample (True - True, encoded as 1 - 1), which ultimately should generate False at the output (encoded as 0). Readers interested in more details about this algorithm can consult any of the widely available materials on it; I will not discuss the mathematics behind the Back Propagation algorithm, because we will use a framework that already has this algorithm implemented (the Encog framework).
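For completeness, here is what training the XOR network looks like with Encog. This is a minimal sketch modeled on Encog's standard XOR example, with the 2-3-1 topology described above; constructor overloads and namespaces may differ slightly between Encog versions:

```csharp
using System;
using Encog.Engine.Network.Activation;
using Encog.ML.Data;
using Encog.ML.Data.Basic;
using Encog.Neural.Networks;
using Encog.Neural.Networks.Layers;
using Encog.Neural.Networks.Training.Propagation.Back;

class XorExample
{
    static void Main()
    {
        double[][] input = { new[] { 0.0, 0.0 }, new[] { 0.0, 1.0 },
                             new[] { 1.0, 0.0 }, new[] { 1.0, 1.0 } };
        double[][] ideal = { new[] { 0.0 }, new[] { 1.0 }, new[] { 1.0 }, new[] { 0.0 } };

        // 2 inputs -> 3 hidden neurons -> 1 output, as in the figure.
        var network = new BasicNetwork();
        network.AddLayer(new BasicLayer(null, true, 2));
        network.AddLayer(new BasicLayer(new ActivationSigmoid(), true, 3));
        network.AddLayer(new BasicLayer(new ActivationSigmoid(), false, 1));
        network.Structure.FinalizeStructure();
        network.Reset(); // random initial weights ("Iteration 0")

        IMLDataSet trainingSet = new BasicMLDataSet(input, ideal);
        var train = new Backpropagation(network, trainingSet);

        int epoch = 0;
        do
        {
            train.Iteration(); // one pass of weight adjustment
            epoch++;
            Console.WriteLine($"Iteration {epoch}, error {train.Error}");
        } while (train.Error > 0.01);
    }
}
```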


Output
The 4 outputs correspond to the 4 indexes at the input (S&P500, DOW, NASDAQ Composite, and Prime Interest Rate). The neural network's job is to find hidden patterns in the input data that influence the overall output. After training the network using a 40-41-41-4 topology (40 input units, two hidden layers with 41 units each, and 4 output units) and trying to predict the values, the following results have been obtained:

As one can see, the network is able to interpolate the results fairly well. The error rate summed over the entire training session decreased to a value of ~0.008. Of course, we cannot consider this data as an input to a real investment strategy, since past information does not really indicate future returns (a more granular approach would be needed, as the fluctuations depend on many other strategic factors), but for academic purposes we can consider this a good result.
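For reference, under the same assumptions as the XOR sketch above, the network used in this experiment would be constructed and trained along these lines (the data-loading step is omitted; how the historical values are windowed into 40 inputs is not described in this article's text):

```csharp
using Encog.Engine.Network.Activation;
using Encog.ML.Data;
using Encog.Neural.Networks;
using Encog.Neural.Networks.Layers;
using Encog.Neural.Networks.Training.Propagation.Resilient;

class PredictorSketch
{
    static void Train(IMLDataSet trainingSet) // historical indicator data (assumed prepared elsewhere)
    {
        // 40-41-41-4 topology: 40 inputs, two hidden layers of 41 neurons, 4 outputs.
        var network = new BasicNetwork();
        network.AddLayer(new BasicLayer(null, true, 40));
        network.AddLayer(new BasicLayer(new ActivationSigmoid(), true, 41));
        network.AddLayer(new BasicLayer(new ActivationSigmoid(), true, 41));
        network.AddLayer(new BasicLayer(new ActivationSigmoid(), false, 4));
        network.Structure.FinalizeStructure();
        network.Reset();

        // Resilient propagation adapts each weight's step size individually,
        // which typically converges faster than plain back propagation.
        var train = new ResilientPropagation(network, trainingSet);
        int epoch = 0;
        do { train.Iteration(); epoch++; } while (train.Error > 0.008 && epoch < 5000);
    }
}
```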


Conclusion
In this article, the topic of neural networks and their prediction capabilities has been analyzed. Feed-forward neural networks proved to be a reliable solution for applications that need to predict something. Generally speaking, function interpolation is one of the major fields of study in the stock market environment. A strategy based upon technical indicators can really help in achieving good trading results. Of course, the application presented in this article cannot be used in a real-world environment, because normally we would need not only near-precise predictions, but also a program that performs the market analysis in short bursts (every 15-30 seconds), as opposed to the values predicted in this application (closing stock values). In order to achieve better results, we would rather combine a classical trading strategy with one based upon real-time technical indicators. For study purposes, however, the main objective has been achieved. It is important to mention that the Encog framework for neural networks was used while developing the application. In my opinion, it is the best choice one can make when choosing an API for NN.

