
Using Artificial Neural Networks & Twitter Sentiment Analysis to Predict Stock Movement


Harshil Mattoo (hm372), Rohan Patel (rp432)

I. INTRODUCTION
The United States stock market is nearly $20
trillion in total capitalization, and predicting its
next move is the crown jewel of Wall Street.
Various trading strategies have been developed
using quantitative algorithms to execute more profitable trades, and enormous resources are poured
into gaining even the slightest bit of competitive
information. The prevalence of insider trading
trials makes clear the lengths to which some investors
will go to gain an edge. There are many financial
features that could potentially correlate with the
US marketplace, such as stock market momentum,
commodity prices, foreign exchange rates, foreign
stock exchanges, and general public opinion. Using
these features, one can develop trading strategies
tailored towards day trading (intraday, high frequency), week trading, or position trading (long
term holding).
Given the vast variety of factors that influence
the stock market, powerful analytical tools are necessary to correctly decipher and predict the erratic
movement of stock prices. The methodology and
algorithms employed in the field of machine learning,
an area recently recognized as holding enormous
opportunity in this domain, have the potential to
drastically improve upon the more traditional and
commonly used methods of stock market prediction used in the market today.
This paper aims to explore and evaluate the effectiveness of several different methods in predicting the movement of stock prices.
The initial methods used to approach this problem apply the principles of momentum and correlation trading to construct basic
training data from historical prices. This
data is then naively fed into several popular machine
learning algorithms to determine their effectiveness
in predicting future data.


The second approach, on the other hand, extensively and carefully delves into the ability of artificial neural networks (ANNs) to predict stock market
movement when trained on historical stock data. ANNs have
the remarkable ability to serve as general function
approximators (including for non-linear functions)
when they possess at least one hidden layer. Thus, when the task of stock market prediction
is interpreted as approximating the function that
takes as input all economic and social determinants
and outputs the next day's change in price,
ANNs stand out as highly effective models for
this task due to their ability to accommodate
an arbitrary number of inputs and outputs.
Recently, the popular microblogging service
Twitter has evolved into a premier platform for
receiving the latest news and updates in
sports, finance, and other popular media. Our final
method sets out to test whether this is also true
for news that influences the stock market. Wall
Street is famous for immediately pricing in any
additional information; however, Twitter provides
a large base of subtle popular sentiment that may
be difficult to act on. Using a number of sentiment
analysis algorithms to analyze tweets referencing
specific stock tickers, we classified tweets into
two categories, positive or negative. From these
classifications, we attempted to correlate tweet sentiment with
actual movements in stock price.
We were able to attain around 70% accuracy for
determining the sentiment of tweets using our best
machine learning algorithms. Using the sentiment
of tweets, we were able to correctly predict nearly
75% of the movement of stock prices for the
S&P500 in September 2014.

II. PROBLEM DEFINITION AND DATA COLLECTION
A. Task Definition
The main questions that we aimed to answer in
this project are the following:
• What features correlate well with the stock market?
• How effective are several variant machine learning algorithms in learning to predict stock price movement?
• Can machine learning algorithms understand the sentiment of tweets, and if so, is Twitter sentiment a good feature for predicting the stock market?
Stocks are generally considered to be priced in based on various public information. In other
words, market prices are generally a reflection
of the overall opinion and projection of a company's health and value. While stock prices
are ultimately governed by the laws of supply
and demand, we can quantify these values by
calculating underlying factors. From past market
performance to the latest social trends, a multitude
of features can be used to predict and project future
stock movement. Within the scope of this project,
momentum trading, correlations trading, and trading based on public sentiment are explored. For
the former two approaches, various machine learning
algorithms were used. As for our sentiment analysis, we chose
Twitter as our means of extracting the general
perception of various publicly traded companies.

B. Collecting and Cleaning Stock Data

In order to collect stock data, we used Google Finance. More specifically, we used a function in Google Sheets,

=GOOGLEFINANCE(ticker, [attribute], [start date], [end date])

that creates an array of data containing the day's volume, open, high, low, or closing price for the specified stock ticker over a range of dates. We applied this function over the list of companies in the index, which we obtained from Merrill Lynch's online retail brokerage service. The result was a raw spreadsheet including each of these variables for each of the components of the Standard and Poor's 500, from September 2nd to October 2nd, 2014. Regrettably, this approach does not allow for reading multiple stock tickers at once, meaning that the raw spreadsheet included the date and variables interleaved, and sorted by ticker. To deal with this, we split the data by variable into five separate spreadsheets, a format more readily applicable for our purposes (a sketch of this step appears at the end of this subsection).

Data from Google came with a few errors, ranging from every value for some companies missing to single days missing. We corrected these instances by manually consulting Google Finance. In a couple of cases, not only did the spreadsheet fail to return data on a company, but Google itself lacked complete data on the company; in these cases, we turned to Yahoo for the numbers. In one case, Google interpreted PNR, the ticker of Pentair plc. on the New York Stock Exchange, as Pacific Niugini Limited, a thinly traded penny stock on the Australian Securities Exchange. Were our goal to identify likely cases of securities fraud, this might have been a useful and interesting set of observations. Since it is not, we dropped Pacific Niugini and replaced it with the slightly more reputable Pentair, whose data we again retrieved manually from Google Finance.

Finally, we ensured that the data were consistent by comparing opens, lows, highs, and closes for each day, and found no instances where the open or close fell above the high or below the low for the day. This provided an additional, and very welcome, degree of confidence in the data set, given the amount of cleaning required.
C. Collecting Twitter Data

After experimenting with Twitter's Search API, we quickly realized its limitations on the size of data collection. Therefore, we searched for and found a complete monthly dataset of the Twitter stream. We chose September 2014 as the month of analysis. This decision was made considering that:
• September 2014 was one of the most recent months available, keeping the analysis as current as possible;
• September is one of the months least affected by seasonal variations in the overall stock market;
• September 2014 was an average month in terms of financial news, allowing for generalizations from the results.


After obtaining the dataset, identifying the search query format that would give the best results took some trial and error. The result was to search for "$ticker" and "ticker " (with a trailing space), along with "DJIA" and "S&P500". We then wrote a Python script to break the 45 GB dataset into smaller pieces and run the above search query on each piece, generating the following columns:
• Ticker
• Date and time of the tweet
• Text of the tweet
• Reference to the tweet in the original dataset
In the final dataset, we had 4,784 tweets for the S&P 500 and 934 tweets for the DJIA.
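A minimal sketch of the filtering script follows, assuming the monthly archive has already been split into pieces of newline-delimited JSON (one tweet per line); the file paths and query terms here are illustrative, not the exact ones we used.

```python
import glob
import json

QUERIES = ["$SPY", "SPY ", "DJIA", "S&P500"]  # illustrative search terms

with open("matches.tsv", "w") as out:
    for path in glob.glob("pieces/*.json"):
        with open(path) as f:
            for line_no, line in enumerate(f):
                try:
                    tweet = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip malformed lines
                text = tweet.get("text", "")  # deletion notices have no text
                for q in QUERIES:
                    if q in text:
                        out.write("\t".join([
                            q,                            # matched ticker/term
                            tweet.get("created_at", ""),  # date and time
                            text.replace("\t", " ").replace("\n", " "),
                            f"{path}:{line_no}",          # reference to source
                        ]) + "\n")
                        break
```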

III. METHODS

A. Momentum Trading

To begin with, we trained algorithms online using the past behavior of stocks as feature vectors. For example, if a stock's price rose on a given day, the corresponding feature would be 1; if the price fell, the feature would be 0. The output variable was whether the following day's price went up or down.

Using this feature representation, we implemented and trained Nearest Neighbor, Decision Tree, Naive Bayes, and Random Forest classifiers using standard libraries provided by scikit-learn, predicting each stock's behavior for the next day from the last 10 days of history. We ran the algorithms on US stock prices going back to January 2011. Unfortunately, accuracy was typically stuck at around 49% to 52% across a wide variety of parameters.

At best, the Multinomial Naive Bayes algorithm achieved 54% accuracy on the Dow Jones Industrial Average. This performance is hardly better than random prediction (and is even worse in certain cases), so clearly stronger features are necessary for these algorithms to perform well.
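The following is a minimal sketch of this setup for a single price series. It uses a placeholder random-walk series and a simple batch fit rather than the online updates described above; only the 10-day binary history and the scikit-learn classifier mirror our actual procedure.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

def up_down(closes):
    """1 if the price rose that day, 0 otherwise."""
    return (np.diff(closes) > 0).astype(int)

def make_dataset(moves, k=10):
    """Feature vectors: the last k up/down moves; label: the next move."""
    X = np.array([moves[i:i + k] for i in range(len(moves) - k)])
    y = moves[k:]
    return X, y

closes = 100 + np.random.randn(1000).cumsum()  # placeholder price series
X, y = make_dataset(up_down(closes), k=10)

split = int(0.8 * len(X))
clf = MultinomialNB().fit(X[:split], y[:split])
print("accuracy:", clf.score(X[split:], y[split:]))
```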

B. Correlations Trading

In [4], the researchers use a prediction algorithm that exploits the temporal correlations between global stock markets and other financial data to predict the behavior of US stocks on the following day. Because no financial market in the world is isolated, all of this data should provide great insight into the US stock market. In particular, the behavior of overseas markets prior to the US market open may be a strong indicator of the US market's trend for the day.

We ran the same algorithms as in momentum trading, but this time we also incorporated into our feature vectors the silver price, the gold price, and whether foreign markets (Nikkei, DAX, Shanghai Composite) generally increased or decreased. Using this expanded feature representation, the Naive Bayes classifier gave the best accuracy, around 56%. This is still not where we wanted our algorithms to be, so we looked towards more powerful methods such as ANNs.

C. Artificial Neural Networks

Artificial neural networks attempt to mimic the neuronal connections that exist in the brain, along with their adaptive behavior in response to novel stimuli. This adaptive behavior is believed to contribute in part to the ability of humans and other species to learn. ANNs apply this same principle of adaptive connections in order to learn from training data and predict new data. An ANN functions by taking a set of real-valued inputs into its input neurons and in turn feeding these inputs, weighted by a set of connections, into a hidden layer, and so on, until the outputs are determined in the final layer. The net inputs from one layer to the next are transformed by an activation function, similar to how a biological neuron fires only if a certain threshold of stimulus is reached. An artificial neural network learns by adjusting the weights of its connections in cases of prediction error using an optimization algorithm, specifically stochastic gradient descent in the case of the most commonly used back-propagation learning algorithm.

As in our previous approaches, we attempted to predict whether the stock market would rise or fall on each day of September 2014. Instead of training on three years of data, however, only the months of 2014 leading up to September were used as the training set. In this scenario, we trained purely on the aggregate S&P 500 data instead of looking at each component company within the index. To retrieve this data, the Quandl API was used to extract data in TSV format from Yahoo Finance.


The TSV data contained, for every market day in the year, the opening price, the high price, the low price, and the closing price of the S&P 500.
The open-source neural network library Neuroph for the Java programming language was
used to construct the neural networks themselves.
Though the mechanics of the neural network and
the corresponding learning algorithms were outsourced, the tasks of data-preprocessing, feature
vector construction, hyper-parameter optimization,
and fitting this entire pipeline together had to be
done without use of external libraries in Java.
After the data was extracted from the TSV file, the values of each feature across the rows of the data (date, opening price, high, etc.) were normalized to zero mean and unit variance. This was accomplished using the standard z-score transformation,

x' = (x - μ) / σ,

where μ is the feature's mean and σ its standard deviation over the training data.
This normalization step increases both the efficiency and effectiveness of the neural network
learning algorithms. Another component of the
data pre-processing step involved identifying and
filling in missing portions of the data. This was
again accomplished using the Quandl API.
After the data pre-processing step was completed, the training feature vectors had to be constructed. This was achieved by looking at a moving frame of length k over the data. A feature vector would thus consist of the stock price data of the past k days, and the output variable would be the (k+1)th day's stock price movement (increase or decrease). Unlike in our previous algorithms, where the only data considered from previous days was whether the price went up or down, this representation incorporated, for each of the past k days, the opening price, high price, low price, and closing price of that day in order to increase precision. Other feature vector representations were also tested and are elaborated in further detail below.
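A sketch of the two representations is given below, assuming rows is a chronological array of (open, high, low, close) tuples that have already been normalized; defining the movement label as the change in closing price is one plausible reading of the procedure, not a confirmed detail.

```python
import numpy as np

def windowed_features(rows, k):
    """Representation 1: concatenate all 4 features of each of the past k days."""
    rows = np.asarray(rows)
    X = np.array([rows[i:i + k].ravel() for i in range(len(rows) - k)])
    y = (rows[k:, 3] > rows[k - 1:-1, 3]).astype(int)  # next day's movement
    return X, y

def averaged_features(rows, k):
    """Representation 2: average each feature over the past k days (4 inputs)."""
    rows = np.asarray(rows)
    X = np.array([rows[i:i + k].mean(axis=0) for i in range(len(rows) - k)])
    y = (rows[k:, 3] > rows[k - 1:-1, 3]).astype(int)
    return X, y
```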
To achieve good accuracy, many parameters of the neural network itself, the choice of features, and the feature vector normalization and representation all had to be considered. The numerous factors involved in the hyper-parameter optimization, feature selection, and feature engineering process are discussed below.

Fig. 1. The set of features used versus the accuracy gained using these features. The numbers correspond to the following features. 0: Opening Price; 1: High Price; 2: Low Price; 3: Closing Price.


1) Features to Use: The features which we incorporate into a learning algorithm can drastically affect how accurate our neural network classifier will be, though this principle applies generally to all the machine learning classifiers we utilize in this paper. Having primarily irrelevant features relative to the output variable can lead a classifier to weight a particular attribute heavily even though, in reality, there may be no real relationship; moreover, with large training data, little to no information gain usually comes from analyzing such a feature. If a classifier does indeed heavily weight a non-relevant feature, it is more likely to incorrectly predict novel data in the future, both because it will not give due weight to more relevant aspects of the data and because it may have been overfitted to the training data. Feature selection is therefore an essential part of any machine learning analysis. It is particularly important for this analysis, as the task of predicting the stock market may involve hundreds of variables, some or even most of them largely inconsequential. Even among the four features we consider for the neural network, including some features may not increase classification efficacy, may be redundant, or may even be detrimental. To determine which features contributed to accuracy and which detracted from it, we performed a grid search over the potential features of opening price, high price, low price, and closing price. The results of this grid search on a sample of the feature combinations are shown in figure 1.

The results demonstrate that using all the features performed very effectively (around 64% accuracy, significantly better than random guessing given the test set size of 45 predicted days), though ignoring the opening price and focusing on the high, low, and closing prices performed similarly well. Ultimately, our final classifier leverages all the available features, as that configuration performed the best.
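The subset search itself is straightforward; in the sketch below, train_and_score is a stub standing in for training the network on the chosen feature columns and returning its test accuracy.

```python
import random
from itertools import combinations

FEATURES = {0: "open", 1: "high", 2: "low", 3: "close"}

def train_and_score(columns):
    """Stub: train on the given feature columns and return test accuracy."""
    return random.random()  # replace with real training and evaluation

best_subset, best_acc = None, 0.0
for r in range(1, len(FEATURES) + 1):
    for subset in combinations(FEATURES, r):
        acc = train_and_score(list(subset))
        if acc > best_acc:
            best_subset, best_acc = subset, acc

print("best subset:", [FEATURES[i] for i in best_subset], "accuracy:", best_acc)
```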
2) Feature Engineering: Two feature vector representations were tested for their efficacy. In the first, if we are looking at the past k days, the input feature vector has k times (number of features considered) components. For example, the first component corresponds to the kth previous day's opening price, the second to that day's high price, and so on, until the opening, high, low, and closing prices of each of the k previous days have been added to the feature vector. The second representation involves taking the average of each feature's values over the past k days instead of inputting every single one of them into the neural network. After extensive testing of both representations with different parameters, the neural networks trained on the second representation were found to consistently outperform those trained on the first.
3) Number of Previous Days to Train On (Granularity): Determining how many days k into the past we should consider is a challenging question on principles of finance alone. Some analysts may argue that, because of the efficient market hypothesis, there is no relation between a previous day's behavior and the next's, and consequently no way to predict future trends. Other analysts may argue that the efficient market hypothesis does not hold in the existing financial environment and that past trend data can indeed reveal a lot about future trends. Instead of debating the current economic status of the world, we can determine how effective looking at the past k days is simply by examining the prediction results as k, the granularity, varies.
Fig. 2. Prediction accuracy as a function of the number of previous days (granularity) used to predict the next day's movement.

The results clearly show that looking at past data can indeed aid in predicting the next day's movement. If this were not the case, our accuracies should have been around 50%, equivalent to random guessing. Though varying k led to relatively stable results at first, the optimal granularity peaked at k = 19; making the granularity coarser (increasing k) beyond this point significantly decreased prediction accuracy. Thus, our final classifier uses a granularity of k = 19.
4) Number of Layers: In a neural network, the number of hidden layers is a critical component of the topology. For example, if a network contains only an input layer and an output layer, with no hidden layer, its ability to classify data by approximating more complex functions such as XOR is severely limited. At the same time, extra hidden layers beyond the first often contribute no additional classification refinement (though this is not always the case). In our scenario, we found that hidden layers beyond the first greatly increased training time but provided no noticeable change in classification ability; at a certain point, classification accuracy actually decreased. Therefore, our final classifier has one input layer, one hidden layer, and one output layer in a feed-forward topology.
5) Number of Neurons in Each Layer: Once we decide the number of layers, we must also decide how many neurons belong in each layer. For the output layer, the answer is relatively obvious: there is only one output neuron, because the only thing we are predicting is whether the next day's stock price goes up or down.

Fig. 3. The variation in the prediction accuracy of the 3-layer neural net as a function of the number of nodes in its hidden layer.

The output neuron outputs a number close to 1 if it predicts the stock will go up and a number close to 0 if it predicts it will go down; multiple output neurons are usually used to predict multiple outputs, which is not the case here. The input layer is also fairly straightforward: the number of neurons corresponds to the number of components in the feature vector. For example, if we are looking at the past 5 days' data under the first representation, the number of input neurons would be 5 * 4 = 20, because for each of the past 5 days we consider the 4 features discussed previously. If we use the second feature vector representation, then regardless of the granularity, 4 input nodes are used, representing the average of each feature over the past k days. The hidden layer requires more careful consideration, as it is not immediately obvious how many hidden neurons are necessary to capture the complexity of the function we are trying to approximate. To determine the number of neurons in this layer, we iterated through many possible neuron counts to find the optimal point.

Though many of the hidden layer node counts resulted in similar accuracy, the optimum was clearly reached at 19 hidden nodes. Thus, we used this value in our final classifier.
6) Connections Between Each Layer: Yet another important factor in the topology of the network is how the nodes of one layer are connected to the next. In some tasks, it may be pertinent for some input nodes to connect to only a fraction of the hidden layer nodes in order to limit their individual influence. In our task, however, we found that connecting every input node to every hidden layer node worked well, so we chose this fully connected structure for our final classifier.
7) Initial Connection Weights: At the start of training, the weights of the neural network are randomly initialized. This is significant because it leads to non-deterministic results and varying accuracies after training: some initializations lead to better weights after back-propagation than others, because the optimization algorithm may reach a local minimum that is very sub-optimal compared to the local minimum reached from a different initialization. To counteract the effects of a single bad set of seed weights, an ensemble method is employed: instead of training one neural network, five different neural networks are trained, each with a different random weight initialization. This reduces the variance in the prediction and usually prevents one or a few bad seed weights from dominating the result.
8) Learning Algorithm and Learning Rate: The specific learning algorithm used to train the network acutely impacts the resulting set of weights. For example, even if two neural networks start with the same weights, training them with different learning algorithms can leave their connection weights differing greatly. It is difficult to determine by mere intuition which learning algorithm is optimal in this scenario, so we again performed a grid search, as before, to find out. The results are shown in figure 4.

The results clearly show that back-propagation was the best algorithm for our needs in this scenario, as it significantly outperformed all the other learning algorithms in accuracy. Thus, our final classifier uses back-propagation.
9) Activation Function: The type of activation function used can also affect the predictions a neural network makes, because it influences which net inputs yield a firing and which do not. Two neural networks with different activation functions may not fire on the same set of inputs, and even when both fire, their outputs might differ. Thus, the activation function is yet another parameter that must be optimized in our classifier.

Fig. 4. The efficacy of various learning algorithms within this domain.

Fig. 5. The efficacy of various differentiable activation functions in relation to the current architecture and use of the back-propagation learning algorithm.

Given that the back-propagation learning algorithm requires the activation function to be differentiable, only differentiable activation functions were considered in our optimization process. Among these, the hyperbolic tangent function yielded the best results; thus, our final classifier uses this activation function.
10) Number of Networks in the Ensemble: As previously discussed, the set of initial weights can drastically affect prediction accuracy, because the optimization algorithm yields a local minimum, not a global one, and some local minima reached by back-propagation generalize much worse than others. To counteract this problem, multiple neural networks are trained, and the ultimate prediction is an ensemble vote across them. Unfortunately, training more networks is expensive.

Fig. 6. Prediction accuracy as a function of the maximum number of iterations of the learning algorithm.

Thus, we needed to find the number of networks per ensemble that optimally balances the variance reduction provided by extra networks against the time taken to train the ensemble. Using the requirement that the entire ensemble be trainable on 8 months of data within one minute, we determined that five neural networks in the ensemble was optimal.
11) Max Iterations of the Learning Algorithm: Another time-saving consideration involves limiting the number of iterations the learning algorithm performs while training the neural network. Training the network until its error on the training set falls below some arbitrary threshold often takes a very long time, or is impossible on some data sets. To both save time and prevent training from never finishing, a limit can be placed on the number of iterations the learning algorithm uses. At the same time, the maximum number of iterations should not be too low, or the weights will not be sufficiently adjusted. We must therefore choose the maximum such that further iterations of the learning algorithm provide little increase in accuracy.

The graph confirms this intuition: when the maximum number of iterations is low, the ultimate accuracy is low, but once 10,000 iterations are reached, extra iterations provide little to no boost in accuracy while still lengthening training. Therefore, our final classifier uses a maximum of 10,000 iterations.
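Our networks were built with Neuroph in Java; purely as an illustration, an analogous configuration of the final classifier in scikit-learn might look like the sketch below, with the 4 averaged input features, 19 tanh hidden neurons, stochastic gradient descent, a 10,000-iteration cap, and a five-network majority vote.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def train_ensemble(X_train, y_train, n_nets=5):
    """Five networks that differ only in their random initial weights."""
    return [
        MLPClassifier(hidden_layer_sizes=(19,),  # one hidden layer, 19 neurons
                      activation="tanh",         # hyperbolic tangent
                      solver="sgd",              # stochastic gradient descent
                      max_iter=10000,            # iteration cap
                      random_state=seed).fit(X_train, y_train)
        for seed in range(n_nets)
    ]

def predict_vote(nets, X):
    """Majority vote across the ensemble (ties go to 'up')."""
    votes = np.stack([net.predict(X) for net in nets])
    return (votes.mean(axis=0) >= 0.5).astype(int)
```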

Fig. 7. The final topology of the hyper-parameter optimized neural net classifier.

Using our final neural network classifier with the parameters detailed above, we achieved an average of around 64% accuracy in predicting whether each day of September 2014 would bring an increase or decrease in the stock market. With a particularly fortuitous set of initial weights, this accuracy rises to 68%. The final neural net classifier's architecture is depicted in figure 7.
D. Sentiment Analysis
In this approach, we conducted two sets of machine learning analyses: first for learning and predicting Twitter sentiment, and then for using these sentiments in turn to predict stock price changes.

First, we conducted sentiment analysis to categorize tweets as positive or negative. We decided on a majority-of-algorithms system of classification, as no single algorithm is a perfect fit for this problem, and a majority-based approach allows our analysis to remain nimble even as properties of the dataset change.
To pick which algorithms would be optimal for our classification, we trained sentiment analysis classifiers using eight different models:
• Support Vector Machine
• Generalised Linear Models
• Linear Discriminant Analysis
• Random Forest
• Bagging
• Boosting
• Neural Networks
• Classification Trees
The classifiers were trained using a database of 5,513 hand-classified tweets [8]. Each tweet was represented in a standard bag-of-words model: the order of the words and the relationships among them were ignored, and only which words appear, and how frequently, were considered. A feature vector the size of the English vocabulary is thus created for each tweet. Of course, a tweet contains only a small fraction of the English vocabulary, so a sparse representation was used. These feature vectors were then L2-normalized.
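A minimal sketch of this representation (shown here in scikit-learn for illustration) follows, with placeholder tweets.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import normalize

tweets = ["$AAPL looking strong today", "terrible quarter for $MSFT"]

vectorizer = CountVectorizer()          # bag of words: order is ignored
X = vectorizer.fit_transform(tweets)    # sparse term-frequency matrix

X = normalize(X, norm="l2")             # L2-normalize each tweet's vector
```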
Out of these models, the F-score obtained using
5-fold cross-validation for the last five algorithms
was below 40% for either sentiment. Therefore,
we decided to go ahead with only the first three
algorithms as they gave better results on this
dataset.
TABLE I
F-SCORES OF SENTIMENT ALGORITHMS

Algorithm    Negative Sentiment    Positive Sentiment
SVM          67%                   72%
LDA          67%                   69%
GLMNET       66%                   72%

After this initial training and testing, we trained a majority-based sentiment classifier on a training set of 30,000 randomly chosen tweets from a corpus of 1.6 million manually categorized tweets. The classifier outputs positive only if at least two of the algorithms agree on a positive classification, and negative otherwise.
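A sketch of the 2-of-3 rule, with svm, lda, and glmnet standing in for the three trained classifiers, might look like this:

```python
def classify_tweet(features, svm, lda, glmnet):
    """Positive only if at least two of the three classifiers agree."""
    votes = [clf.predict(features)[0] for clf in (svm, lda, glmnet)]
    return "positive" if votes.count("positive") >= 2 else "negative"
```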
E. Price Prediction
Second, we extracted all the tweets that mention particular stock tickers and calculated, for each stock, the number of positive and negative tweets, as well as the change in stock price from the next day's opening to the next day's closing. For tweets made on the weekend, we used Monday's opening and closing prices.

We then conducted supervised learning, training a set of algorithms to predict whether the stock price went up or down the next day based on the sentiment of the previous day's tweets.
Once again, we trained three different algorithms for the task and took the majority outcome:
• Support Vector Machine
• Naive Bayes
• Classification Tree
The training was done on 9,000 randomly selected entries of the September stock price prediction dataset. We then tested the results on 4,000 of the remaining entries from the September dataset.
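A sketch of the per-ticker, per-day sentiment features follows, with placeholder data; the label column (the next day's open-to-close movement) would be joined on afterwards.

```python
import pandas as pd

# Placeholder rows; the real frame came from the filtered September tweets.
tweets = pd.DataFrame({
    "ticker":    ["AAPL", "AAPL", "MSFT"],
    "date":      ["2014-09-02", "2014-09-02", "2014-09-02"],
    "sentiment": ["positive", "negative", "positive"],
})

counts = (tweets.groupby(["ticker", "date", "sentiment"])
                .size()
                .unstack(fill_value=0)  # one count column per sentiment
                .reset_index())
# counts now holds positive/negative tweet counts per ticker and day.
```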

IV. RESULTS AND DISCUSSION


The following results were obtained for predicting S&P 500 prices for September 2014. There did not seem to be much bias in accuracy between positive and negative classifications of price change. Our algorithm did make more positive classifications overall, but this was to be expected, as the bull market often outpaces the bear market; to put things in perspective, the average annual return of the market overall is generally around 7%. Our accuracy rate reached nearly 75%, a rate that exceeds the random walk model of predicting stock market movement.

Fig. 8. F-Scores of the highest performing sentiment classification algorithms.

TABLE II
CONFUSION MATRIX FOR STOCK PRICE PREDICTION

                 True -ve price change    True +ve price change
Predicted -ve    1864                     585
Predicted +ve    524                      1356

TABLE III
COMMON PERFORMANCE MEASURES

Performance Measure    Percentage Value
Error Rate             25.62%
Accuracy               74.38%
Precision              76.11%
Recall                 78.06%
F-Score                77.07%

V. FUTURE WORK
We obtained very significant success in predicting the next day's stock price from a day's Twitter sentiment. Going forward, we would like to extend this analysis in several ways.
A. Improving models

• Improving sentiment accuracy through clearer identification of high-value tweets versus non-informative tweets; a larger dataset would be required for this. Using word-dependency-based models of tweets instead of a bag-of-words representation also has the potential to increase sentiment classification accuracy. As an example of a more subtle model of a tweet, one can represent a tweet as a parse tree, which can be fed into an SVM as a piece of non-vectorial data; a custom kernel defining similarity between parse trees can then be defined.
• Improving stock price prediction, and the generalization of our analysis, by considering multiple months over multiple years.
• Considering non-standard neural network topologies, such as recurrent and convolutional networks, as potential models beyond the feed-forward methods used in this paper.
• Adding a neutral category for tweets, as well as for buying decisions. Currently, even a mildly positive sentiment leads to a buy decision, which may not be optimal in a real-world setting.

B. Translation to investor decisions


Of course, it is worth noting that the current analysis does not translate directly into a buying or shorting decision. There are several costs associated with converting our results into a real-world process:
• Significant costs are associated with creating a portfolio and maintaining it over the years.
• Frictional costs could limit profit generation, especially as our analysis recommends daily buying and selling decisions, a much shorter horizon than the typical decision time for retail investors (about 6 months).

C. Translating model for institutional investors


It is also worth noting that our current analysis is aimed at retail investors. The other type of investor, the institutional investor, enjoys significant benefits over retail investors, such as:
• Lower transaction costs
• Higher liquidity, allowing for greater overall dollar-valued returns
• Access to after-hours trading
These could translate into higher accuracy if this model is adapted into decisions for institutional investors, after appropriate adjustments. For example, our model can account for one of these benefits, access to after-hours trading, by using the change from one day's closing price to the next day's closing price, versus the current model, which uses the next day's opening and closing prices.
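As a small illustration of the two return definitions, assuming a DataFrame px indexed by trading day with open and close columns (placeholder values shown):

```python
import pandas as pd

px = pd.DataFrame(
    {"open": [100.0, 101.5, 99.8], "close": [101.0, 100.2, 100.9]},
    index=pd.to_datetime(["2014-09-02", "2014-09-03", "2014-09-04"]),
)

# Current (retail) model: the next day's open-to-close change.
open_to_close = px["close"].shift(-1) > px["open"].shift(-1)

# After-hours variant: today's close to the next day's close.
close_to_close = px["close"].shift(-1) > px["close"]
```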
VI. CONCLUSIONS
In our study, we evaluated the efficacy of several powerful machine learning algorithms in the context of stock market prediction. Each of these evaluations involved constructing and testing many different feature vector representations and parameters, a step imperative in this complex domain. In addition, we tested the ability of machine learning algorithms to utilize sentiment analysis of tweets involving various publicly traded companies in order to project positive or negative changes in share price. Although we had a large dataset and many different algorithms to work with, we were ultimately able to narrow our work down to three specific algorithms that yielded significant accuracy on a much smaller, more relevant set of training examples (i.e., tweets). While our project has shown promising results, additional study of natural language processing and machine learning could generate even more statistically relevant data. We hope to further extend our work to create statistically driven trading strategies that can be employed for everyday use.
REFERENCES

[1] Alec Go, Richa Bhayani, and Lei Huang, "Twitter Sentiment Classification using Distant Supervision," unpublished.
[2] "Package RTextTools," 2014.
[3] Archive Team, "JSON Download of Twitter Stream 2014-09," 2014.
[4] Shunrong Shen et al., "Stock Market Forecasting Using Machine Learning Algorithms," www.stanford.edu.
[5] Ciumac Sergiu et al., "Financial Predictor via Neural Network," 2011.
[6] Mahdi Pakdaman Naeini, Hamidreza Taremian, and Homa Baradaran Hashemi, "Stock Market Value Prediction Using Neural Networks," 2010.
[7] "Stock Prediction using Artificial Neural Networks."
[8] Sanders Analytics, "Twitter Sentiment Corpus," http://www.sananalytics.com/lab/twitter-sentiment/
