Submitted By:
Arti Vaidya
Anjali Anjali
Divya Durgadas
Janani Natarajan
Course Teacher: Prof. Anita Wasilewska
State University of New York at Stony Brook
An ANN is a network of many very simple processors ("units"), each possibly having a (small
amount of) local memory.
The units operate only on their local data and on the inputs they receive via the connections.
The design motivation is what distinguishes neural networks from other mathematical techniques.
A neural network is a processing device, either an algorithm, or actual hardware, whose design
was motivated by the design and functioning of human brains and components thereof.
Most neural networks have some sort of "training" rule whereby the weights of connections
are adjusted on the basis of presented patterns.
In other words, neural networks "learn" from examples, just as children learn to recognize
dogs from examples of dogs, and they exhibit some structural capability for generalization.
Neural networks normally have great potential for parallelism, since the computations of the
components are independent of each other.
Introduction
Biological neural networks are much more complicated in their elementary structures
than the mathematical models we use for ANNs.
An ANN is an inherently multiprocessor-friendly architecture: without much modification, it goes
beyond one or even two processors of the von Neumann architecture. It also has the ability to
account for any functional dependency: the network discovers (learns, models) the nature of
the dependency without needing to be prompted.
Neural networks are a powerful technique for solving many real-world problems.
They have the ability to learn from experience in order to improve their
performance and to adapt themselves to changes in the environment. In addition,
they are able to deal with incomplete information or noisy data and can be
very effective, especially in situations where it is not possible to define the rules or
steps that lead to the solution of a problem.
They typically consist of many simple processing units, which are wired together in a
complex communication network.
Introduction
There is no central CPU following a logical sequence of rules - indeed there is no set of rules or
program. This structure then is close to the physical workings of the brain and leads to a new type of
computer that is rather good at a range of complex tasks.
In principle, NNs can compute any computable function, i.e. they can do everything a normal digital
computer can do. In particular, anything that can be represented as a mapping between vector spaces
can be approximated to arbitrary precision by neural networks.
In practice, NNs are especially useful for mapping problems which are tolerant of some errors and
have lots of example data available, but to which hard and fast rules cannot easily be applied.
In a nutshell, a neural network can be considered as a black box that is able to predict an
output pattern when it recognizes a given input pattern. Once trained, the neural network is
able to recognize similarities when presented with a new input pattern, resulting in a predicted
output pattern.
The Brain
The brain's network of neurons forms a massively parallel information processing system. This
contrasts with conventional computers, in which a single processor executes a single series of
instructions.
Against this, consider the time taken for each elementary operation: neurons typically operate
at a maximum rate of about 100 Hz, while a conventional CPU carries out several hundred
million machine-level operations per second. Despite being built from very slow hardware,
the brain has quite remarkable capabilities:
Its performance tends to degrade gracefully under partial damage. In contrast, most programs
and engineered systems are brittle: if you remove some arbitrary parts, very likely the whole
will cease to function.
It can learn (reorganize itself) from experience.
This means that partial recovery from damage is possible if healthy units can learn to take over
the functions previously carried out by the damaged areas.
It performs massively parallel computations extremely efficiently. For example, complex
visual perception occurs within less than 100 ms, that is, 10 processing steps!
It supports our intelligence and self-awareness. (Nobody knows yet how this occurs.)
As a discipline of Artificial Intelligence, Neural Networks attempt to bring computers a little
closer to the brain's capabilities by imitating certain aspects of information processing in the
brain, in a highly simplified way.
The overall pattern of projections (bundles of neural connections) between areas is extremely complex,
and only partially known. The best mapped (and largest) system in the human brain is the visual system,
where the first 10 or 11 processing stages have been identified. We distinguish feedforward projections
that go from earlier processing stages (near the sensory input) to later ones (near the motor output), from
feedback connections that go in the opposite direction.
In addition to these long-range connections, neurons also link up with many thousands of their
neighbours. In this way they form very dense, complex local networks.
A neuron receives input from other neurons (typically many thousands). Inputs sum (approximately).
Once input exceeds a critical level, the neuron discharges a spike - an electrical pulse that travels from
the body, down the axon, to the next neuron(s) (or other receptors). This spiking event is also called
depolarization, and is followed by a refractory period, during which the neuron is unable to fire.
The axon endings (Output Zone) almost touch the dendrites or cell body of the next neuron.
Transmission of an electrical signal from one neuron to the next is effected by neurotransmitters,
chemicals which are released from the first neuron and which bind to receptors in the second. This link
is called a synapse. The extent to which the signal from one neuron is passed on to the next depends on
many factors, e.g. the amount of neurotransmitter available, the number and arrangement of receptors,
the amount of neurotransmitter reabsorbed, etc.
Applications:
Neural network applications can be grouped into the following categories:
Clustering:
A clustering algorithm explores the similarity between patterns and places similar patterns in a cluster. Best
known applications include data compression and data mining.
Classification/Pattern recognition:
The task of pattern recognition is to assign an input pattern (such as a handwritten symbol) to one of
many classes. This category includes algorithmic implementations such as associative memory.
Function approximation:
The task of function approximation is to find an estimate of an unknown function f() subject to noise.
Various engineering and scientific disciplines require function approximation.
Prediction/Dynamical Systems:
The task is to forecast some future values of time-sequenced data. Prediction has a significant impact on
decision-support systems. Prediction differs from function approximation by considering the time factor:
here the system is dynamic and may produce different results for the same input data depending on the
system state (time).
The early model of an artificial neuron was introduced by Warren McCulloch and Walter Pitts in
1943. The McCulloch-Pitts neural model is also known as the linear threshold gate. It is a neuron
with a set of inputs I1, I2, I3, ..., Im and one output y. The linear threshold gate simply classifies
the set of inputs into two different classes, so the output y is binary. Such a function can be
described mathematically using these equations:

Sum = Σ Ii·Wi  (for i = 1, ..., m)
y = f(Sum) = 1 if Sum ≥ T, and 0 otherwise,

where W1, W2, ..., Wm are the weight values and T is the threshold.
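As a minimal sketch of this model (the AND-gate example, with hand-picked weights and threshold, is an illustration, not taken from the slides):

def mcculloch_pitts(inputs, weights, threshold):
    """Linear threshold gate: fires (outputs 1) when the weighted
    sum of the inputs reaches the fixed threshold T."""
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# A 2-input AND gate. The model has no learning rule, so the
# weights and threshold must be chosen by hand.
print(mcculloch_pitts([1, 1], [1, 1], threshold=2))  # 1
print(mcculloch_pitts([1, 0], [1, 1], threshold=2))  # 0

Since both the weights and the threshold are fixed, any change in behaviour requires choosing new values by hand, which is exactly the limitation discussed later.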
The Perceptron
In the late 1950s, Frank Rosenblatt introduced a network composed of units that were an
enhanced version of the McCulloch-Pitts Threshold Logic Unit (TLU) model. Rosenblatt's
model of the neuron, the perceptron, was the result of a merger between two concepts from the
1940s: the McCulloch-Pitts model of an artificial neuron and the Hebbian learning rule for
adjusting weights. In addition to the variable weight values, the perceptron model added
an extra input that represents bias. Thus, the modified equation is now as follows:

y = 1 if Σ Ii·Wi + b ≥ 0 (for i = 1, ..., m), and 0 otherwise,

where b is the bias input.
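A small sketch of a perceptron with Rosenblatt's error-correction training rule (the learning rate, epoch count, and OR-gate data are illustrative choices, not from the slides):

def perceptron(x, w, b):
    """Threshold the weighted sum plus the bias input."""
    return 1 if sum(xi * wi for xi, wi in zip(x, w)) + b >= 0 else 0

def train(samples, lr=0.1, epochs=20):
    """Error-correction rule: nudge the weights and bias toward
    correctly classifying each misclassified example."""
    w, b = [0.0] * len(samples[0][0]), 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - perceptron(x, w, b)
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error
    return w, b

# Learn the OR function from labeled examples.
w, b = train([([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)])

Unlike the fixed McCulloch-Pitts gate, the weights and bias here are adjusted on the basis of the presented patterns.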
The McCulloch-Pitts model of a neuron is simple, yet it has substantial computing potential and a precise
mathematical definition. However, the model is so simplistic that it only generates a binary output, and its
weight and threshold values are fixed. Neural computing algorithms require diverse features for various
applications. Thus, we need to obtain a neural model with more flexible computational features.
Based on the McCulloch-Pitts model described previously, the general form of an artificial neuron can
be described in two stages, shown in the figure. In the first stage, the linear combination of the inputs is
calculated. Each value of the input array is associated with its weight value, which is normally between
0 and 1. Also, the summation function often takes an extra input value Theta with a weight value of
1 to represent the threshold or bias of the neuron. The summation is then performed as

a = Σ Ii·Wi + Θ  (for i = 1, ..., m).
The sum-of-products value is then passed into the second stage to perform the activation function,
which generates the output of the neuron. The activation function "squashes" the amplitude of the
output into the range [0, 1] or, alternatively, [-1, 1]. The behavior of the activation function describes
the characteristics of an artificial neuron model.
The signals generated by actual biological neurons are action-potential spikes, and biological
neurons send signals in patterns of spikes rather than as the simple absence or presence of a single
spike pulse. For example, the signal could be a continuous stream of pulses with various frequencies.
With this kind of observation, we should consider a signal to be continuous with a bounded range, and
the linear threshold function should be "softened".
One convenient form of such a "semi-linear" function is the logistic sigmoid function, or, in short, the
sigmoid function, shown in the figure:

y = 1 / (1 + e^(-x))

As the input x tends to a large positive value, the output y approaches 1. Similarly, the output gets
close to 0 as x goes negative. Near the threshold point, however, the output is close to neither 0 nor 1.
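A minimal sketch of the resulting two-stage neuron, with the weighted sum from the first stage squashed by the sigmoid in the second (the example inputs are arbitrary):

import math

def sigmoid(x):
    """Logistic sigmoid: squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, theta):
    """Stage 1: weighted sum plus bias Theta. Stage 2: sigmoid."""
    a = sum(i * w for i, w in zip(inputs, weights)) + theta
    return sigmoid(a)

print(sigmoid(10.0))   # ~1.0: large positive input
print(sigmoid(-10.0))  # ~0.0: large negative input
print(sigmoid(0.0))    # 0.5: near the threshold point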
Single-Layer Network
Multilayer Network
Backpropagation Networks
The figure on the next page presents the architecture of backpropagation networks. There
may be any number of hidden layers, and any number of hidden units in any given
hidden layer. Input and output units can be binary {0, 1}, bipolar {-1, +1}, or may
have real values within a specific range such as [-1, 1]. Note that units within the same
layer are not interconnected; a sketch of such an architecture follows.
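As a rough sketch of such a layered network in code (the layer sizes, random seed, and use of a sigmoid at every layer are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)

# Placeholder layer sizes: 4 inputs, two hidden layers of 8, 1 output.
sizes = [4, 8, 8, 1]

# One weight matrix per pair of adjacent layers; there are no
# connections between units within the same layer.
weights = [rng.standard_normal((m, n)) for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    """Propagate an input vector through every layer in turn."""
    for W, b in zip(weights, biases):
        x = 1.0 / (1.0 + np.exp(-(x @ W + b)))  # sigmoid activation
    return x

print(forward(np.array([0.5, -1.0, 0.25, 1.0])))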
Backpropagation Networks
Essentially, the error at the output layer is used to compute the error at the hidden layer
immediately preceding the output layer. Once this is computed, it is used in turn to compute the
error of the hidden layer immediately preceding that one. This is done sequentially until the error
at the very first hidden layer is computed. This backpropagation of error is illustrated in the
figure below:
Backpropagation Networks
The errors at the other end of the outgoing connections of the hidden unit h have already been
computed. These could be error values at the output layer or at a later hidden layer. These error
signals are multiplied by their corresponding outgoing connection weights, and the sum of these
products is taken.
Backpropagation Networks
We note that the change in a weight is directly proportional to the error term computed for the unit at
the output end of the connection. The weight change is also scaled by the output signal coming from
the input end of the connection, so we can infer that very little weight change (learning) occurs when
this input signal is almost zero.
The weight change is further controlled by the term f′(ak). Because this term measures the slope of the
activation function, and knowing the shape of the sigmoid, we can infer that there will likewise be little
weight change when the output of the unit at the other end of the connection is close to 0 or 1. Thus,
learning takes place mainly at those connections with high pre-synaptic signals and non-committed
(hovering around 0.5) post-synaptic signals.
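A minimal sketch of these update rules (the variable names and learning rate are assumptions; for the sigmoid, the slope f′(ak) can be written as yk·(1 − yk) in terms of the unit's output yk):

def sigmoid_slope(y):
    """f'(a) expressed via the output y = f(a): largest near y = 0.5,
    near zero when y is close to 0 or 1."""
    return y * (1.0 - y)

def hidden_error(y_h, downstream_errors, outgoing_weights):
    """Error term for hidden unit h: downstream error terms multiplied
    by the outgoing connection weights, summed, then scaled by the
    local slope."""
    back = sum(d * w for d, w in zip(downstream_errors, outgoing_weights))
    return sigmoid_slope(y_h) * back

def weight_change(lr, error_k, x_j):
    """Change in a weight: proportional to the error term at the output
    end of the connection, gated by the signal x_j at the input end."""
    return lr * error_k * x_j

# Example: one update for a connection carrying signal 0.8 into a
# unit whose error term is 0.05, with learning rate 0.25.
print(weight_change(0.25, 0.05, 0.8))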
Learning Process
One of the most important aspects of a neural network is the learning process. The
learning process of a neural network can be viewed as reshaping a sheet of metal,
which represents the output (range) of the function being mapped. The training set
(domain) acts as energy required to bend the sheet of metal so that it passes through
predefined points. However, the metal, by its nature, will resist such reshaping. So the
network will attempt to find a low-energy configuration (i.e. a flat, non-wrinkled shape)
that satisfies the constraints (the training data).
In supervised training, both the inputs and the outputs are provided.
The network then processes the inputs and compares its resulting outputs against the
desired outputs. Errors are then calculated, causing the system to adjust the weights
which control the network. This process occurs over and over as the weights are
continually tweaked.
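Tying the earlier pieces together, a minimal supervised loop for a single sigmoid unit (the data, learning rate, and epoch count are illustrative):

import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

# (inputs, desired output) pairs: both inputs and outputs are provided.
samples = [([0.0, 1.0], 1.0), ([1.0, 0.0], 0.0)]
w, lr = [0.0, 0.0], 0.5

for _ in range(1000):
    for x, target in samples:
        y = sigmoid(sum(xi * wi for xi, wi in zip(x, w)))
        delta = (target - y) * y * (1.0 - y)  # error times slope f'(a)
        w = [wi + lr * delta * xi for wi, xi in zip(w, x)]  # tweak weights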
In unsupervised training, the network is provided with inputs but not with desired outputs. The
system itself must then decide what features it will use to group the input data. This is often referred
to as self-organization or adaptation.
Summary
The following properties of nervous systems are of particular interest in our neurally-inspired
models:
Parallel, distributed information processing
High degree of connectivity among basic units
Connections are modifiable based on experience
Learning is a constant process, and usually unsupervised
Learning is based only on local information
Performance degrades gracefully if some units are removed
References:
http://www.ece.utep.edu/research/webfuzzy/docs/kk-thesis/kk-thesis-html/node12.html
http://www.comp.nus.edu.sg/~pris/ArtificialNeuralNetworks/LinearThresholdUnit.html
http://www.csse.uwa.edu.au/teaching/units/233.407/lectureNotes/Lect1-UWA.pdf
http://www.ai.rug.nl/vakinformatie/ias/slides/3_NeuralNetworksAdaptation.pdf
http://www.users.cs.york.ac.uk/~sok/IML/iml_nn_arch.pdf
http://ilab.usc.edu/classes/2005cs561/notes/LearningInNeuralNetworks-CS561-3-05.pdf
Supervised Learning
Supervised learning is based on a labeled training set: the class of each piece of data in the
training set is known. Class labels are pre-determined and provided in the training phase.
(Figure: sample data points labeled with classes A and B.)
Supervised vs. Unsupervised
Supervised
Tasks performed: Classification, Pattern Recognition
NN models: Perceptron, Feed-forward NN
Unsupervised
Task performed: Clustering
NN model: Self-Organizing Maps
Unsupervised Learning
Tasks:
Clustering - Group patterns based on similarity
Vector Quantization - Fully divide up S into a small set of regions (defined by codebook
vectors) that also helps cluster P
Feature Extraction - Reduce the dimensionality of S by removing unimportant features
(i.e. those that do not help in clustering P)
SOM Algorithm
1. Initialization: Weights are set to unique random values.
2. Sampling: Draw an input sample x and present it to the network.
3. Similarity Matching: The winning neuron i is the neuron whose weight vector best
matches the input vector:
i = argmin_j ||x − wj||
SOM Algorithm
4. Updating: Adjust the weights of the winning neuron so that they better match the
input, and also adjust the weights of the neighbouring neurons:
Δwj = η · hij · (x − wj)
where η is the learning rate and hij is the neighbourhood function centred on the
winner i. A sketch of the full loop follows.
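A minimal SOM sketch under these rules (the 1-D map of 10 units, the Gaussian neighbourhood, the learning rate, and the 2-D random inputs are all illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)

# 1. Initialization: a 1-D map of 10 units with random 2-D weights.
weights = rng.random((10, 2))

def h(i, j, radius=1.0):
    """Neighbourhood h_ij: how strongly unit j follows the winner i."""
    return np.exp(-((i - j) ** 2) / (2 * radius ** 2))

def som_step(x, lr=0.1):
    # 3. Similarity matching: i = argmin_j ||x - w_j||.
    i = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
    # 4. Updating: w_j += lr * h_ij * (x - w_j), pulling the winner
    # and its neighbours toward the input.
    for j in range(len(weights)):
        weights[j] += lr * h(i, j) * (x - weights[j])

# 2. Sampling: draw input samples and present them to the network.
for _ in range(500):
    som_step(rng.random(2))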
INTRUSION DETECTION
References
http://ilab.usc.edu/classes/2005cs561/notes/LearningInNeuralNetworks-CS561-3-05.pdf
Ko-Kang Chu, Maiga Chang, Yen-Teh Hsia, "A Hybrid Training Mechanism for Applying
Neural Networks to Web-based Applications," 2004 IEEE International Conference on
Systems, Man and Cybernetics.
Zhen Zhang (Advisor: Dr. Chung-E Wang), "Data Mining Approach for Network Intrusion
Detection," 04/24/2002, Department of Computer Science, California State University,
Sacramento.
"A Briefing Given to the SC2003 Education Program on Knowledge Discovery in
Databases," Nov 16, 2003, NCSA.
OUTLINE
What is IDS?
IDS with Data Mining
IDS and Neural Network
Hybrid Training Model
3 steps of Training
What is an IDS?
Limitations of IDS
The amount of data for analysts to examine is growing too large. This is the problem that
data mining looks to solve.
Lack of adaptability: the system has to be re-instantiated for each newly identified attack, and
it lacks a cooperative training and learning phase.
A NN, in contrast, is trainable, which provides adaptability. The model follows.
(Figure: hybrid training model — an online Detection Module, a LOG DB, a Decision Module
with supervised filtering, abnormal-behavior decisions, generation of a new training set, weight
updates, and a Weight DB that configures the Decision Module.)
(Table: sample web-access log records with fields Real IP Address, Virtual IP, Access Date,
Browser type, and Access time (msec), e.g. 140.245.1.55 / 67589; 140.245.1.44 / 67588 /
AN-001; 130.158.3.66 / 64533 / AN-010.)
3. Online Training
Design Suitability
Internet Ad Industry
Network Security
Antivirus
Contents
Abstract
Motivation behind using NN
Traditional Methods: Technical Analysis, Fundamental Analysis, Chaotic System, Other techniques
NN for Stock Prediction
Application of NN to Stock Market Prediction
Training a NN
Learning Environment
An Example
Network Training
Overtraining
Training Data
Supplementary Learning
Conclusion
THANKS