Introduction
Neural networks are one of the important components of Artificial Intelligence
(AI).
They have been studied for many years in the hope of achieving human-like
performance in many fields, such as
1. speech and image recognition, and
2. information retrieval.
To make the term 'neural network' as used in this paper clear,
it is useful first to give a definition of the term,
analyze the general structure of a neural network,
and explore the advantages of neural network models.
Summary
In summary,
neural networks are models that attempt to achieve good performance via dense
interconnection of simple computational elements (Lippmann, 1987).
Review
A short review of early neural network models is given by Doszkocs, Reggia and
Lin (1990), with three representative examples as follows.
Historical Sketch of Neural Networks
1940s: Natural components of mind-like machines are simple abstractions based on the behavior
of biological nerve cells, and such machines can be built by interconnecting such elements.
IRE Symposium
==> F. Rosenblatt (1958):
the first practical Artificial Neural Network (ANN), the Perceptron.
==> The Perceptron:
A Probabilistic Model for Information Storage and Organization in the Brain.
By the end of the 1950s, the NN field became dormant because of the new AI advances based on
serial processing of symbolic expressions.
==> Least Mean Square (LMS) error algorithm (1961), Widrow and his students:
Generalization and Information Storage in Networks of Adaline Neurons.
M. Minsky & S. Papert (1969), Perceptrons:
a formal analysis of perceptron networks explaining their limitations and
indicating directions for overcoming them.
==> Relationship between the perceptron architecture and what it can learn:
no machine can learn to recognize X unless it possesses some scheme for representing X.
The limitations of perceptron networks led to a pessimistic view of the NN field as
having no future ==> no more interest and funds for NN research!
Review_continued
Widrow and Lehr (1990) reviewed 30 years of adaptive neural networks. They gave
a description of
1. the history,
2. origination,
3. operating characteristics, and
4. basic theory of several supervised neural network training algorithms.
In his book, Haykin (1999) provided some historical notes on neural networks
based on a year-by-year literature review, including
McCulloch and Pitts's logical neural network,
Wiener's Cybernetics,
Hebb's The Organization of Behavior, and so on.
In conclusion,
the author claims, "perhaps more than any other publication, the 1982 paper by Hopfield and
the 1986 two-volume book by Rumelhart and McClelland were the most influential publications
responsible for the resurgence of interest in neural networks in the 1980s" (p. 44).
Components of a Neural Network
A neural network has three components:
a network,
an activation rule, and
a learning rule.
1. The network consists of a set of nodes (units) connected together via directed
links. Each node in the network has a numeric activation level associated with it at
time t. The overall pattern (vector) of activations represents the current state of the
network at time t.
2. The activation rule is a local procedure that each node follows in updating its activation
level in the context of input from neighboring nodes.
3. The learning rule is a local procedure that describes how the weights on connections
should be altered as a function of time.
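As a minimal sketch, the three components can be expressed directly in code. The specific choices below (a weighted-sum activation rule squashed by a sigmoid, and a Hebbian-style learning rule) are illustrative assumptions for concreteness, not rules prescribed by the text:

```python
import math
import random

random.seed(0)
n_nodes = 4

# 1. The network: nodes connected by directed, weighted links
#    (assumed fully connected here), plus an activation per node.
weights = [[random.uniform(-0.5, 0.5) for _ in range(n_nodes)]
           for _ in range(n_nodes)]
state = [random.random() for _ in range(n_nodes)]  # activations at time t

# 2. Activation rule: each node updates its level from the input
#    it receives from neighboring nodes.
def update_activations(state):
    new_state = []
    for i in range(n_nodes):
        net = sum(weights[i][j] * state[j] for j in range(n_nodes))
        new_state.append(1.0 / (1.0 + math.exp(-net)))  # squash to (0, 1)
    return new_state

# 3. Learning rule: alter the connection weights over time
#    (here a Hebbian-style rule: strengthen co-active links).
def update_weights(state, lr=0.01):
    for i in range(n_nodes):
        for j in range(n_nodes):
            weights[i][j] += lr * state[i] * state[j]

state = update_activations(state)
update_weights(state)
print(len(state), all(0.0 < a < 1.0 for a in state))  # 4 True
```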
Element of model
Similarly, Haykin (1999) holds that there are three basic elements of the neuronal model:
1. A set of synapses or connecting links, each of which is characterized by a
weight or strength of its own.
2. An adder for summing the input signals, weighted by the respective synapses
of the neuron. These operations constitute a linear combiner.
3. An activation function for limiting the amplitude of the output of a neuron.
(p. 10)
The activation function is also referred to as a squashing function, in that it squashes
(limits) the permissible amplitude range of the output signal to some finite value.
Types of activation functions include:
1) the threshold function; 2) the piecewise-linear function; and 3) the sigmoid function.
The sigmoid function, whose graph is s-shaped, is by far the most common
form of activation function used in the construction of neural networks (p. 14).
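Haykin's three elements can be sketched as a single neuron: synaptic weights, an adder (linear combiner), and a squashing activation function. The function names and input values here are illustrative assumptions:

```python
import math

def neuron_output(inputs, weights, bias=0.0):
    # Adder: sum the input signals, each weighted by its synapse
    # (a linear combiner).
    v = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Activation (squashing) function: a logistic sigmoid, which
    # limits the output amplitude to the finite range (0, 1).
    return 1.0 / (1.0 + math.exp(-v))

y = neuron_output([0.5, -1.0, 2.0], [0.4, 0.3, 0.1])
print(0.0 < y < 1.0)  # True
```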
Activation Function_continued
Activation functions: the activation function acts as a squashing function, such that the output
of a neuron in a neural network lies between certain values (usually 0 and 1, or -1 and 1).
In general, there are three types of activation functions, denoted by φ(·):
1) the threshold function;
2) the piecewise-linear function; and
3) the sigmoid function.
Firstly, there is the threshold function, which takes on a value of 0 if the summed
input is less than a certain threshold value v, and a value of 1 if the summed input is
greater than or equal to the threshold.
Activation Function_continued
Secondly, there is the piecewise-linear function. This function again can take
on the values 0 or 1, but can also take on values in between, depending
on the amplification factor within a region of linear operation.
Thirdly, there is the sigmoid function. This function can range between 0 and 1, but
it is also sometimes useful to use the -1 to +1 range. An example of a sigmoid-shaped
function with this range is the hyperbolic tangent function.
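The three types of activation function described above can be sketched as follows; the threshold value and amplification factor are illustrative defaults, not values from the text:

```python
import math

def threshold(v, theta=0.0):
    # 0 below the threshold value, 1 at or above it.
    return 1.0 if v >= theta else 0.0

def piecewise_linear(v, gain=1.0):
    # Linear (with some amplification factor) in the middle
    # region of operation, clipped to 0 and 1 outside it.
    return min(1.0, max(0.0, gain * v + 0.5))

def sigmoid(v):
    # Smooth s-shaped curve with outputs in (0, 1).
    return 1.0 / (1.0 + math.exp(-v))

def tanh_activation(v):
    # Hyperbolic tangent: a sigmoid-shaped function on (-1, +1).
    return math.tanh(v)

print(threshold(0.5), sigmoid(0.0))  # 1.0 0.5
```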
Activation Function_continued

Function               Definition                                        Range
Identity               x                                                 (-inf, +inf)
Logistic               1 / (1 + e^-x)                                    (0, +1)
Hyperbolic             tanh(x)                                           (-1, +1)
Negative exponential   e^-x                                              (0, +inf)
Softmax                e^x_i / sum_j e^x_j                               (0, +1)
Unit sum               x_i / sum_j x_j                                   (0, +1)
Square root            sqrt(x)                                           (0, +inf)
Sine                   sin(x)                                            [0, +1]
Ramp                   -1 if x <= -1; x if -1 < x < +1; +1 if x >= +1    [-1, +1]
Step                   0 if x < 0; +1 if x >= 0                          [0, +1]
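Two of the functions listed above, softmax and unit sum, act on a whole vector of inputs rather than a single value; both map the vector to values in (0, +1) that sum to one. A minimal sketch (illustrative code, not from the text):

```python
import math

def softmax(xs):
    # Exponentiate each component, then normalize by the total.
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def unit_sum(xs):
    # Normalize the raw components directly by their sum.
    total = sum(xs)
    return [x / total for x in xs]

probs = softmax([1.0, 2.0, 3.0])
print(round(sum(probs), 6))  # 1.0
```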
Processing units
Each unit performs a relatively simple job:
1. receive input from neighbours or external sources and use this to compute an output
signal, which is propagated to other units.
2. Apart from this processing, a second task is the adjustment of the weights.
The system is inherently parallel in the sense that many units can carry out their
computations at the same time.
Within neural systems it is useful to distinguish three types of units:
1. input units (indicated by an index i) which receive data from outside the neural
network,
2. output units (indicated by an index o) which send data out of the neural network,
and
3. hidden units (indicated by an index h) whose input and output signals remain within
the neural network.
During operation, units can be updated either synchronously or asynchronously. With
synchronous updating, all units update their activation simultaneously; with asynchronous
updating, each unit has a (usually fixed) probability of updating its activation at a time t, and
usually only one unit will be able to do this at a time. In some cases the latter model has some
advantages.
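The two updating modes can be sketched as follows; the network, the threshold update rule, and the sizes are illustrative assumptions, and the asynchronous step is simplified to one randomly chosen unit updating at a time:

```python
import random

random.seed(0)
n = 5
weights = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
state = [random.random() for _ in range(n)]

def node_update(i, state):
    # Threshold rule on the weighted input from all other units.
    net = sum(weights[i][j] * state[j] for j in range(n) if j != i)
    return 1.0 if net >= 0.0 else 0.0

# Synchronous updating: all units update simultaneously,
# each reading the same old state.
def sync_step(state):
    return [node_update(i, state) for i in range(n)]

# Asynchronous updating: only one (randomly chosen) unit
# updates its activation at a time.
def async_step(state):
    new_state = list(state)
    i = random.randrange(n)
    new_state[i] = node_update(i, state)
    return new_state

s1 = sync_step(state)
s2 = async_step(state)
print(len(s1), len(s2))  # 5 5
```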