
Neural Networks

(In its most general form)

A neural network is a machine that is designed to model the way in which the
brain performs a particular task or function of interest; the network is usually
implemented by using electronic components or is simulated in software on a
digital computer.

RKJHA

Introduction
The neural network is one of the important components of Artificial Intelligence
(AI).
It has been studied for many years in the hope of achieving human-like
performance in many fields, such as
1. speech and image recognition, and
2. information retrieval.
To make the term 'neural network' used in this paper clear and to expand
considerably on its content,
it is useful to first give a definition of this term,
analyze the general structure of a neural network,
and explore the advantages of neural network models.

The Definition of Neural Network


According to Miller (1990): "Neural networks are also called neural nets, connectionist models, collective
models, parallel distributed processing models, neuromorphic systems, and artificial
neural networks by various researchers (p. 1-4)."
Similarly, in his article, Lippmann (1987) states: "Artificial neural net models or simple 'neural nets' go by many names such as
connectionist models, parallel distributed processing models, and neuromorphic
systems (p. 4)."
However, Doszkocs and his coworkers (1990) think connectionist models are
more general than neural network models, and
"they include several related information processing approaches, such as
artificial neural networks, spreading activation models, associative networks,
and parallel distributed processing (p. 209)."
In their view, "early connectionist models were called neural network models
because they literally tried to model networks of brain cells (neurons) (p. 212)".

The Definition of Neural Network_Continued


A neural network model (or neural model), as the term is used here, refers to a connectionist
model that simulates the biophysical information processing occurring in the nervous
system. So, even though connectionist models and neural network models have the same
meaning in some literature, we prefer to regard connectionist models as the more
general concept and neural networks as a subgroup of them.
A preliminary definition of neural network is given by Kevin Gurney (1999): "A neural network is an interconnected assembly of simple processing elements,
units or nodes, whose functionality is loosely based on the animal neuron. The
processing ability of the network is stored in the inter-unit connection strengths,
or weights, obtained by a process of adaptation to, or learning from, a set of
training patterns."

The Definition of Neural Network_Continued


The Hecht-Nielsen Neurocomputer Corporation provides the following definition of a
neural network:
A cognitive information processing structure based upon models of brain function.
In a more formal engineering context: a highly parallel dynamical system with the
topology of a directed graph that can carry out information processing by means of
its state response to continuous or initial input (as cited in Miller, 1990, p. 1-3).
Haykin (1999) has offered the following definition of a neural network viewed as an
adaptive machine:
A neural network is a parallel distributed processor made up of simple processing
units, which has a natural propensity for storing experiential knowledge and making
it available for use.
It resembles the brain in two respects:
1. Knowledge is acquired by the network from its environment through a learning
process;
2. Interneuron connection strengths, known as synaptic weights, are used to store
the acquired knowledge (p. 2).

Summary

In summary,
neural networks are models that attempt to achieve good performance via dense
interconnection of simple computational elements (Lippmann, 1987).

Review
A short review of early neural network models is given by Doszkocs, Reggia and
Lin (1990), with three representative examples as follows.

Networks based on logical neurons. These are the earliest neural network
models. A logical neuron is a binary state device, which is either off or
on. There is no mechanism for learning, and a network for a desired
input-output relationship must be designed manually.

Elementary perceptrons. This is a kind of neural network developed
during the 1950s and 1960s, which learns through changes of synaptic
strength. Given any set of input patterns and any desired classification of
patterns, it is possible to construct an elementary perceptron that will
perform the desired classification. The crowning achievement of work on
elementary perceptrons is the perceptron convergence theorem.

Linear networks. These are another class of neural models developed
primarily during the 1960s and 1970s. Much work on linear networks has
focused on associative memories.
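The logical (McCulloch-Pitts) neuron described above can be sketched in a few lines of Python. Since such networks have no learning mechanism, the weights and threshold must be chosen by hand; the gate values below are illustrative, not taken from the original papers.

```python
def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts logical neuron: a binary state device that is
    either off (0) or on (1), with no mechanism for learning."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# The desired input-output relationship is designed manually,
# e.g. hand-picked weights and thresholds yield logic gates:
AND = lambda a, b: mp_neuron([a, b], [1, 1], threshold=2)
OR = lambda a, b: mp_neuron([a, b], [1, 1], threshold=1)
```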

Historical Sketch of Neural Networks_1940s

Natural components of mind-like machines are simple abstractions based on the behavior
of biological nerve cells, and such machines can be built by interconnecting such elements.

W. McCulloch & W. Pitts (1943): the first theory on the fundamentals of
neural computing (neuro-logical networks),
A Logical Calculus of the Ideas Immanent in Nervous Activity
==> McCulloch-Pitts neuron model

(1947) How We Know Universals
- an essay on networks capable of recognizing spatial patterns invariant of
geometric transformations.
Cybernetics:
an attempt to combine concepts from biology, psychology, mathematics, and
engineering.

Historical Sketch of Neural Networks_1940s_Continued

D.O. Hebb (1949)

The Organization of Behavior
the first theory of psychology built on conjectures about neural networks
==> (neural networks might learn by constructing internal representations of
concepts in the form of cell-assemblies - subfamilies of neurons that would
learn to support one another's activities).

==> Hebb's learning rule:
==> When an axon of cell A is near enough to excite a cell B and repeatedly or
persistently takes part in firing it, some growth process or metabolic change takes
place in one or both cells such that A's efficiency, as one of the cells firing B, is
increased.
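Hebb's rule can be written as a one-line weight update: the connection from A to B is strengthened whenever both cells are active together. A minimal sketch, where the learning-rate parameter is an assumption rather than part of Hebb's original formulation:

```python
def hebb_update(w, a_pre, b_post, lr=0.1):
    """Hebb's rule: increase the efficiency (weight) of the connection
    whenever cell A takes part in firing cell B.
    lr is an assumed learning-rate parameter."""
    return w + lr * a_pre * b_post

# Repeated co-activation of A and B grows the connection strength:
w = 0.0
for _ in range(10):
    w = hebb_update(w, a_pre=1.0, b_post=1.0)
```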

Historical Sketch of Neural Networks_1950s


Cybernetic machines developed as specific architectures to perform specific functions.
==> machines that could learn to do things they aren't built to do

M. Minsky (1951) built a reinforcement-based network learning system.

IRE Symposium (1955)
==> The Design of Machines to Simulate the Behavior of the Human Brain
with four panel members:
W.S. McCulloch, A.G. Oettinger, O.H. Schmitt, N. Rochester,
invited questioners:
M. Minsky, M. Rubinoff, E.L. Gruenberg, J. Mauchly, M.E. Moran, W. Pitts,
and the moderator H.E. Tompkins.

F. Rosenblatt (1958)
the first practical Artificial Neural Network (ANN) - the Perceptron
==> The Perceptron:
A Probabilistic Model for Information Storage and Organization in the Brain.
By the end of the 1950s, the NN field became dormant because of the new AI advances based on
serial processing of symbolic expressions.

Historical Sketch of Neural Networks_1960s


Connectionism (Neural Networks) - versus - Symbolism (Formal Reasoning)
B. Widrow & M.E. Hoff (1960) _Adaptive Switching Circuits
presents an adaptive perceptron-like network.
The weights are adjusted so as to minimize the mean square error between the actual and
desired output
==> Least Mean Square (LMS) error algorithm. (1961) Widrow and his students:
Generalization and Information Storage in Networks of Adaline Neurons.
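The Widrow-Hoff idea above, adjusting weights to minimize the mean square error between actual and desired output, can be sketched as a single LMS step; the learning rate and the one-pattern training loop are illustrative choices:

```python
def lms_step(w, x, d, lr=0.1):
    """One LMS (Widrow-Hoff) step: move the weights to reduce the
    squared error between the linear output and the desired output d."""
    y = sum(wi * xi for wi, xi in zip(w, x))   # Adaline: linear output
    error = d - y
    return [wi + lr * error * xi for wi, xi in zip(w, x)]

# Repeated steps on a pattern drive the error toward zero:
w = [0.0, 0.0]
for _ in range(50):
    w = lms_step(w, x=[1.0, 1.0], d=1.0)
```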
M. Minsky & S. Papert (1969) _Perceptrons
a formal analysis of the perceptron networks explaining their limitations and
indicating directions for overcoming them
==> relationship between the perceptron's architecture and what it can learn:
no machine can learn to recognize X unless it possesses some scheme for representing X.
Limitations of the perceptron networks led to a pessimistic view of the NN field as
having no future ==> no more interest in or funds for NN research!!

Review_continued
Widrow and Lehr (1990) reviewed 30 years of adaptive neural networks. They gave
a description of
1. the history,
2. origination,
3. operating characteristics, and
4. basic theory of several supervised neural network training algorithms, including:
a. the perceptron rule,
b. the LMS (least mean square) algorithm,
c. three Madaline rules, and
d. the backpropagation technique.
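The first of these, the perceptron rule, updates the weights only when the thresholded output disagrees with the desired class. A minimal sketch, assuming 0/1 targets and a unit learning rate:

```python
def perceptron_step(w, b, x, target, lr=1.0):
    """Perceptron rule: if the thresholded output is wrong, move the
    weights toward (or away from) the input pattern."""
    y = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0
    error = target - y               # -1, 0, or +1
    w = [wi + lr * error * xi for wi, xi in zip(w, x)]
    return w, b + lr * error

# Training on the (linearly separable) OR problem converges,
# as the perceptron convergence theorem guarantees:
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, b = [0.0, 0.0], 0.0
for _ in range(10):
    for x, t in data:
        w, b = perceptron_step(w, b, x, t)
```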

In his book, Haykin (1999) has provided some historical notes on neural networks
based on a year-by-year literature review, including
McCulloch and Pitts's logical neural network,
Wiener's Cybernetics,
Hebb's The Organization of Behavior, and so on.
In conclusion,
the author claims, "perhaps more than any other publication, the 1982 paper by Hopfield and
the 1986 two-volume book by Rumelhart and McClelland were the most influential publications
responsible for the resurgence of interest in neural networks in the 1980s" (p. 44).

Advantages of Neural Network Models over Traditional IR Models
In neural network models,
information is represented as a network of weighted, interconnected nodes.
In contrast to traditional information processing methods,
neural network models are "self-processing" in that no external program operates on the network:
the network literally processes itself, with "intelligent behavior" emerging from the local interactions
that occur concurrently between the numerous network components (Reggia & Sutton, 1988).
According to Doszkocs, Reggia and Lin (1990), neural network models in general are fundamentally
different from traditional information processing models in at least two
ways.
First, they are self-processing. Traditional information processing models typically make use of
a passive data structure, which is always manipulated by an active external process/procedure.
In contrast, the nodes and links in a neural network are active processing agents. There is
typically no external active agent that operates on them. "Intelligent behavior" is a global
property of neural network models.
Second, neural network models exhibit global system behaviors derived from concurrent local
interactions among their numerous components. The external process that manipulates the
underlying data structures in traditional IR models typically has global access to the entire
network/rule set, and processing is strongly and explicitly sequentialized (Doszkocs, Reggia &
Lin, 1990).

Advantages of Neural Network Models over Traditional IR Models_Continued
Pandya and Macy (1996) have summarized that neural networks are natural
classifiers with significant and desirable characteristics, which include but are not limited to
the following.
Resistance to noise
Tolerance to distorted images/patterns (ability to generalize)
Superior ability to recognize partially occluded or degraded images
Potential for parallel processing

Advantages and Disadvantages


Advantages:
A neural network can perform tasks that a linear program cannot.
When an element of the neural network fails, the network can continue without any
problem because of its parallel nature.
A neural network learns and does not need to be reprogrammed.
It can be implemented in any application without any problem.
Disadvantages:
The neural network needs training to operate.
The architecture of a neural network is different from the architecture of
microprocessors and therefore needs to be emulated.
Large neural networks require high processing time.

Components of a Neural Network

A neural network has three components:
a network,
an activation rule, and
a learning rule.
1. The network consists of a set of nodes (units) connected together via directed
links. Each node in the network has a numeric activation level associated with it at
time t. The overall pattern vector of activations represents the current state of the
network at time t.
2. The activation rule is a local procedure that each node follows in updating its activation
level in the context of input from neighboring nodes.
3. The learning rule is a local procedure that describes how the weights on connections
should be altered as a function of time.
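Since the activation rule and the learning rule are both local procedures, each can be sketched as a small function of a node's neighborhood. The logistic squashing function and the Hebbian-style adjustment below are illustrative choices, not prescribed by the text:

```python
import math

def activation_rule(neighbor_activations, weights):
    """Local activation rule: a node updates its level from the
    weighted input of its neighboring nodes."""
    s = sum(w * a for w, a in zip(weights, neighbor_activations))
    return 1.0 / (1.0 + math.exp(-s))        # logistic squashing

def learning_rule(w, pre, post, lr=0.01):
    """Local learning rule: alter a connection weight as a function
    of the activity at its two ends (Hebbian-style, assumed)."""
    return w + lr * pre * post
```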

Elements of the Model
Similarly, Haykin (1999) identifies three basic elements of the neuronal model:
1. A set of synapses or connecting links, each of which is characterized by a
weight or strength of its own.
2. An adder for summing the input signals, weighted by the respective synapses
of the neuron. These operations constitute a linear combiner.
3. An activation function for limiting the amplitude of the output of a neuron.
(p. 10)
The activation function is also referred to as a squashing function in that it squashes
(limits) the permissible amplitude range of the output signal to some finite value.
Types of activation functions include:
1) the threshold function; 2) the piecewise-linear function; and 3) the sigmoid function.
The sigmoid function, whose graph is S-shaped, is by far the most common
form of activation function used in the construction of neural networks (p. 14).

Activation Function_continued
Activation functions: the activation function acts as a squashing function, such that the output
of a neuron in a neural network is between certain values (usually 0 and 1, or -1 and 1).
In general, there are three types of activation functions, denoted by φ(·):
the threshold function,
the piecewise-linear function, and
the sigmoid function.
The threshold function takes on a value of 0 if the summed input is less than a certain
threshold value (v), and the value 1 if the summed input is greater than or equal to the
threshold value.

Activation Function_continued
Secondly, there is the piecewise-linear function. This function again can take
on the values of 0 or 1, but can also take on values in between, depending
on the amplification factor in a certain region of linear operation.

Thirdly, there is the sigmoid function. This function can range between 0 and 1, but
it is also sometimes useful to use the -1 to 1 range. An example of the sigmoid
function is the hyperbolic tangent function.
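The three activation function types can be sketched directly. The half-width of 0.5 for the linear region in the piecewise-linear function is an assumed amplification factor, and the threshold defaults to zero:

```python
import math

def threshold(v, theta=0.0):
    """Threshold function: 0 below the threshold theta, 1 at or above it."""
    return 1.0 if v >= theta else 0.0

def piecewise_linear(v):
    """Piecewise-linear function: linear inside the region of linear
    operation (assumed here to be [-0.5, 0.5]), clipped to 0 or 1 outside."""
    return min(1.0, max(0.0, v + 0.5))

def sigmoid(v):
    """Logistic sigmoid: smooth, S-shaped, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))
```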

Activation Function_continued

Common activation functions, with their definitions and output ranges:

Function              Definition                          Range
Identity              x                                   (-inf, +inf)
Logistic              1 / (1 + e^(-x))                    (0, +1)
Hyperbolic            tanh(x)                             (-1, +1)
Negative exponential  e^(-x)                              (0, +inf)
Softmax               e^(x_i) / sum_j e^(x_j)             (0, +1)
Unit sum              x_i / sum_j x_j                     (0, +1)
Square root           sqrt(x)                             (0, +inf)
Sine                  sin(x)                              [0, +1]
Ramp                  -1 if x <= -1; x if -1 < x < +1;    [-1, +1]
                      +1 if x >= +1
Step                  0 if x < 0; +1 if x >= 0            [0, +1]

A framework for distributed representation

An artificial neural network consists of a pool of simple processing units which
communicate by sending signals to each other over a large number of weighted
connections.
A set of major aspects of a parallel distributed model can be distinguished:
1. a set of processing units ('neurons,' 'cells');
2. a state of activation y_k for every unit, which is equivalent to the output of the unit;
3. connections between the units. Generally each connection is defined by a weight
w_jk which determines the effect which the signal of unit j has on unit k;
4. a propagation rule, which determines the effective input s_k of a unit from its
external inputs;
5. an activation function F_k, which determines the new level of activation based on
the effective input s_k(t) and the current activation y_k(t) (i.e., the update);
6. an external input (aka bias, offset) θ_k for each unit;
7. a method for information gathering (the learning rule);
8. an environment within which the system must operate, providing input signals
and, if necessary, error signals.
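Items 2 through 6 of the list above combine into a single unit update: the propagation rule forms the effective input s_k, and the activation function F_k maps it to the new activation y_k. A sketch with a weighted-sum propagation rule and a logistic F_k, both standard but assumed choices:

```python
import math

def effective_input(y, w_k, theta_k):
    """Propagation rule: s_k = sum_j w_jk * y_j + theta_k,
    where theta_k is the unit's external input (bias)."""
    return sum(w_jk * y_j for w_jk, y_j in zip(w_k, y)) + theta_k

def update_unit(y, w_k, theta_k):
    """New activation y_k = F_k(s_k), here with a logistic F_k."""
    return 1.0 / (1.0 + math.exp(-effective_input(y, w_k, theta_k)))
```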

Processing units
Each unit performs a relatively simple job:
1. receive input from neighbours or external sources and use this to compute an output
signal which is propagated to other units.
2. Apart from this processing, a second task is the adjustment of the weights.
The system is inherently parallel in the sense that many units can carry out their
computations at the same time.
Within neural systems it is useful to distinguish three types of units:
1. input units (indicated by an index i) which receive data from outside the neural
network,
2. output units (indicated by an index o) which send data out of the neural network,
and
3. hidden units (indicated by an index h) whose input and output signals remain within
the neural network.
During operation, units can be updated either synchronously or asynchronously. With
synchronous updating, all units update their activation simultaneously; with asynchronous
updating, each unit has a (usually fixed) probability of updating its activation at a time t, and
usually only one unit will be able to do this at a time. In some cases the latter model has some
advantages.

Neural Network topologies


In the previous section we discussed the properties of the basic processing unit in
an artificial neural network. This section focuses on the pattern of connections
between the units and the propagation of data. As for this pattern of connections,
the main distinction we can make is between:
Feed-forward neural networks
Recurrent neural networks
Feed-forward neural networks, where the data flow from input to output units is
strictly feed-forward.
The data processing can extend over multiple (layers of) units, but no feedback
connections are present, that is, connections extending from outputs of units to
inputs of units in the same layer or previous layers.
Recurrent neural networks, which do contain feedback connections.
Contrary to feed-forward networks, the dynamical properties of the network are
important. In some cases, the activation values of the units undergo a relaxation
process such that the neural network will evolve to a stable state in which these
activations do not change anymore. In other applications, the changes of the
activation values of the output neurons are significant, such that the dynamical
behaviour constitutes the output of the neural network (Pearlmutter, 1990).
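A pass through one hidden layer makes the strictly input-to-output flow of a feed-forward network concrete; the layer sizes and logistic activations here are illustrative:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def feedforward(x, w_hidden, w_out):
    """Data flow strictly forward: input -> hidden -> output,
    with no feedback connections to the same or previous layers."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return sigmoid(sum(w * hi for w, hi in zip(w_out, h)))
```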

Artificial Neural Network Architectures

Artificial neural networks are mathematical entities that are modeled after existing
biological neurons found in the brain. All the mathematical models are based on the
basic building block known as the artificial neuron.
A simple neuron is shown in Figure 1: a neuron with a single R-element input
vector. Here the individual element inputs p1, p2, ..., pn
