Edward Gatt
What are Neural Networks?
• Models of the brain and nervous system
• Highly parallel
– Process information much more like the brain than a
serial computer
• Learning
• Applications
– As powerful problem solvers
– As biological models
ANNs – The basics
• ANNs incorporate the two fundamental
components of biological neural nets:
1. Neurones (nodes)
2. Synapses (weights)
• Neurone vs. Node
• Structure of a node:
• Information is distributed
• Squashing (sigmoid): a net input of −0.5 is squashed to
  1 / (1 + e^0.5) = 0.3775
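The squashing example above can be checked with a minimal sketch (the function name `sigmoid` is my label, not from the slides):

```python
import math

def sigmoid(net):
    """Logistic squashing function: maps any net input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-net))

# A net input of -0.5 is squashed to roughly 0.3775, as on the slide.
print(round(sigmoid(-0.5), 4))  # 0.3775
```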
Supervised Vs. Unsupervised
• Networks can be ‘supervised’
– Need to be trained ahead of time with lots of data
• Unsupervised networks adapt to the input
– Applications in Clustering and reducing
dimensionality
– Learning may be very slow
What can a Neural Net do?
• Compute a known function
• Approximate an unknown function
• Pattern Recognition
• Signal Processing
• A node is an element which performs the function
  y = fH(∑(wi xi) + Wb)
• Number of inputs/outputs is variable
[Figure: a node whose inputs are weighted by W0, W1, …, Wn plus a bias weight Wb, summed, and passed through fH(x) to the output connection]
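The node function y = fH(∑(wi xi) + Wb) can be sketched directly; here I assume fH is the sigmoid squashing function from earlier (the slides leave fH generic):

```python
import math

def node_output(inputs, weights, bias_weight):
    """One node: weighted input sum plus bias, squashed by fH (here a sigmoid)."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias_weight
    return 1.0 / (1.0 + math.exp(-net))

# Example (hypothetical weights): net = 0.5*1 + (-0.3)*0 - 1.0 = -0.5
y = node_output([1.0, 0.0], [0.5, -0.3], -1.0)
```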
Simple Perceptron
• Binary logic application
• Y = u(W0X0 + W1X1 + Wb)
• η = Learning Rate
• D = Desired Output
[Figure: perceptron with inputs Input 0 and Input 1, bias weight Wb, and threshold activation fH(x)]
• Using a nonlinear
function which
approximates a linear
threshold allows a
network to approximate
nonlinear functions
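A minimal sketch of the perceptron above, trained with the standard perceptron rule w ← w + η(D − Y)x (the rule itself is not written out on the slide; the AND-gate data set is my example):

```python
def step(net):
    """u(.): unit-step threshold, as in Y = u(W0X0 + W1X1 + Wb)."""
    return 1 if net >= 0 else 0

def train_perceptron(samples, eta=0.1, epochs=20):
    """Adjust each weight by eta * (D - Y) * input after every sample."""
    w0, w1, wb = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for (x0, x1), d in samples:
            y = step(w0 * x0 + w1 * x1 + wb)
            w0 += eta * (d - y) * x0
            w1 += eta * (d - y) * x1
            wb += eta * (d - y)
    return w0, w1, wb

# AND gate: linearly separable, so the perceptron converges.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w0, w1, wb = train_perceptron(data)
```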
Alternative Activation functions
• Radial Basis Functions
– Square
– Triangle
– Gaussian!
• (μ, σ) can be varied at each node
[Figure: RBF network with inputs Input 0 … Input n (I1, I2, …, II) feeding hidden units H1, H2, H3, …, HJ, each computing f(neth) on the input vector x]
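A sketch of the Gaussian radial basis activation, with (μ, σ) exposed as the per-node parameters the slide says can be varied:

```python
import math

def gaussian_rbf(x, mu, sigma):
    """Gaussian radial basis activation: peaks at x == mu, width set by sigma."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
```

Unlike the sigmoid, the response is local: it is 1 exactly at the centre μ and decays symmetrically on both sides.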
Supervised Learning in the BPN
• Before the learning process starts, all weights (synapses) in the network are initialized with pseudorandom numbers.
• We also have to provide a set of training patterns (exemplars). They can be described as a set of ordered vector pairs {(x1, y1), (x2, y2), …, (xP, yP)}.
• Then we can start the backpropagation learning algorithm.
• This algorithm iteratively minimizes the network's error by finding the gradient of the error surface in weight-space and adjusting the weights in the opposite direction (gradient-descent technique).
Supervised Learning in the BPN
• Gradient-descent example: finding the absolute minimum of a one-dimensional error function f(x):
  x1 = x0 − ηf′(x0)
[Figure: f(x) with the slope f′(x0) at x0 pointing toward the updated point x1 = x0 − ηf′(x0)]
Repeat this iteratively until for some xi, f’(xi) is sufficiently close to 0.
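The iteration above can be sketched in a few lines (the quadratic test function is my example, not from the slides):

```python
def gradient_descent_1d(f_prime, x0, eta=0.1, tol=1e-6, max_iter=10000):
    """Repeat x_{i+1} = x_i - eta * f'(x_i) until the slope is close to 0."""
    x = x0
    for _ in range(max_iter):
        slope = f_prime(x)
        if abs(slope) < tol:
            break
        x = x - eta * slope
    return x

# f(x) = (x - 3)^2 has f'(x) = 2(x - 3); the minimum is at x = 3.
x_min = gradient_descent_1d(lambda x: 2.0 * (x - 3.0), x0=0.0)
```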
Supervised Learning in the BPN
• In the BPN, learning is performed as follows:
1. Randomly select a vector pair (xp, yp) from the training
set and call it (x, y).
2. Use x as input to the BPN and successively compute
the outputs of all neurons in the network (bottom-up)
until you get the network output o.
3. Compute the error δ^o_pk for the pattern p across all K
   output-layer units by using the formula:
   δ^o_pk = (y_k − o_k) · f′(net^o_k)
Supervised Learning in the BPN
4. Compute the error δ^h_pj for all J hidden-layer units by
   using the formula:
   δ^h_pj = f′(net^h_j) · Σ_{k=1..K} δ^o_pk · w_kj
5. Update the connection-weight values to the hidden layer by using the following equation:
   w_ji(t + 1) = w_ji(t) + η · δ^h_pj · x_i
Supervised Learning in the BPN
6. Update the connection-weight values to the output layer by using the following equation:
   w_kj(t + 1) = w_kj(t) + η · δ^o_pk · f(net^h_j)
For the sigmoid activation function
   f(net_k) = 1 / (1 + e^(−net_k))
the derivative takes a particularly simple form:
   f′(net_k) = ∂f(net_k)/∂net_k = o_k(1 − o_k)
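Steps 2-6 can be sketched for one hidden layer with sigmoid activations throughout; the network sizes and training pattern below are my own illustration, not from the slides:

```python
import math
import random

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

def bpn_step(x, y, W_h, W_o, eta=0.5):
    """One backpropagation update (steps 2-6) for a single (x, y) pair.

    W_h[j][i]: input-to-hidden weights; W_o[k][j]: hidden-to-output weights.
    """
    # Step 2: forward pass, computing all neuron outputs bottom-up.
    o_h = [sigmoid(sum(W_h[j][i] * x[i] for i in range(len(x))))
           for j in range(len(W_h))]
    o = [sigmoid(sum(W_o[k][j] * o_h[j] for j in range(len(o_h))))
         for k in range(len(W_o))]
    # Step 3: output-layer errors; for the sigmoid, f'(net) = o(1 - o).
    delta_o = [(y[k] - o[k]) * o[k] * (1.0 - o[k]) for k in range(len(o))]
    # Step 4: hidden-layer errors, backpropagated through the output weights.
    delta_h = [o_h[j] * (1.0 - o_h[j]) *
               sum(delta_o[k] * W_o[k][j] for k in range(len(W_o)))
               for j in range(len(o_h))]
    # Step 5: update weights into the hidden layer.
    for j in range(len(W_h)):
        for i in range(len(x)):
            W_h[j][i] += eta * delta_h[j] * x[i]
    # Step 6: update weights into the output layer, using f(net^h_j) = o_h[j].
    for k in range(len(W_o)):
        for j in range(len(o_h)):
            W_o[k][j] += eta * delta_o[k] * o_h[j]
    return o

# Repeatedly presenting one pattern drives the output toward its target.
random.seed(0)
W_h = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
W_o = [[random.uniform(-1, 1) for _ in range(3)]]
for _ in range(2000):
    out = bpn_step([1.0, 0.0], [1.0], W_h, W_o)
```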
Supervised Learning in the BPN
Now our BPN is ready to go!
If we choose the type and number of neurons in our network appropriately, after
training the network should show the following behavior:
• If we input any of the training vectors, the network
should yield the expected output vector (with some
margin of error).
• If we input a vector that the network has never
“seen” before, it should be able to generalize and
yield a plausible output vector based on its
knowledge about similar input vectors.
Self-Organizing Maps (Kohonen Maps)
In the BPN, we used supervised learning.
This is not biologically plausible: In a biological system, there is no external
“teacher” who manipulates the network’s weights from outside the network.
Biologically more adequate: unsupervised learning.
We will study Self-Organizing Maps (SOMs) as examples for unsupervised learning
(Kohonen, 1980).
Self-Organizing Maps (Kohonen Maps)
A topology-conserving mapping, one in which similar input vectors activate neighboring output neurons, can be achieved by SOMs:
• Two layers: input layer and output (map) layer
• Input and output layers are completely connected.
• Output neurons are interconnected within a defined
neighborhood.
• A topology (neighborhood relation) is defined on
the output layer.
Self-Organizing Maps (Kohonen Maps)
A neighborhood function φ(i, k) indicates how closely neurons i and k in the output
layer are connected to each other.
Usually, a Gaussian function of the distance between the positions of the two neurons in the layer is used:
   φ(i, k) = exp(−‖p_i − p_k‖² / (2σ²)),
where p_i and p_k are the positions of neurons i and k in the map layer.
Unsupervised Learning in SOMs
For n-dimensional input space and m output neurons: