Biological Neural Network
Made up of nerve cells called Neurons. Neurons receive & transmit information between various parts/organs of the body. Types: Sensory (Receptor) Neuron, Motor Neuron, Inter-Neuron. Transmission of a signal is a complex electro-chemical process.
Biological Neuron
Cell Body (Soma)
Has a Nucleus
Dendrites
Fiber-like; large in number; branched structure. Receive signals from other neurons.
Axon
One per neuron; longer & thicker; branched at its end. Transmits signals to other neurons. Contains vesicles, which hold chemical substances called neurotransmitters.
Pre-synaptic neuron
Transmitting neuron
Post-synaptic neuron
Receiving neuron
The Synapse
Neuron Signals
A complex electro-chemical process. Incoming signals raise or lower the electrical potential inside the neuron. If the potential crosses a threshold, a short electrical pulse is produced.
We say the neuron fires [is triggered or activated]
The pulse is sent down the axon; this is electrical activity inside the neuron. Chemical activity occurs at the synapses: vesicles in the axon release chemical substances called neurotransmitters. These are collected by the dendrites of the receiving neuron, and this raises/lowers the electric potential in the receiving neuron.
Neuron Signals (contd)
Each neuron receives a large number of input signals through its dendrites,
from many other neurons
Output depends on all the inputs. The cell body acts as a summing & processing device; the processing depends on the type of neuron.
Characteristics of Biological NN
Robustness & Fault Tolerance
Decay of nerve cells does not seem to affect performance significantly
Flexibility
Automatically adjusts to new environment
Collective Computation
Massively Parallel Distributed
Aspect | Computer | Biological NN
Speed (numeric) | Faster | Slower
Speed (patterns) | Slower | Faster
Processing | Sequential | Massively parallel
Complexity | Less complex | Very complex
Info storage | In memory locations; addressable; fixed capacity; new info overwrites old info | In the strengths of the interconnections; adaptable size, to add new info
Fault tolerance | No | Yes
Control | Centralized | Distributed
[Figure: model of an artificial neuron. Input signals a1, a2, ..., an are weighted and summed by the summing part to give the activation value x = Σ aᵢwᵢ + b; the output part applies the output function f to give the output signal s = f(x).]
Output Function
Binary : s = 1 if x ≥ t, else s = 0 (t = threshold)
Linear : s = kx (k = slope constant)
Ramp : s = 0 for x ≤ 0; s = x for 0 < x < 1; s = 1 for x ≥ 1
Sigmoid : s = 1 / (1 + e⁻ˣ)
[Plots of the four output functions omitted]
[Figure: worked examples of two-input units (inputs a1, a2) with their input-output tables]
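The four output functions, sketched in Python; the clamped form of the ramp is an assumption, since the slide gives no formula for it:

```python
import math

def binary(x, t=0.0):
    """Binary: s = 1 if x >= t, else 0 (t = threshold)."""
    return 1 if x >= t else 0

def linear(x, k=1.0):
    """Linear: s = k * x."""
    return k * x

def ramp(x):
    """Ramp: linear between 0 and 1, clamped outside (assumed standard form)."""
    return max(0.0, min(1.0, x))

def sigmoid(x):
    """Sigmoid: s = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))
```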
Perceptron
Inputs are first processed by Association Units. Weights are adjustable, to enable learning. The actual output is compared with the desired output; the difference is the Error. The Error is used to adjust the weights, to obtain the desired output.
Perceptron (contd)
[Figure: perceptron. Inputs a1, a2, ... with adjustable weights w1, w2, ...; the activation value is x = Σ aᵢwᵢ + b, and the output function gives s = f(x).]
Perceptron (contd)
Expected output = s′; Actual output = s. Error δ = s′ − s. Weight change Δwᵢ = η δ aᵢ, where η is the Learning Rate parameter.
Perceptron Learning
Perceptron Learning Rule
Procedure for adjusting the weights
If the weight adjustments lead to zero error, we say the learning converges. Whether the error reduces to 0 depends on the nature of the desired input-output pairs of data.
Perceptron Convergence Theorem
Used to determine whether the desired input-output pairs are representable [achievable].
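A sketch of the perceptron learning procedure in Python, applying the rule Δwᵢ = η δ aᵢ from above; the AND example and the function name are illustrative:

```python
def train_perceptron(samples, n_inputs, eta=0.1, epochs=100):
    """Adjust weights with dw_i = eta * delta * a_i until zero error."""
    w, b = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        total_error = 0
        for a, desired in samples:
            s = 1 if sum(ai * wi for ai, wi in zip(a, w)) + b >= 0 else 0
            delta = desired - s                  # error = desired - actual
            for i in range(n_inputs):
                w[i] += eta * delta * a[i]       # weight change
            b += eta * delta                     # bias treated as extra weight
            total_error += abs(delta)
        if total_error == 0:                     # converged: zero error
            break
    return w, b

# AND is linearly separable, so by the convergence theorem this converges:
w, b = train_perceptron([([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)], 2)
```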
Adaline
Adaline = Adaptive Linear Element. Similar to the Perceptron; the difference is:
Employs Linear Output Function (s=x)
Weight update rule minimises the mean squared error, averaged over all inputs
Hence known as LMS (Least Mean Squared) Error Learning Rule Also known as : Gradient Descent Algorithm
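A sketch of one Adaline weight update (the helper name adaline_step is hypothetical); the error is computed on the linear output s = x, which is what the LMS rule minimises in the mean-squared sense:

```python
def adaline_step(w, b, a, desired, eta=0.05):
    """One LMS (gradient descent) step for Adaline's linear output s = x."""
    x = sum(ai * wi for ai, wi in zip(a, w)) + b   # linear output: s = x
    delta = desired - x                            # error on the linear output
    w = [wi + eta * delta * ai for wi, ai in zip(w, a)]
    b = b + eta * delta
    return w, b
```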
Terminology
Processing Unit
Summing part, output part
Inputs, weights, bias, activation value
Output function, output signal
Update
Synchronous, Asynchronous
Models may be electronic or software. Such networks are called Artificial Neural Networks [ANN].
ANN
ANNs exhibit abilities surprisingly similar to Biological NNs. They can learn, recognise, remember, match & retrieve patterns of information. Hardware implementations of ANN are also available nowadays:
costlier, but faster than software implementations
Topology
Topology is the physical organisation of the ANN: the arrangement of the processing units, interconnections & the pattern of input & output. An ANN is made up of layers of neurons. All neurons within one layer have the same activation dynamics & output function. In addition to interlayer connections, intralayer connections may also be made. Connections across the layers may be in a feedforward or feedback manner.
Topology (contd)
One input layer, one output layer, and zero or more intermediate layers (usually referred to as hidden layers). There is no limit on the number of layers, and there can be any number of neurons in any layer; all layers need not have the same number of neurons. If there is no hidden layer, the ANN is called a single-layer network. If one or more hidden layers are present, it is called a multi-layer network.
Topology (contd)
Feedforward Networks
the units are connected such that data flows only in the forward direction, i.e., from the input layer to the output layer, via successive hidden layers if any (a sketch in code follows)
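A minimal sketch of a feedforward pass in Python; the sigmoid units and the specific weights are illustrative assumptions:

```python
import math

def layer(inputs, weights, biases):
    """One layer of sigmoid units: each row of weights feeds one unit."""
    return [1.0 / (1.0 + math.exp(-(sum(a * w for a, w in zip(inputs, row)) + b)))
            for row, b in zip(weights, biases)]

def feedforward(inputs, layers):
    """Data flows strictly forward through successive layers."""
    signal = inputs
    for weights, biases in layers:
        signal = layer(signal, weights, biases)
    return signal

# 2 inputs -> hidden layer of 2 units -> output layer of 1 unit
out = feedforward([1.0, 0.0],
                  [([[0.5, -0.4], [0.3, 0.8]], [0.0, 0.1]),   # hidden layer
                   ([[1.2, -0.7]], [0.05])])                   # output layer
```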
Feedback Networks
data flows in the forward direction, as above; in addition, the connections allow data flow from the output layer towards the input layer. This reverse flow (feedback) is for error correction, i.e., for adjusting the weights suitably to get the desired output, which is an essential feature of the mechanism for NN learning.
[Figure: a feedback network, with connections from the output layer back towards the input layer]
Neuronal Dynamics
Operation of an NN is governed by Neuronal Dynamics, which has two parts:
Dynamics of activation state Dynamics of synaptic weights
Short Term Memory (STM) is modelled by the activation state of the NN. Long Term Memory (LTM) corresponds to the encoded pattern of information in the synaptic weights.
Machine Vision
Image & Pattern Recognition, Intelligent Medical Devices, Intelligent Security Systems
Patterns
Computers deal with data; humans deal with patterns. Objects/images, voices/sounds, even actions [walking etc.] have patterns. Different images, sounds & actions have different patterns. Patterns enable us to recognise, classify & identify objects, & to take decisions based on such identification.
Pattern Mapping
Pattern Clustering (aka Pattern Grouping)
Feature Mapping
Pattern Association
Every input pattern is associated with an output pattern, to form a pair of input-output patterns. There will be many such pairs. A well-designed ANN can be trained to learn (remember) many such pairs of patterns. Whenever a pattern is input, the ANN should retrieve (output) the corresponding output pattern. Supervised Learning has to be employed [being taught]. This is purely a memory function & is called an auto-association task.
Pattern Classification
Objects belonging to the same class have many common features/patterns. This fact enables us to classify objects into classes & to identify new classes. Supervised Learning: the patterns for each class have to be taught to the system. Pattern classification tasks must exhibit accretive behaviour, i.e., an incomplete or noisy input should produce the output corresponding to its closest known input pattern. Examples of pattern classification tasks: Voice Recognition, Handwriting Recognition.
Pattern Mapping
Capturing the relation between the input pattern & its corresponding output pattern. This is a generalisation task, not mere memorising, and is called interpolative behaviour. Example of a pattern mapping task: Speech Recognition.
Pattern Clustering
Identifying subsets of patterns having similar distinctive features & grouping them together. Sounds similar to Pattern Classification, but is not the same. Has to employ Unsupervised Learning.
Classification
Patterns for each class are input separately. That is, the system is trained to learn the patterns of one class first; then it is taught the patterns of another class.
Clustering
Patterns belonging to several groups are mixed in the set of inputs. The system has to resolve them into the different groups.
Feature Mapping
In several patterns, the features may not be unambiguous & may vary over a time period; therefore, they are difficult to cluster. In this case, the system learns a feature map rather than clustering or classifying. Has to employ Unsupervised Learning. Example: you see a new object for the first time, never seen before; it has some distinct features, as well as some features common to many known classes or groups.
The simplest combination of FF & FB NW [aka Competitive Learning NW] is a single-layer NW in which the units in the output layer have feedback connections among themselves.
FB NN
Auto-Association, Pattern Storage (LTM), Pattern Environment Storage (LTM)
FF & FB (CL) NN
Pattern Storage (STM), Clustering & Feature Mapping
FF NN Pattern Association
[Figure: input patterns a1–a6 in the input space, mapped to output patterns b1–b4 in the output space]
For input pattern ai, the corresponding output pattern is bi. a5 & a6 are noisy versions of a3. In a5 the noise is less; it is nearest to a3, so the NW outputs b3 [the desired output]: this is accretive. In a6 the noise is more; it is nearer to a4 than to a3, so the NW may output b4.
Real-Life Example
Inputs are 8x8 grids of pixels with binary values. The input pattern space is a binary 64-dimensional space.
[Figure: pixel-grid patterns for the characters A and B]
Outputs are 7-bit binary numbers (7-bit ASCII characters). The output pattern space is a binary 7-dimensional space.
Noisy versions of the input patterns can occur when the values of some pixels get changed, due to noise in the transmission channel or dust/stain spots on the document being scanned.
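The accretive retrieval described here can be illustrated with a nearest-pattern lookup; this Python sketch is an idealisation of the behaviour, not how a trained FF NN actually computes its output:

```python
def hamming(p, q):
    """Number of differing pixels between two binary patterns."""
    return sum(pi != qi for pi, qi in zip(p, q))

def associate(noisy_input, stored_pairs):
    """Accretive retrieval: output the pattern paired with the closest stored input."""
    _, output = min(stored_pairs, key=lambda pair: hamming(noisy_input, pair[0]))
    return output

# stored_pairs would hold (64-pixel grid, 7-bit code) pairs, e.g. for 'A' and 'B'.
```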
FF NN Pattern Classification
Some of the output patterns may be identical, so a set of input patterns may correspond to the same output pattern. Each distinct output pattern serves as a class label, and the input patterns corresponding to each class are samples of that class. In such cases the NN has to classify the input patterns: for each input pattern, the NN should identify the class [output pattern] to which it belongs.
Real-Life Example
A → 1000001
B → 1000010
CL NN Pattern Classification
Accretive behaviour
FF NN Pattern Mapping
The NN is trained with some pairs of input-output patterns, not all possible pairs. When a new input pattern is given, the NN is made to find the corresponding output pattern [though the NN was not trained with this pair]. Suppose the NN has been trained with i/o pairs (an, bn). If a new input pattern am is closer to the known input pattern an, the NN tries to find an output pattern bm which is closer to bn. This is interpolative behaviour.
[Figure: input patterns a1–a5 in the input space, with the new pattern a6 lying closest to a3]
The NN is trained with pairs (a1,b1) to (a5,b5) only; it is not trained with an (a6,b6) pair. a6 is closer to a3, so the NN maps it to an output b6 which is closer to b3.
FB NN Pattern Association
If the input patterns are identical to the output patterns, the input & output spaces are identical. The problem reduces to auto-association:
trivial; the NW merely stores the input patterns
If a noisy pattern arrives at input, NW outputs the same noisy pattern as output
Absence of accretive behaviour
[Figure: identical input and output pattern spaces, each containing the patterns a1–a5]
CL NN Pattern Clustering
Patterns are grouped based on similarities. The input is an individual pattern; the output is the pattern of the group to which the input belongs. That is, a group of approximately similar patterns is identified with one & the same cluster label & will produce the same output pattern. Two types are possible for a new input pattern not belonging to any group: it is either forced into one of the existing groups (accretive behaviour), or shown as belonging to a new group; if the input is close to some known input pattern x, the new group is close to x's group (interpolative behaviour). A sketch follows.
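One competitive-learning (winner-take-all) step in Python, which underlies clustering in a CL NN; the function name and the squared-distance choice of winner are assumptions:

```python
def competitive_step(weights, a, eta=0.1):
    """Winner-take-all: the output unit whose weight vector is closest to the
    input a wins, and only its weights move toward a (instar rule)."""
    dists = [sum((wi - ai) ** 2 for wi, ai in zip(w, a)) for w in weights]
    k = dists.index(min(dists))                    # index of the winning unit
    weights[k] = [wi + eta * (ai - wi)             # dw_kj = eta * (a_j - w_kj)
                  for wi, ai in zip(weights[k], a)]
    return k                                       # cluster label for input a
```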
CL NN Feature Mapping
Similar to clustering; the difference is: similar inputs produce similar outputs [not the same output], so the similarities of the inputs are retained at the output. No accretive behaviour; only interpolative. The output patterns are much larger [than for clustering].
Types of Learning
Supervised Learning
the desired response [output] of the system is already known/decided; the network is made [tuned/trained] to give the desired output; it is as if the network is "being taught or trained by a teacher"
Unsupervised Learning
the output is not known; the network is allowed to settle into a stable state by itself; it is as if the network is discovering special features and patterns from the available data without external help
Learning Equation
Implementation of Synaptic Dynamics: an expression for the updating of weights. Express the weight vector of the ith processing unit at time instant t+1 in terms of that weight vector at time instant t:
wᵢ(t+1) = wᵢ(t) + Δwᵢ(t)
where Δwᵢ(t) is the change in the weight vector. Different researchers have proposed different expressions for calculating Δwᵢ(t); these are called Learning Laws.
Learning Laws
Hebb's Law [Hebbian Learning Law]
Perceptron Learning Law
Delta Learning Law
LMS Learning Law
Correlation Learning Law
Instar [Winner-Take-All] Learning Law
Outstar Learning Law
Boltzmann Learning
A stochastic learning algorithm. A network designed to apply the Boltzmann Learning Rule is called a Boltzmann Machine. The neurons constitute a recurrent structure & give binary output [+1 or -1], corresponding to whether the neuron is on or off.
Memory-based Learning
Past experiences = patterns which the NN has been trained to recognise/classify. Each experience is a pair of input & output patterns. All or most of the past experiences are stored in a large memory. Any new input pattern can be compared with the patterns stored in memory, & the corresponding output pattern can be output. A sketch follows.
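A sketch of memory-based learning as a nearest-neighbour lookup; the k-nearest majority vote is one common variant, assumed here rather than taken from the original:

```python
from collections import Counter

def memory_based_output(new_input, memory, k=3):
    """memory holds (input_pattern, output_label) past experiences; output the
    majority label among the k stored inputs nearest to new_input."""
    def dist(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))
    nearest = sorted(memory, key=lambda pair: dist(new_input, pair[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```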
Learning Law | Weight change Δwᵢⱼ | Initial Weights | Type of Learning | Remarks
Hebbian | η sᵢ aⱼ | Near zero | Unsupervised |
Perceptron | η (bᵢ − sᵢ) aⱼ | Random | Supervised |
Delta | η (bᵢ − sᵢ) f′(xᵢ) aⱼ | Random | Supervised |
Widrow-Hoff (LMS) | η (bᵢ − wᵢᵀa) aⱼ | Random | Supervised |
Correlation | η bᵢ aⱼ | Near zero | Supervised |
Winner-Take-All (Instar) | η (aⱼ − wₖⱼ), for the winning unit k | Random (normalised) | Unsupervised |
Outstar | η (bᵢ − wⱼₖ) | Zero | Supervised | Grossberg Learning
Here η is the learning rate, a the input vector, bᵢ the desired output & sᵢ the actual output of unit i.
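Three of the table's update rules sketched in Python; using the sigmoid as the output function in the Hebbian rule is an assumption, since the table does not fix f:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# One weight-vector update per law, for a unit with weight row w, input a,
# desired output b_i and learning rate eta.

def hebbian(w, a, eta=0.1):
    s = sigmoid(sum(wj * aj for wj, aj in zip(w, a)))     # actual output s_i
    return [wj + eta * s * aj for wj, aj in zip(w, a)]    # eta * s_i * a_j

def widrow_hoff(w, a, b_i, eta=0.1):
    x = sum(wj * aj for wj, aj in zip(w, a))              # w_i^T a
    return [wj + eta * (b_i - x) * aj for wj, aj in zip(w, a)]

def correlation(w, a, b_i, eta=0.1):
    return [wj + eta * b_i * aj for wj, aj in zip(w, a)]
```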