
Artificial Neural Networks

Why Artificial Neural Networks (ANNs)?

- Nonlinearity
- Input/Output Mapping
- Adaptivity
- Contextual information (local-global interactions)
- Massively parallel architecture
- Neurobiological analogy

Difficulties?
- Need for a 'big' ground truth (large labeled datasets)
- Sensitive hyper-parameters

Structural organization: Brain & NN
A neural network is a network of neurons.
A neuron is an information-processing unit.

McCulloch–Pitts model

The bias lowers (-) or raises (+) the net input.

w_kj, the weight of synapse j into neuron k, can be + or -.

The activation function limits (saturates) the amplitude of the output to a closed interval [-A, B].
Mathematically:

u_k = sum_{j=1..m} w_kj x_j        (linear combiner)
v_k = u_k + b_k                    (affine transformation: a translation by the bias)

v_k is the activation potential (induced local field).

Equivalently, absorb the bias by fixing x_0 = 1 and w_k0 = b_k:
v_k = sum_{j=0..m} w_kj x_j
y_k = phi(v_k)
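The affine transformation above can be sketched in a few lines; the weights, bias, and input values below are made-up illustrative numbers.

```python
# Minimal sketch of the neuron model above: v_k = sum_j w_kj x_j + b_k.
# All numeric values are hypothetical.

def activation_potential(x, w, b):
    """Affine transformation: linear combiner plus bias."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

x = [0.5, -1.0, 2.0]   # input signals x_1..x_3
w = [0.8, 0.2, -0.4]   # synaptic weights w_k1..w_k3
b = 0.1                # bias b_k

v = activation_potential(x, w, b)
print(v)  # 0.4 - 0.2 - 0.8 + 0.1 = -0.5
```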
Nonlinear → due to the activation function.

Activation function types:


Threshold function:
phi(v) = 1 if v >= 0, else 0.
Yes or no: the all-or-none property of the McCulloch–Pitts model.
Sigmoid function (S-shaped):

• Elegant (continuous), no hard decision
• Allows two-sided saturation

The logistic function is the most common example:
phi(v) = 1 / (1 + exp(-a v)), with a > 0 the slope parameter.
Output normalized to [0, 1] (max 1, min 0).
Signum function: output in [-1, 1].

Hyperbolic tangent function (also S-shaped): phi(v) = tanh(v), output in [-1, 1].
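The four activation function types above can be sketched as follows (a minimal illustration; the sample inputs are arbitrary):

```python
import math

def threshold(v):        # all-or-none (McCulloch–Pitts): output in {0, 1}
    return 1.0 if v >= 0 else 0.0

def logistic(v, a=1.0):  # sigmoid with slope parameter a > 0: output in [0, 1]
    return 1.0 / (1.0 + math.exp(-a * v))

def signum(v):           # output in {-1, 0, +1}, within [-1, 1]
    return 0.0 if v == 0 else math.copysign(1.0, v)

def tanh_act(v):         # S-shaped, output in [-1, 1]
    return math.tanh(v)

print(threshold(-0.3), logistic(0.0), signum(-2.0), tanh_act(0.0))
# 0.0 0.5 -1.0 0.0
```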

Probabilistic / stochastic neuron (McCulloch–Pitts):

The neuron state is binary: firing vs. not firing (output y), with
P(firing) = 1 / (1 + exp(-v/T)).

T denotes the noise level.
As T → 0, the neuron becomes deterministic.

• Allows noise to affect neuron firing.
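A quick simulation makes the role of T concrete. This sketch assumes the common ±1 output convention for firing vs. not firing; the test values are arbitrary.

```python
import math
import random

# Stochastic neuron: fires (y = +1) with probability P(v) = 1 / (1 + exp(-v/T)).
def fires(v, T, rng=random):
    p = 1.0 / (1.0 + math.exp(-v / T))
    return 1 if rng.random() < p else -1

random.seed(0)
v = 0.5
for T in (5.0, 0.01):  # high noise vs. nearly deterministic
    rate = sum(fires(v, T) == 1 for _ in range(10000)) / 10000
    print(T, rate)     # as T -> 0 the firing rate approaches the hard threshold
```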

Graphs representing networks of neurons

Signal-flow graphs are directed graphs with:
• Directed synaptic links (output = weight × input)
• Directed activation links (output = phi(input))
• Node outputs equal to the summation of incoming node signals
• A transfer function characterizing each link


Network Architectures

Single-layer feedforward network:
• Input layer of source nodes
• Output layer of computational nodes

The output layer is the single (counted) layer, since only it contains computational nodes.
A feedforward layered network.

Multilayer feedforward network:
• One or more hidden layers
• Fully connected: every node in one layer connects to every node in the next
• Hidden-layer computational nodes extract higher-order representations of the input
• The input is called the activation pattern
• The architecture is described by the number of nodes per layer (e.g. a 10-4-2 network)
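A forward pass through such a network can be sketched in a few lines. This is a minimal illustration of a fully connected 3-2-1 network with hypothetical weights and a logistic activation in every computational node.

```python
import math

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

def layer(x, W, b):
    # Each row of W holds the weights of one neuron in the layer.
    return [logistic(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

x = [1.0, 0.5, -1.0]                                    # activation pattern
W1, b1 = [[0.2, -0.4, 0.1], [0.5, 0.3, -0.2]], [0.0, 0.1]  # hidden layer
W2, b2 = [[1.0, -1.0]], [0.05]                          # output layer

hidden = layer(x, W1, b1)   # higher-order representation of the input
output = layer(hidden, W2, b2)
print(output)
```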

Recurrent network: at least one feedback loop.

Simplest case: no self-feedback loops, no hidden neurons.
More general: self-feedback loops and hidden neurons.
Assume operation on the unit circle of the z-plane, the region usually considered of interest since the Fourier transform is defined there.

For a single feedback loop with weight w, the output is
y(n) = sum_{l=1..∞} w^l x(n - l).

|w| < 1: output convergent, system stable.
|w| ≥ 1: output divergent (linearly for |w| = 1, exponentially for |w| > 1); system not stable.

Further, for |w| < 1 the sum may be truncated at N terms, since w^N ≪ 1:
• Without this truncation the system has infinite memory, but the truncation is just as good in practice.
• Memory fading is modeled!

Analysis is much more complicated in an RNN, since it contains nonlinear processing units!
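The stability condition is easy to see numerically. This sketch simulates the single linear feedback loop y(n) = w·y(n-1) + x(n); a unit impulse input then yields y(n) = w^n, so |w| < 1 decays and |w| > 1 blows up.

```python
# Impulse response of a single linear feedback loop with weight w.
def impulse_response(w, n_steps):
    y, out = 0.0, []
    for n in range(n_steps):
        x = 1.0 if n == 0 else 0.0  # unit impulse input
        y = w * y + x               # one unit-delay feedback loop
        out.append(y)
    return out

print(impulse_response(0.5, 5))  # converges: [1.0, 0.5, 0.25, 0.125, 0.0625]
print(impulse_response(2.0, 5))  # diverges:  [1.0, 2.0, 4.0, 8.0, 16.0]
```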

Knowledge Representation

Knowledge is stored information, and it is goal-directed!
• The possible forms in which the internal network parameters can represent information from the inputs are highly diverse.
Hence, knowledge representation is difficult!

Goal: appropriate learning of the underlying environment.

Two parts:
• Prior information
• Observations / measurements using sensors

Examples to train the network

Examples may be labeled or unlabeled.
Labeled: each input signal is paired with a desired response.
Training data / samples are costly; they need a teacher!

1. Learning / training on sufficiently many appropriately presented inputs (until convergence)
2. Testing on unseen data (inputs)

k-fold cross-validation
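A plain-Python sketch of the k-fold splitting idea (no ML library assumed): the data is divided into k folds, and each sample is held out for testing exactly once.

```python
# Generate (train, test) index splits for k-fold cross-validation.
def k_fold_indices(n_samples, k):
    folds = [list(range(n_samples))[i::k] for i in range(k)]  # k disjoint folds
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

for train, test in k_fold_indices(6, 3):
    print(sorted(train), test)
```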

• Learning can be through positive or negative examples.
• Knowledge is represented by free parameters such as synaptic weights (and biases).

How knowledge is actually represented inside an artificial network is, however, very complicated.
Correlation! How?
Embedding prior information

Ad hoc!
• Restrict the network architecture, allowing only local connections (receptive fields).
• Constrain the choice of synaptic weights (weight sharing).
The number of free parameters is (drastically) reduced, so less training is required!

Example: a partially connected feedforward network with the same set of weights for each neuron in a layer.

Receptive field of a neuron:
the region of the input (field) that influences the neuron's output.

Weight sharing in the example:
like a convolution sum, or rather a correlation.

Such networks were called convolutional networks (ConvNets) and evolved into the feature-learning part of CNNs.
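Weight sharing can be sketched in one dimension: every neuron applies the same weight vector to its own receptive field, which is exactly a (valid-mode) cross-correlation. The input and weights below are made-up numbers.

```python
# One shared weight set w, slid over the input field x (cross-correlation).
def shared_weight_layer(x, w):
    k = len(w)  # receptive-field size
    return [sum(w[j] * x[i + j] for j in range(k))
            for i in range(len(x) - k + 1)]

x = [1.0, 2.0, 3.0, 4.0, 5.0]     # input field
w = [0.5, -0.5]                   # shared receptive-field weights
print(shared_weight_layer(x, w))  # [-0.5, -0.5, -0.5, -0.5]
```

Each output neuron sees only two adjacent inputs (its receptive field), and all four neurons share the same two weights: four outputs from just two free parameters.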

Embedding invariances

From an image perspective:
- Rotation-invariant recognition
- Scale-invariant recognition
From a speech perspective: loudness-invariant word recognition.

Invariance by structure:
restrictions on the free parameters imposed through subsidiary computations.
*VERY DIFFICULT!

Invariance by training:
while training, present a number of examples of the same entity differing in the attribute(s) whose invariance is required.
*More computation; and how does it affect classification capability?

Invariant feature space:
the inputs are feature values extracted from the signal that are invariant to the required signal transformations. But how do we know which features?
Learning Process

Learning with a teacher: supervised learning.
The teacher's knowledge: input–output examples.

Error-correction learning:
minimize the mean (or sum) of squared errors.
Gradient descent on the error surface!

The free parameters converge (long-term memory) after sufficiently many examples; then the NN is left on its own.
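Error-correction learning can be sketched on a single linear neuron: gradient descent on the squared error gives the delta rule Δw = η·e·x. The training pairs below are made-up data generated by the relation d = 2x.

```python
# Gradient descent on squared error for one linear neuron (delta rule).
def train(samples, eta=0.1, epochs=50):
    w = 0.0
    for _ in range(epochs):
        for x, d in samples:   # d is the desired response (teacher)
            e = d - w * x      # error signal
            w += eta * e * x   # delta rule: step down the error surface
    return w

samples = [(1.0, 2.0), (2.0, 4.0), (-1.0, -2.0)]  # consistent with d = 2x
w = train(samples)
print(round(w, 3))  # the weight converges to 2.0
```

Once w has converged (long-term memory), the error is zero on every example and the neuron can be left on its own.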
Learning without a teacher

Reinforcement learning:
minimization of a scalar index of performance, the cumulative cost of actions.
The reward here may be delayed.

Unsupervised Learning

Self-organized!

• A task-independent measure of quality is used.
• In an NN, a "winner-takes-all" strategy can be employed to trigger a neuron in a layer.

Supervised learning is the most popular paradigm: statistical learning theory.
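The "winner-takes-all" strategy can be sketched as follows: only the neuron with the largest activation potential fires. The weights below are hypothetical, with each neuron tuned to one input component.

```python
# Competitive layer: exactly one neuron (the winner) outputs 1.
def winner_takes_all(x, W):
    v = [sum(wj * xj for wj, xj in zip(row, x)) for row in W]  # potentials
    winner = max(range(len(v)), key=v.__getitem__)
    return [1 if i == winner else 0 for i in range(len(v))]

x = [0.9, 0.1]
W = [[1.0, 0.0],   # neuron 0 tuned to the first input component
     [0.0, 1.0]]   # neuron 1 tuned to the second
print(winner_takes_all(x, W))  # [1, 0]
```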

