
Artificial Neural Networks

Why Artificial Neural Networks (ANNs)?

- Nonlinearity
- Input/Output Mapping
- Adaptivity
- Contextual information (local-global interactions)
- Massively parallel architecture
- Neurobiological analogy

Difficulties?
- Need for a 'big' ground truth (large labeled datasets)
- Sensitive hyper-parameters

Structural organization: Brain & NN
A neural network is a network of neurons.
A neuron is an information-processing unit.

McCulloch–Pitts model

The bias lowers (-) or raises (+) the net input.

w_kj, the weight of synapse j into neuron k, can be + or -.

The activation function limits (saturates) the amplitude of the output to a closed interval [-A, B].
Mathematically:

u_k = sum_{j=1..m} w_kj x_j        (linear combiner)
v_k = u_k + b_k                    (affine transformation: a translation by the bias)

v_k is the activation potential (induced local field).

Equivalently, absorb the bias by fixing x_0 = 1 and w_k0 = b_k:
v_k = sum_{j=0..m} w_kj x_j
y_k = phi(v_k)
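The affine transformation above can be sketched in a few lines; the weights, bias, and input values below are made-up illustrative numbers.

```python
# Minimal sketch of the neuron model above: v_k = sum_j w_kj x_j + b_k.
# All numeric values are hypothetical.

def activation_potential(x, w, b):
    """Affine transformation: linear combiner plus bias."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

x = [0.5, -1.0, 2.0]   # input signals x_1..x_3
w = [0.8, 0.2, -0.4]   # synaptic weights w_k1..w_k3
b = 0.1                # bias b_k

v = activation_potential(x, w, b)
print(v)  # 0.4 - 0.2 - 0.8 + 0.1 = -0.5
```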
Nonlinear → due to the activation function.

Activation function types:


Threshold function:
phi(v) = 1 if v >= 0, else 0.
Yes or no: the all-or-none property of the McCulloch–Pitts model.
Sigmoid function (S-shaped):

• Elegant (continuous), no hard decision
• Allows two-sided saturation

The logistic function is the most common example:
phi(v) = 1 / (1 + exp(-a v)), with a > 0 the slope parameter.
Output normalized to [0, 1] (max 1, min 0).
Signum function: output in [-1, 1].

Hyperbolic tangent function (also S-shaped): phi(v) = tanh(v), output in [-1, 1].
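The four activation function types above can be sketched as follows (a minimal illustration; the sample inputs are arbitrary):

```python
import math

def threshold(v):        # all-or-none (McCulloch–Pitts): output in {0, 1}
    return 1.0 if v >= 0 else 0.0

def logistic(v, a=1.0):  # sigmoid with slope parameter a > 0: output in [0, 1]
    return 1.0 / (1.0 + math.exp(-a * v))

def signum(v):           # output in {-1, 0, +1}, within [-1, 1]
    return 0.0 if v == 0 else math.copysign(1.0, v)

def tanh_act(v):         # S-shaped, output in [-1, 1]
    return math.tanh(v)

print(threshold(-0.3), logistic(0.0), signum(-2.0), tanh_act(0.0))
# 0.0 0.5 -1.0 0.0
```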

Probabilistic / stochastic neuron (McCulloch–Pitts):

The neuron state is binary: firing vs. not firing (output y), with
P(firing) = 1 / (1 + exp(-v/T)).

T denotes the noise level.
As T → 0, the neuron becomes deterministic.

• Allows noise to affect neuron firing.
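A quick simulation makes the role of T concrete. This sketch assumes the common ±1 output convention for firing vs. not firing; the test values are arbitrary.

```python
import math
import random

# Stochastic neuron: fires (y = +1) with probability P(v) = 1 / (1 + exp(-v/T)).
def fires(v, T, rng=random):
    p = 1.0 / (1.0 + math.exp(-v / T))
    return 1 if rng.random() < p else -1

random.seed(0)
v = 0.5
for T in (5.0, 0.01):  # high noise vs. nearly deterministic
    rate = sum(fires(v, T) == 1 for _ in range(10000)) / 10000
    print(T, rate)     # as T -> 0 the firing rate approaches the hard threshold
```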

Graphs representing networks of neurons

Signal-flow graphs are directed graphs with:
• Directed synaptic links (output = weight × input)
• Directed activation links (output = phi(input))
• Node outputs equal to the summation of incoming node signals
• A transfer function characterizing each link


Network Architectures

Single-layer feedforward network:
• Input layer of source nodes
• Output layer of computational nodes

The output layer is the single (counted) layer, since only it contains computational nodes.
A feedforward layered network.

Multilayer feedforward network:
• One or more hidden layers
• Fully connected: every node in one layer connects to every node in the next
• Hidden-layer computational nodes extract higher-order representations of the input
• The input is called the activation pattern
• The architecture is described by the number of nodes per layer (e.g. a 10-4-2 network)
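A forward pass through such a network can be sketched in a few lines. This is a minimal illustration of a fully connected 3-2-1 network with hypothetical weights and a logistic activation in every computational node.

```python
import math

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

def layer(x, W, b):
    # Each row of W holds the weights of one neuron in the layer.
    return [logistic(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

x = [1.0, 0.5, -1.0]                                    # activation pattern
W1, b1 = [[0.2, -0.4, 0.1], [0.5, 0.3, -0.2]], [0.0, 0.1]  # hidden layer
W2, b2 = [[1.0, -1.0]], [0.05]                          # output layer

hidden = layer(x, W1, b1)   # higher-order representation of the input
output = layer(hidden, W2, b2)
print(output)
```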

Recurrent network: at least one feedback loop.

Simplest case: no self-feedback loops, no hidden neurons.
More general: self-feedback loops and hidden neurons.
Assume operation on the unit circle of the z-plane, the region usually considered of interest since the Fourier transform is defined there.

For a single feedback loop with weight w, the output is
y(n) = sum_{l=1..∞} w^l x(n - l).

|w| < 1: output convergent, system stable.
|w| ≥ 1: output divergent (linearly for |w| = 1, exponentially for |w| > 1); system not stable.

Further, for |w| < 1 the sum may be truncated at N terms, since w^N ≪ 1:
• Without this truncation the system has infinite memory, but the truncation is just as good in practice.
• Memory fading is modeled!

Analysis is much more complicated in an RNN, since it contains nonlinear processing units!
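The stability condition is easy to see numerically. This sketch simulates the single linear feedback loop y(n) = w·y(n-1) + x(n); a unit impulse input then yields y(n) = w^n, so |w| < 1 decays and |w| > 1 blows up.

```python
# Impulse response of a single linear feedback loop with weight w.
def impulse_response(w, n_steps):
    y, out = 0.0, []
    for n in range(n_steps):
        x = 1.0 if n == 0 else 0.0  # unit impulse input
        y = w * y + x               # one unit-delay feedback loop
        out.append(y)
    return out

print(impulse_response(0.5, 5))  # converges: [1.0, 0.5, 0.25, 0.125, 0.0625]
print(impulse_response(2.0, 5))  # diverges:  [1.0, 2.0, 4.0, 8.0, 16.0]
```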

Knowledge Representation

Knowledge is stored information, and it is goal-directed!
• The possible forms in which the internal network parameters can represent information from the inputs are highly diverse.
Hence, knowledge representation is difficult!

Goal: appropriate learning of the underlying environment.

Two parts:
• Prior information
• Observations / measurements using sensors

Examples to train the network

Examples may be labeled or unlabeled.
Labeled: each input signal is paired with a desired response.
Training data / samples are costly; they need a teacher!

1. Learning / training on sufficiently many appropriately presented inputs (until convergence)
2. Testing on unseen data (inputs)

k-fold cross-validation
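A plain-Python sketch of the k-fold splitting idea (no ML library assumed): the data is divided into k folds, and each sample is held out for testing exactly once.

```python
# Generate (train, test) index splits for k-fold cross-validation.
def k_fold_indices(n_samples, k):
    folds = [list(range(n_samples))[i::k] for i in range(k)]  # k disjoint folds
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

for train, test in k_fold_indices(6, 3):
    print(sorted(train), test)
```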

• Learning can be through positive or negative examples.
• Knowledge is represented by free parameters such as synaptic weights (and biases).

How knowledge is actually represented inside an artificial network is, however, very complicated.
Correlation! How?
Embedding prior information

Ad hoc!
• Restrict the network architecture, allowing only local connections (receptive fields).
• Constrain the choice of synaptic weights (weight sharing).
The number of free parameters is (drastically) reduced, so less training is required!

Example: a partially connected feedforward network with the same set of weights for each neuron in a layer.

Receptive field of a neuron:
the region of the input (field) that influences the neuron's output.

Weight sharing in the example:
like a convolution sum, or rather a correlation.

Such networks were called convolutional networks (ConvNets) and evolved into the feature-learning part of CNNs.
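Weight sharing can be sketched in one dimension: every neuron applies the same weight vector to its own receptive field, which is exactly a (valid-mode) cross-correlation. The input and weights below are made-up numbers.

```python
# One shared weight set w, slid over the input field x (cross-correlation).
def shared_weight_layer(x, w):
    k = len(w)  # receptive-field size
    return [sum(w[j] * x[i + j] for j in range(k))
            for i in range(len(x) - k + 1)]

x = [1.0, 2.0, 3.0, 4.0, 5.0]     # input field
w = [0.5, -0.5]                   # shared receptive-field weights
print(shared_weight_layer(x, w))  # [-0.5, -0.5, -0.5, -0.5]
```

Each output neuron sees only two adjacent inputs (its receptive field), and all four neurons share the same two weights: four outputs from just two free parameters.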

Embedding invariances

From an image perspective:
- Rotation-invariant recognition
- Scale-invariant recognition
From a speech perspective: loudness-invariant word recognition.

Invariance by structure:
restrictions on the free parameters imposed through subsidiary computations.
*VERY DIFFICULT!

Invariance by training:
while training, present a number of examples of the same entity differing in the attribute(s) whose invariance is required.
*More computation; and how does it affect classification capability?

Invariant feature space:
the inputs are feature values extracted from the signal that are invariant to the required signal transformations. But how do we know which features?
Learning Process

Learning with a teacher: supervised learning.
The teacher's knowledge: input–output examples.

Error-correction learning:
minimize the mean (or sum) of squared errors.
Gradient descent on the error surface!

The free parameters converge (long-term memory) after sufficiently many examples; then the NN is left on its own.
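Error-correction learning can be sketched on a single linear neuron: gradient descent on the squared error gives the delta rule Δw = η·e·x. The training pairs below are made-up data generated by the relation d = 2x.

```python
# Gradient descent on squared error for one linear neuron (delta rule).
def train(samples, eta=0.1, epochs=50):
    w = 0.0
    for _ in range(epochs):
        for x, d in samples:   # d is the desired response (teacher)
            e = d - w * x      # error signal
            w += eta * e * x   # delta rule: step down the error surface
    return w

samples = [(1.0, 2.0), (2.0, 4.0), (-1.0, -2.0)]  # consistent with d = 2x
w = train(samples)
print(round(w, 3))  # the weight converges to 2.0
```

Once w has converged (long-term memory), the error is zero on every example and the neuron can be left on its own.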
Learning without a teacher

Reinforcement learning:
minimization of a scalar index of performance, the cumulative cost of actions.
The reward here may be delayed.

Unsupervised Learning

Self-organized!

• A task-independent measure of quality is used.
• In an NN, a "winner-takes-all" strategy can be employed to trigger a neuron in a layer.

Supervised learning is the most popular paradigm: statistical learning theory.
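The "winner-takes-all" strategy can be sketched as follows: only the neuron with the largest activation potential fires. The weights below are hypothetical, with each neuron tuned to one input component.

```python
# Competitive layer: exactly one neuron (the winner) outputs 1.
def winner_takes_all(x, W):
    v = [sum(wj * xj for wj, xj in zip(row, x)) for row in W]  # potentials
    winner = max(range(len(v)), key=v.__getitem__)
    return [1 if i == winner else 0 for i in range(len(v))]

x = [0.9, 0.1]
W = [[1.0, 0.0],   # neuron 0 tuned to the first input component
     [0.0, 1.0]]   # neuron 1 tuned to the second
print(winner_takes_all(x, W))  # [1, 0]
```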

