ANN slides


- Background
- Artificial neurons, what they can and cannot do
- The multilayer perceptron (MLP)
- Three forms of learning
- The back propagation algorithm
- Radial basis function networks
- Competitive learning (and relatives)

An artificial neuron

[Figure: inputs x0 = +1 (bias), x1, x2, …, xn, with weights w0, w1, w2, …, wn, feeding a summing node]

y = f(S)

S = w0 + Σ_{i=1..n} w_i x_i = Σ_{i=0..n} w_i x_i

f(S) = 1 / (1 + e^(−S))
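As a minimal sketch of these definitions (the weight values below are made up for illustration), a sigmoid neuron in Python:

```python
import math

def neuron(x, w):
    """Artificial neuron: S = sum of w_i * x_i (with x0 = +1 as the
    constant bias input), passed through f(S) = 1 / (1 + e^(-S))."""
    S = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-S))

# Hypothetical weights implementing a fuzzy AND of two binary inputs:
# the output is high only when both inputs are 1.
w_and = [-15.0, 10.0, 10.0]   # w0 (bias), w1, w2
print(neuron([1, 1], w_and))  # ~0.993
print(neuron([0, 1], w_and))  # ~0.007
```

With the signs of w1, w2 flipped and the bias raised, the same unit implements a fuzzy NOR instead.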

The neuron can be used as a classifier: y < 0.5 → class 0, y > 0.5 → class 1. The linear discriminant is a hyperplane; in a 2D example, a line:

x2 = −(w1 x1 + w0) / w2

Only linearly separable classification problems can be solved. A problem that is not linearly separable requires combining two linear discriminants.

Two sigmoids can implement fuzzy AND and NOR. [Figure: decision boundaries and truth-table corners for AND and NOR of x1, x2]

Neural networks:

- store information in the weights, not in the nodes
- are trained, by adjusting the weights, not programmed
- can generalize to previously unseen data
- are adaptive
- are fast computational devices, well suited for parallel simulation and/or hardware implementation
- are fault tolerant

[Figure: a multilayer perceptron, inputs on the left, outputs on the right]

The MLP can implement any function, given a sufficiently rich internal structure (number of nodes and layers).

Application areas

- Finance: forecasting, fraud detection
- Medicine: image analysis
- Consumer market: household equipment, character recognition, speech recognition
- Industry: adaptive control, signal analysis, data mining

Why neural networks? (Statistical methods are always at least as good, right?) Neural networks are statistical methods. Reasons to use them anyway:

- Model independence
- Adaptivity/flexibility
- Concurrency
- Economical reasons (rapid prototyping)

Three forms of learning:

- Supervised learning (e.g. back propagation): the learning system receives an input, produces an output (y), and is told the desired output (d); an error function of y and d drives the learning.
- Unsupervised learning: the system receives only inputs, with no target function or error signal, and must find structure in the data on its own.
- Reinforcement learning: an agent observes the state of an environment, an action selector suggests actions, and the environment returns a reward that guides learning.

Each weight w_ji contributes to the error E; the contribution is the partial derivative ∂E/∂w_ji. The weight should be moved in proportion to that contribution, but in the other direction:

Δw_ji = −η ∂E/∂w_ji

The network is initialised with small random weights. Split the data in two: a training set and a test set.

- The training set is used for training and is passed through many times. Either update the weights after each presentation (pattern learning), or accumulate the weight changes (Δw) until the end of the training set is reached (epoch or batch learning).
- The test set is used to test for generalization (to see how well the net does on previously unseen data). This is the result that counts!

Assumptions: the error is the squared error

E = (1/2) Σ_{j=1..n} (d_j − y_j)²

and each node computes the sigmoid

y_j = f(S_j) = 1 / (1 + e^(−S_j))

The resulting weight update is

Δw_ji = η δ_j x_i

where, for an output node,

δ_j = (d_j − y_j) y_j (1 − y_j)    (derivative of the error × derivative of the sigmoid)

and, for a hidden node,

δ_j = y_j (1 − y_j) Σ_k δ_k w_kj    (sum over all nodes k in the next layer, closer to the outputs)
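A minimal sketch of these update rules for a small MLP; the network size (2-3-1), learning rate, epoch count, and the XOR task are illustrative choices, not from the slides:

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

random.seed(1)
eta = 0.5
n_hid = 3
# Weight index 0 is the bias weight, for the constant input x0 = +1.
w_hid = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(n_hid)]
w_out = [random.uniform(-1, 1) for _ in range(n_hid + 1)]

data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]  # XOR

def forward(x):
    h = [sigmoid(wj[0] + wj[1] * x[0] + wj[2] * x[1]) for wj in w_hid]
    y = sigmoid(w_out[0] + sum(w * hj for w, hj in zip(w_out[1:], h)))
    return h, y

def total_error():
    return 0.5 * sum((d - forward(x)[1]) ** 2 for x, d in data)

for epoch in range(10000):                 # pattern learning
    for x, d in data:
        h, y = forward(x)
        d_out = (d - y) * y * (1 - y)      # delta for the output node
        # hidden deltas: y_j (1 - y_j) times the sum over the next layer
        d_hid = [h[j] * (1 - h[j]) * d_out * w_out[j + 1]
                 for j in range(n_hid)]
        # weight changes: delta_w_ji = eta * delta_j * x_i
        w_out[0] += eta * d_out
        for j in range(n_hid):
            w_out[j + 1] += eta * d_out * h[j]
            w_hid[j][0] += eta * d_hid[j]
            w_hid[j][1] += eta * d_hid[j] * x[0]
            w_hid[j][2] += eta * d_hid[j] * x[1]
```

XOR is the standard example of a problem a single linear discriminant cannot solve, which is why a hidden layer is needed here.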

Overtraining

[Figure: typical error curves over time (epochs): the training set error keeps decreasing, while the test or validation set error reaches a minimum and then rises]

Overtraining is more likely to occur:

- if we train on too little data
- if the network has too many hidden nodes
- if we train for too long

Cross validation: use a third set, a validation set, to decide when to stop (find the minimum error for this set, and retrain for that number of epochs).

Network size

The network should be slightly larger than the size necessary to represent the target function. Unfortunately, the target function is unknown ... We need much more training data than the number of weights!

1. Start with a small network, train, increase the size, train again, etc., until the error on the training set can be reduced to acceptable levels.
2. If an acceptable error level was found, increase the size by a few percent and retrain, this time using the cross-validation procedure to decide when to stop. Publish the result on the independent test set.
3. If the network failed to reduce the error on the training set, despite a large number of nodes and attempts, something is likely to be wrong with the data.

Practical considerations

- What happens if the mapping represented by the data is not a function? For example, what if the same input does not always lead to the same output?
- In what order should data be presented? Sequentially? At random?
- How should data be represented? Compact? Distributed?
- What can be done about missing data?

Trick of the trade: monotonic functions are easier to learn than non-monotonic functions (at least for the MLP)!

Radial basis function (RBF) networks

- Layered structure, like the MLP, with one hidden layer
- Output nodes are conventional
- Each hidden node measures the distance between its weight vector and the input vector (instead of computing a weighted sum)

Geometric interpretation: the input space is covered with overlapping Gaussians.

[Figure: an RBF network, inputs feeding a layer of Gaussian nodes feeding the outputs]

RBF training

Could use backprop (the transfer function is still differentiable). Better: train the layers separately.

- Hidden layer: find the position and size of the Gaussians by unsupervised learning (e.g. competitive learning, K-means)
- Output layer: supervised, e.g. the delta rule, LMS, backprop

RBF vs. MLP

- RBF (hidden) nodes work in a local region; MLP nodes are global
- MLPs do better in high-dimensional spaces
- MLPs require fewer nodes and generalize better
- RBFs can learn faster
- RBFs are less sensitive to the order in which data is presented
- RBFs make fewer false-yes classification errors
- MLPs extrapolate better
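A minimal sketch of the two-stage idea: hand-placed Gaussian centres stand in for the unsupervised stage, and the delta rule trains the output layer. The centres, width, learning rate, and the XOR task are illustrative assumptions:

```python
import math

# Hidden layer: Gaussian nodes with hand-placed centres (in a full
# RBF net these would come from K-means or competitive learning).
centers = [(0.0, 1.0), (1.0, 0.0)]
width = 0.7

def hidden(x):
    """Each hidden node measures distance to its centre."""
    return [math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c))
                     / (2 * width ** 2)) for c in centers]

# Output layer: a conventional linear node.
w = [0.0, 0.0, 0.0]  # bias + one weight per hidden node

def output(x):
    h = hidden(x)
    return w[0] + sum(wi * hi for wi, hi in zip(w[1:], h))

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]  # XOR

eta = 0.2
for _ in range(2000):            # delta rule on the output layer only
    for x, d in data:
        h = hidden(x)
        y = w[0] + sum(wi * hi for wi, hi in zip(w[1:], h))
        w[0] += eta * (d - y)
        for j, hj in enumerate(h):
            w[j + 1] += eta * (d - y) * hj
```

With centres at (0, 1) and (1, 0), the two Gaussian responses make XOR linearly separable for the output node, so the delta rule alone suffices.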

Unsupervised learning

Classifying unlabeled data: nearest-neighbour classifiers assign the unknown sample (vector) x to the class of its closest previously classified neighbour.

[Figure: the new pattern, x, is classified as the class of its nearest neighbour]

- Problem 1: the closest neighbour may be an outlier from the wrong class
- Problem 2: we must store lots of samples and compute the distance to each one, for every new sample

K-means

K-means, for K = 2:

1. Make a codebook of two vectors, c1 and c2
2. Sample (at random) two vectors from the data as initial values of c1 and c2
3. Split the data in two subsets, D1 and D2, where D1 is the set of all points with c1 as their closest codebook vector, and vice versa
4. Move c1 towards the mean of D1 and c2 towards the mean of D2
5. Repeat from 3 until convergence (until the codebook vectors stop moving)
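The K = 2 procedure above, as a small sketch (the data set and iteration count are made up for illustration):

```python
import random

def dist2(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def mean(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def kmeans2(data, iters=20, seed=0):
    """K-means for K = 2: sample two codebook vectors from the data,
    split the data by closest codebook vector, move each vector to
    the mean of its subset, and repeat."""
    c1, c2 = random.Random(seed).sample(data, 2)
    for _ in range(iters):
        D1 = [x for x in data if dist2(x, c1) <= dist2(x, c2)]
        D2 = [x for x in data if dist2(x, c1) > dist2(x, c2)]
        if D1:
            c1 = mean(D1)
        if D2:
            c2 = mean(D2)
    return c1, c2

# Two well-separated clusters; the codebook vectors should end up
# near (0.07, 0.07) and (5.07, 5.07).
data = [(0.0, 0.1), (0.1, 0.0), (0.1, 0.1),
        (5.0, 5.1), (5.1, 5.0), (5.1, 5.1)]
c1, c2 = kmeans2(data)
```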

Voronoi regions

K-means forms so-called Voronoi regions in the input space. The Voronoi region around a codebook vector ci is the region in which ci is the closest codebook vector.

Competitive learning

M linear nodes without thresholds (only weighted sums), N inputs.

1. Present a pattern (sample), x
2. The node with the largest output (node k) is declared the winner
3. The weights of the winner are updated so that it will become even stronger the next time the same pattern is presented; all other weights are left unchanged:

Δw_ki = η (x_i − w_ki),  1 ≤ i ≤ N

With normalised weights, this is equivalent to finding the node with the minimum distance between its weight vector and the input vector. Network node = codebook vector.

[Figure: Voronoi regions around 10 codebook vectors]
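A sketch of this rule, using the minimum-distance formulation and weights initialised from data points (the sample set and learning rate are illustrative):

```python
import random

def winner(W, x):
    """Winner = node with the minimum distance between its weight
    vector and the input (equivalent to the largest output when the
    weights are normalised)."""
    return min(range(len(W)), key=lambda j: sum((xi - wi) ** 2
                                                for xi, wi in zip(x, W[j])))

def competitive_step(W, x, eta=0.1):
    """Move only the winner's weights towards the input:
    delta_w_ki = eta * (x_i - w_ki); all other weights unchanged."""
    k = winner(W, x)
    W[k] = [wi + eta * (xi - wi) for wi, xi in zip(W[k], x)]
    return k

random.seed(0)
samples = [(0.0, 0.0), (0.1, 0.1), (5.0, 5.0), (5.1, 5.1)]
W = [list(samples[0]), list(samples[2])]  # initialise from the data
for _ in range(500):
    competitive_step(W, random.choice(samples))
# Each codebook vector settles inside one of the two clusters.
```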

Problem with competitive learning: a node may become invincible.

[Figure: data clusters A and B, with the weight vectors initialised in a region W far from both]

Poor initialisation: the weight vectors have been initialised to small random numbers (in W), but these are far from the data (A and B). The first node to win will move from W towards A or B and will always win from then on. Solutions: use the data to initialise the weights (as in K-means), or include the winning frequency in the distance measure, or move more nodes than only the winner.

The cerebral cortex is a two-dimensional structure, yet we can reason in more than two dimensions. Different neurons in the auditory cortex respond to different frequencies, and these neurons are located in frequency order: topological preservation / a topographic map.

Dimensional reduction

Non-linear, topologically preserving, dimensional reduction (like pressing a flower)


SOM

Competitive learning, extended in two ways:

1. The nodes are organised in a two-dimensional grid (in competitive learning, there is no defined order between nodes). [Figure: a 3×3 grid, making a two-dimensional map of a four-dimensional input space]

2. Find the winner, node k, and then update all weights by:

Δw_ji = η f(j, k) (x_i − w_ji),  1 ≤ i ≤ N

(not only the winner is updated, but also its closest neighbours in the grid)

f(j, k) is a neighbourhood function in the range [0, 1], with a maximum for the winner (j = k) and decreasing with distance from the winner, e.g. a Gaussian. Gradually decrease the neighbourhood radius (the width of the Gaussians) and the learning rate (η) over time.

Result: vectors that are close in the high-dimensional input space will activate areas that are close on the grid.
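A sketch of this update (the grid size, learning rate, neighbourhood width, and the linear decay schedule are all illustrative assumptions):

```python
import math
import random

def som_step(W, grid, x, eta, sigma):
    """SOM update: find the winner k, then move every node j towards
    the input, weighted by a Gaussian neighbourhood function f(j, k)
    of the grid distance between node j and the winner."""
    k = min(range(len(W)), key=lambda j: sum((xi - wi) ** 2
                                             for xi, wi in zip(x, W[j])))
    for j in range(len(W)):
        g2 = sum((a - b) ** 2 for a, b in zip(grid[j], grid[k]))
        f = math.exp(-g2 / (2 * sigma ** 2))   # f(j, k) in [0, 1]
        W[j] = [wi + eta * f * (xi - wi) for wi, xi in zip(W[j], x)]

# A 3x3 grid mapping a two-dimensional input space.
random.seed(0)
grid = [(r, c) for r in range(3) for c in range(3)]
W = [[random.random(), random.random()] for _ in grid]
for t in range(1000):
    # gradually decrease the neighbourhood radius and learning rate
    eta = 0.5 * (1 - t / 1000)
    sigma = 1.5 * (1 - t / 1000) + 0.1
    som_step(W, grid, [random.random(), random.random()], eta, sigma)
```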

A 10×10 SOM is trained on a chemical analysis of 178 wines from one region in Italy, where the grapes have grown on three different types of soil. The input is 13-dimensional. After training, wines from different soil types activate different regions of the SOM.

Note that the network is not told that the difference between the wines is the soil type, nor how many such types (how many classes) there are.

Another example: http://websom.hut.fi — a two-dimensional, clickable map of Usenet news articles (from comp.ai.neural-nets).

Growing neural gas (GNG)

- Growing unsupervised network (starting from two nodes)
- Dynamic neighbourhood
- Constant parameters
- Very good at following moving targets; can also follow jumping targets
- Current work: using GNG to define and train the hidden layer of Gaussians in an RBF network

Node positions

- Start with two nodes
- Each node has a set of neighbours, indicated by edges
- The edges are created and destroyed dynamically during training
- For each sample, the closest node, k, and all its current neighbours are moved towards the input

Node creation

A new node (blue) is created every λ-th time step, unless the maximum number of nodes has been reached. The new node is placed halfway between the node with the greatest error and the node among its current neighbours with the greatest error; the node with the greatest error is the most unstable one.

[Figure: a fourth node has just been created]

In effect, new nodes are created close to where they are most likely needed. The exact position of the new node is not crucial, since nodes move around.

[Figure: after a while, the network has grown to 7 nodes]

Neighbourhood

Neighbourhood edges are created and destroyed as follows. For each sample, let k denote the winner (the node closest to the sample) and r the runner-up (the second closest):

- If an edge exists between k and r, reset its age to 0; otherwise, create such an edge and set its age to 0
- Increment the age of all other edges emanating from node k
- Edges older than a_max are removed, as are any nodes that in this way lose their last remaining edge

Delaunay triangulation

Connecting the codebook vectors in all adjacent Voronoi regions gives the Delaunay triangulation.

Dead units

There is only one way for an edge to get younger: when the two nodes it interconnects are the two closest to the input. If one of the two nodes wins, but the other one is not the runner-up, then, and only then, the edge ages. If neither of the two nodes wins, the edge does not age!

[Figure: the input distribution has jumped from the lower-left to the upper-right corner, and the network follows]

The lab

(In room 1515!) Classification of bitmaps, by supervised learning (back propagation), using the SNNS simulator. An illustration of some unsupervised learning algorithms, using the GNG demo applet:

- LBG/LBG-U (≈ K-means)
- HCL (hard competitive learning)
- Neural gas
- CHL (competitive Hebbian learning)
- Neural gas with CHL
- GNG/GNG-U (growing neural gas)
- SOM (self-organising map)
- Growing grid
