[Figure: two plots over features x1 (horizontal axis) and x2 (vertical axis), each showing a linear decision boundary.]
(c) Alexander Ihler
Perceptron Classifier (2 features)
A perceptron classifier computes a weighted sum of its inputs (plus a constant bias input of 1, weighted by w0), then thresholds it to produce a class decision:

    f = w1 x1 + w2 x2 + w0

    T(f) = 0 if f < 0
    T(f) = 1 if f > 0

The threshold function T maps the weighted sum f to an output class in {0, 1}.

[Figure: 1D example of the step function T(f).]
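A minimal sketch of this two-feature perceptron; the particular weight values below are illustrative assumptions, not from the lecture:

```python
def perceptron(x, w):
    """Threshold a weighted sum: T(f) with f = w1*x1 + w2*x2 + w0."""
    f = w[1] * x[0] + w[2] * x[1] + w[0]   # weighted sum plus bias w0
    return 1 if f > 0 else 0               # T(f): class decision in {0, 1}

# Hypothetical weights giving the decision boundary x1 + x2 - 1 = 0
w = [-1.0, 1.0, 1.0]
print(perceptron([2.0, 0.5], w))  # f = 1.5 > 0  -> class 1
print(perceptron([0.2, 0.3], w))  # f = -0.5 < 0 -> class 0
```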
Multi-layer perceptron model
Step functions are just perceptrons!
- Each feature F1, F2, F3 is the output of a perceptron applied to the input x1 (with its own weights, e.g. w10, w11 for F1)
- A combination of features is the output of another perceptron
- The output is a linear function of the features: Out = a F1 + b F2 + c F3 + d
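An illustrative sketch of this construction: each feature Fi is a step of a linear function of the input, and the output is the linear combination a F1 + b F2 + c F3 + d. All thresholds and coefficients below are hypothetical:

```python
def step(f):
    """Perceptron threshold T(f)."""
    return 1.0 if f > 0 else 0.0

def mlp_out(x1, a=1.0, b=-2.0, c=1.0, d=0.0):
    # Features: outputs of three perceptrons of the single input x1
    F1 = step(x1 - 1.0)   # turns on at x1 > 1
    F2 = step(x1 - 2.0)   # turns on at x1 > 2
    F3 = step(x1 - 3.0)   # turns on at x1 > 3
    return a * F1 + b * F2 + c * F3 + d   # linear function of the features

# The output is piecewise constant over successive intervals of x1:
print([mlp_out(x) for x in (0.5, 1.5, 2.5, 3.5)])  # -> [0.0, 1.0, -1.0, 0.0]
```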
- Perceptron: a step function / linear partition of the input features
- 2-layer: features are now partitions; the output is any linear combination of those partitions
- 3-layer: features are now complex functions; output any linear combination of those
- Current research: deep architectures (many layers)
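As an illustration of the 2-layer case, linear combinations of partitions can realize functions no single perceptron can, such as XOR. The weights below are hand-picked assumptions for the sketch:

```python
def T(f):
    """Perceptron threshold."""
    return 1 if f > 0 else 0

def xor_net(x1, x2):
    # Layer 1: two linear partitions of the input space
    h1 = T(x1 + x2 - 0.5)        # fires if at least one input is 1
    h2 = T(x1 + x2 - 1.5)        # fires only if both inputs are 1
    # Layer 2: thresholded linear combination of the partitions
    return T(h1 - 2 * h2 - 0.5)  # h1 AND NOT h2 == XOR

print([xor_net(a, b) for a, b in ((0, 0), (0, 1), (1, 0), (1, 1))])  # -> [0, 1, 1, 0]
```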
[Figure: a small feed-forward network: inputs x0, x1 feed Layer 1 hidden units h1, h2, h3, whose outputs are combined with weights v0, v1 to produce the output y.]
Neural networks
Another term for MLPs.
Biological motivation:
- Neurons are simple cells
- Dendrites sense charge
- The cell weighs its inputs (w1, w2, w3) and "fires" along the axon
Common activation functions: logistic, hyperbolic tangent, Gaussian, linear.
Information flows in one direction (feed-forward); alternative: recurrent NNs.
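The activation functions listed above can be sketched in their standard forms (the exact parameterizations used in the lecture are not shown, so these are assumptions):

```python
import numpy as np

def logistic(z):
    """Logistic sigmoid: squashes to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh_act(z):
    """Hyperbolic tangent: squashes to (-1, 1)."""
    return np.tanh(z)

def gaussian(z):
    """Gaussian 'bump': largest response near z = 0."""
    return np.exp(-z ** 2)

def linear(z):
    """Linear (identity) response."""
    return z

z = np.array([-2.0, 0.0, 2.0])
print(logistic(z))   # values increase monotonically from near 0 toward 1
```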
Feed-forward networks
A note on multiple outputs:
- Regression: predict multi-dimensional y; a shared representation means fewer parameters
- Classification: predict a binary vector
  - Multi-class classification: e.g., class y = 2 is encoded as [0 0 1 0]
  - Multiple, joint binary predictions (image tagging, etc.)

[Figure: a single-hidden-layer network mapping inputs through hidden units hj to outputs.]

The logistic sigmoid function is smooth and differentiable. Optimize using:
- Batch gradient descent
- Stochastic gradient descent
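The multi-class encoding above (y = 2 mapped to [0 0 1 0]) can be sketched as a one-hot indicator vector:

```python
import numpy as np

def one_hot(y, num_classes):
    """Encode class index y as a binary indicator vector."""
    v = np.zeros(num_classes, dtype=int)
    v[y] = 1
    return v

print(one_hot(2, 4))  # -> [0 0 1 0]
```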
# Forward pass.  X1, H1 denote X, H with a constant bias feature appended;
# S, T denote the pre-activation linear responses at each layer.
# X : (1 x N1)
H  = Sig( X1.dot(W[0].T) )      # W[0] : (N2 x N1+1);  H  : (1 x N2)
Yh = Sig( H1.dot(W[1].T) )      # W[1] : (N3 x N2+1);  Yh : (1 x N3)
# Backward pass:
B2 = (Y - Yh) * dSig(S)         # output-layer delta: (1 x N3)
G2 = B2.T.dot( H1 )             # gradient: (N3 x 1) * (1 x N2+1) = (N3 x N2+1)
B1 = B2.dot( W[1] ) * dSig(T)   # (1 x N3) . (N3 x N2+1) * (1 x N2+1)
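The forward/backward passes above can be wrapped in a stochastic-gradient-descent loop. The sketch below is a minimal self-contained variant (my own toy data and target, not the course code), training one hidden layer of sigmoid units:

```python
import numpy as np

def sig(z):  return 1.0 / (1.0 + np.exp(-z))
def dsig(z): return sig(z) * (1.0 - sig(z))    # derivative of the sigmoid

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))          # toy 2-feature inputs
Y = (X[:, 0] + X[:, 1] > 0).astype(float)      # illustrative target

W0 = rng.normal(0.0, 1.0, size=(3, 4))         # (inputs + bias) x hidden units
W1 = rng.normal(0.0, 1.0, size=(5, 1))         # (hidden + bias) x outputs

lr = 0.5
for epoch in range(100):
    for i in rng.permutation(len(X)):          # stochastic: one example at a time
        x1 = np.append(X[i], 1.0)              # input with bias feature
        s  = x1 @ W0                           # hidden pre-activation
        h1 = np.append(sig(s), 1.0)            # hidden layer with bias feature
        t  = h1 @ W1                           # output pre-activation
        b2 = (sig(t) - Y[i]) * dsig(t)         # output-layer delta
        b1 = (b2 @ W1[:-1].T) * dsig(s)        # backpropagated hidden-layer delta
        W1 -= lr * np.outer(h1, b2)            # SGD weight updates
        W0 -= lr * np.outer(x1, b1)

pred = [sig(np.append(sig(np.append(x, 1.0) @ W0), 1.0) @ W1)[0] > 0.5 for x in X]
acc = float(np.mean(np.array(pred) == Y))
print("training accuracy:", round(acc, 2))
```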
[Figure: training data and the learned prediction function, composed from hidden-unit responses h1, h2, h3 over the input x.]

Fix output, simulate inputs.
Software:
- PyLearn2: https://github.com/lisa-lab/pylearn2
- TensorFlow