Naïve Bayes Classifier
Ke Chen
http://intranet.cs.man.ac.uk/mlo/comp20411/
Modified and extended by Longin Jan Latecki
latecki@temple.edu
Outline
Background
Probability Basics
Probabilistic Classification
Naïve Bayes
Example: Play Tennis
Relevant Issues
Conclusions
Background
There are three methods to establish a classifier:
a) Model a classification rule directly
   Examples: k-NN, decision trees, perceptron, SVM
b) Model the probability of class memberships given input data, i.e. a discriminative model P(C|X)
c) Make a probabilistic model of the data within each class, i.e. a generative model P(X|C)
   Example: naïve Bayes
Probability Basics
Prior, conditional and joint probability
  Prior probability: P(X)
  Conditional probability: P(X1|X2), P(X2|X1)
  Joint probability: P(X1, X2) = P(X2|X1) P(X1) = P(X1|X2) P(X2)
  Independence: P(X2|X1) = P(X2), P(X1|X2) = P(X1), P(X1, X2) = P(X1) P(X2)
Bayesian rule
  P(C|X) = P(X|C) P(C) / P(X)
  i.e. Posterior = Likelihood × Prior / Evidence
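As a quick numeric sanity check of the Bayesian rule, the posterior can be computed by normalizing likelihood × prior by the evidence (the probability values below are made up purely for illustration):

```python
# Bayes' rule: P(C|X) = P(X|C) P(C) / P(X), with the evidence expanded
# as P(X) = sum over classes c of P(X|c) P(c).
prior = {"c1": 0.6, "c2": 0.4}        # P(C)   (made-up numbers)
likelihood = {"c1": 0.2, "c2": 0.5}   # P(X|C) (made-up numbers)

evidence = sum(likelihood[c] * prior[c] for c in prior)            # P(X)
posterior = {c: likelihood[c] * prior[c] / evidence for c in prior}

print(posterior)  # the posteriors sum to 1
```

Note that the evidence P(X) is the same for every class, which is why the MAP rule later in the slides can compare the unnormalized products directly.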
Probabilistic Classification
Establishing a probabilistic model for classification
  Discriminative model: P(C|X), C = c1, ..., cL, X = (X1, ..., Xn)
  Generative model: P(X|C), C = c1, ..., cL, X = (X1, ..., Xn)
Apply the Bayesian rule to convert a generative model into the posterior:
  P(C|X) = P(X|C) P(C) / P(X)
[Figure: feature histograms P(x) for two classes C1 and C2, and the resulting posterior probability P(C|x). Slide by Stephen Marsland]
Naïve Bayes
Bayes classification
  P(C|X) ∝ P(X|C) P(C) = P(X1, ..., Xn|C) P(C)
  Difficulty: learning the joint probability P(X1, ..., Xn|C)
Naïve Bayes classification
  Assume all input attributes are conditionally independent given the class:
  P(X1, ..., Xn|C) = P(X1|C) P(X2, ..., Xn|C) = P(X1|C) P(X2|C) ... P(Xn|C)
  MAP classification rule: assign x = (x1, ..., xn) to class c* if
  [P(x1|c*) ... P(xn|c*)] P(c*) > [P(x1|c) ... P(xn|c)] P(c), c ≠ c*, c = c1, ..., cL
Naïve Bayes
Naïve Bayes Algorithm (for discrete input attributes)
  Learning Phase: from the training examples, estimate the priors P(C = ci) and the conditional probability tables P(Xj = ajk|C = ci)
  Test Phase: given an unknown instance X = (a1, ..., an), look up the tables to assign the label c* to X if
  [P(a1|c*) ... P(an|c*)] P(c*) > [P(a1|c) ... P(an|c)] P(c), c ≠ c*, c = c1, ..., cL
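The two phases above can be sketched in Python. This is a minimal illustration, not code from the slides; the names train_nb and classify_nb are our own, and the estimates are plain frequency counts with no smoothing:

```python
from collections import Counter, defaultdict

def train_nb(examples, labels):
    """Learning phase: estimate P(C=c) and P(X_j=a | C=c) by frequency counts."""
    n = len(labels)
    class_count = Counter(labels)
    prior = {c: k / n for c, k in class_count.items()}
    counts = {c: defaultdict(Counter) for c in class_count}  # counts[c][j][a]
    for x, c in zip(examples, labels):
        for j, a in enumerate(x):
            counts[c][j][a] += 1
    likelihood = {c: {j: {a: cnt / class_count[c] for a, cnt in col.items()}
                      for j, col in feats.items()}
                  for c, feats in counts.items()}
    return prior, likelihood

def classify_nb(x, prior, likelihood):
    """Test phase: MAP rule, prior times the product of per-attribute conditionals."""
    def score(c):
        p = prior[c]
        for j, a in enumerate(x):
            p *= likelihood[c][j].get(a, 0.0)  # unseen value -> zero probability
        return p
    return max(prior, key=score)
```

Note the `.get(a, 0.0)` in the test phase: an attribute value never seen with a class zeroes the whole product, which is exactly the zero-probability problem discussed later.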
Example: Play Tennis
Training examples: 14 days, each described by Outlook, Temperature, Humidity and Wind, labelled Play = Yes (9 days) or Play = No (5 days).
Learning Phase

P(Outlook=o | Play=b):
  Outlook    Play=Yes  Play=No
  Sunny        2/9       3/5
  Overcast     4/9       0/5
  Rain         3/9       2/5

P(Temperature=t | Play=b):
  Temperature  Play=Yes  Play=No
  Hot            2/9       2/5
  Mild           4/9       2/5
  Cool           3/9       1/5

P(Humidity=h | Play=b):
  Humidity   Play=Yes  Play=No
  High         3/9       4/5
  Normal       6/9       1/5

P(Wind=w | Play=b):
  Wind     Play=Yes  Play=No
  Strong     3/9       3/5
  Weak       6/9       2/5

Priors: P(Play=Yes) = 9/14, P(Play=No) = 5/14
Example
Test Phase
Given a new instance x = (Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong)
Look up the tables:
  P(Outlook=Sunny|Play=Yes) = 2/9      P(Outlook=Sunny|Play=No) = 3/5
  P(Temperature=Cool|Play=Yes) = 3/9   P(Temperature=Cool|Play=No) = 1/5
  P(Humidity=High|Play=Yes) = 3/9      P(Humidity=High|Play=No) = 4/5
  P(Wind=Strong|Play=Yes) = 3/9        P(Wind=Strong|Play=No) = 3/5
  P(Play=Yes) = 9/14                   P(Play=No) = 5/14
MAP rule:
  P(Yes|x) ∝ [P(Sunny|Yes) P(Cool|Yes) P(High|Yes) P(Strong|Yes)] P(Play=Yes) = 0.0053
  P(No|x) ∝ [P(Sunny|No) P(Cool|No) P(High|No) P(Strong|No)] P(Play=No) = 0.0206
Since 0.0053 < 0.0206, we label x as Play = No.
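The arithmetic of this test phase can be verified with exact fractions, using the table values from the learning phase above:

```python
from fractions import Fraction as F

# Prior times the product of the looked-up conditional probabilities
p_yes = F(9, 14) * F(2, 9) * F(3, 9) * F(3, 9) * F(3, 9)  # P(Yes|x), unnormalized
p_no  = F(5, 14) * F(3, 5) * F(1, 5) * F(4, 5) * F(3, 5)  # P(No|x),  unnormalized

print(round(float(p_yes), 4), round(float(p_no), 4))  # 0.0053 0.0206
print("Play =", "Yes" if p_yes > p_no else "No")       # Play = No
```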
Relevant Issues
Violation of the Independence Assumption
  For many real-world tasks, P(X1, ..., Xn|C) ≠ P(X1|C) ... P(Xn|C); nevertheless, naïve Bayes often performs well in practice.
Zero conditional probability problem
  If no training example of class ci contains the attribute value Xj = ajk, then P(Xj = ajk|C = ci) = 0.
  In this circumstance, the whole product P(x1|ci) ... P(ajk|ci) ... P(xn|ci) = 0 during test, no matter how probable the other attribute values are.
  Remedy: smooth the estimates with the m-estimate:
  P(Xj = ajk|C = ci) = (nc + m p) / (n + m)
  where n is the number of training examples with C = ci, nc the number of those that also have Xj = ajk, p a prior estimate (e.g. uniform: p = 1/t for t possible values of Xj), and m the weight given to the prior.
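The zero-probability problem is commonly handled by smoothing the frequency estimate with the m-estimate, (nc + m·p) / (n + m). A minimal Python sketch (the function name and signature are our own):

```python
def m_estimate(examples, labels, j, value, c, m=1.0, p=None):
    """Smoothed estimate of P(X_j = value | C = c): (n_c + m*p) / (n + m).

    n   : number of training examples with class c
    n_c : number of those that also have X_j = value
    p   : prior estimate, default uniform over the values of attribute j
    m   : weight (equivalent sample size) given to the prior
    """
    if p is None:
        p = 1.0 / len({x[j] for x in examples})
    n = sum(1 for lbl in labels if lbl == c)
    n_c = sum(1 for x, lbl in zip(examples, labels) if lbl == c and x[j] == value)
    return (n_c + m * p) / (n + m)
```

With m = 0 this reduces to the raw frequency nc/n; with m > 0 an attribute value never seen with a class still receives a small nonzero probability.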
Homework
Redo the test phase example above using the m-estimate formula for zero conditional probabilities with m = 1.
Compute P(Play=Yes|x) and P(Play=No|x) with m = 0 and with m = 1 for
x = (Outlook=Overcast, Temperature=Cool, Humidity=High, Wind=Strong)
Relevant Issues
Continuous-valued Input Attributes
  Conditional probabilities are modelled with a normal (Gaussian) distribution:
  P(Xj|C = ci) = 1 / (sqrt(2π) σji) · exp(−(Xj − μji)² / (2 σji²))
  where μji and σji are the mean and standard deviation of attribute Xj in class ci.
Learning Phase: for X = (X1, ..., Xn), C = c1, ..., cL
  Output: n × L normal distributions (estimates of μji and σji) and the priors P(C = ci), i = 1, ..., L
Test Phase: for X = (X1, ..., Xn)
  Calculate conditional probabilities with all the normal distributions
  Apply the MAP rule to make a decision
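For continuous attributes, the learning and test phases above can be sketched as a Gaussian naïve Bayes in Python (a minimal illustration with function names of our own choosing):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Normal density P(X_j | C = c_i) with class-conditional mean and std."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def fit_gaussian_nb(X, y):
    """Learning phase: per class, estimate (mu, sigma) for every attribute, plus the prior."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, lbl in zip(X, y) if lbl == c]
        stats = []
        for j in range(len(X[0])):
            col = [r[j] for r in rows]
            mu = sum(col) / len(col)
            var = sum((v - mu) ** 2 for v in col) / len(col)
            stats.append((mu, math.sqrt(var)))
        model[c] = (len(rows) / len(X), stats)  # (prior, per-attribute Gaussians)
    return model

def predict_gaussian_nb(model, x):
    """Test phase: MAP rule over prior times the product of normal densities."""
    def score(c):
        prior, stats = model[c]
        p = prior
        for xj, (mu, sigma) in zip(x, stats):
            p *= gaussian_pdf(xj, mu, sigma)
        return p
    return max(model, key=score)
```

This uses the maximum-likelihood (biased) variance estimate; in practice a small floor on sigma is often added so that a zero-variance attribute does not break the density.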
Conclusions
Naïve Bayes is based on the conditional independence assumption: learning reduces to estimating the prior P(C) and the per-attribute conditionals P(Xj|C), and testing reduces to applying the MAP rule.