
Some applications of Bayesian networks

Jiří Vomlel

Institute of Information Theory and Automation


Academy of Sciences of the Czech Republic

This presentation is available at


http://www.utia.cas.cz/vomlel/

1
Contents
• Brief introduction to Bayesian networks

• Typical tasks that can be solved using Bayesian networks

• 1: Medical diagnosis (a very simple example)

• 2: Decision making maximizing expected utility (another simple example)

• 3: Adaptive testing (a case study)

• 4: Decision-theoretic troubleshooting (a commercial product)

2
Bayesian network
• a directed acyclic graph G = (V, E)

• each node i ∈ V corresponds to a random variable Xi with a finite set 𝒳i of mutually exclusive states

• pa(i) denotes the set of parents of node i in graph G

• to each node i ∈ V corresponds a conditional probability table P(Xi | (Xj)j∈pa(i))

• the DAG implies conditional independence relations between (Xi)i∈V

• d-separation (Pearl, 1986) can be used to read the CI relations from the DAG

3
Assume an ordering of Xi, i ∈ V such that if j ∈ pa(i) then j < i.

Using the chain rule we have that:

P((Xi)i∈V) = ∏i∈V P(Xi | Xi−1, . . . , X1)

From the DAG we can read the conditional independence relations

Xi ⊥⊥ Xk | (Xj)j∈pa(i)   for i ∈ V, k < i, and k ∉ pa(i).

Using these conditional independence relations we get

P((Xi)i∈V) = ∏i∈V P(Xi | (Xj)j∈pa(i)).

This is the joint probability distribution represented by the Bayesian network.

4
Example:

[Figure: a DAG over X1, . . . , X9 with CPTs P(X1), P(X2), P(X3 | X1), P(X4 | X2), P(X5 | X1), P(X6 | X3, X4), P(X7 | X5), P(X8 | X7, X6), P(X9 | X6).]

P(X1, . . . , X9)
= P(X9 | X8, . . . , X1) · P(X8 | X7, . . . , X1) · . . . · P(X2 | X1) · P(X1)
= P(X9 | X6) · P(X8 | X7, X6) · P(X7 | X5) · P(X6 | X4, X3)
  · P(X5 | X1) · P(X4 | X2) · P(X3 | X1) · P(X2) · P(X1)
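To make the factorization concrete, here is a minimal Python sketch: it stores one CPT per node and evaluates the product above. All CPT entries are made-up random numbers, and the variable/parent encoding is illustrative, not from the slides.

```python
import itertools
import random

random.seed(0)

# parents[i] lists pa(i), read off the DAG in the figure above.
parents = {1: (), 2: (), 3: (1,), 4: (2,), 5: (1,),
           6: (3, 4), 7: (5,), 8: (7, 6), 9: (6,)}

# One CPT per node: for every parent configuration, a (made-up)
# distribution over the node's two states 0 and 1.
cpt = {}
for i, pa in parents.items():
    cpt[i] = {}
    for cond in itertools.product((0, 1), repeat=len(pa)):
        p1 = random.random()
        cpt[i][cond] = {1: p1, 0: 1.0 - p1}

def joint(x):
    """P(X1=x[1], ..., X9=x[9]) as the product of the CPT entries."""
    p = 1.0
    for i, pa in parents.items():
        cond = tuple(x[j] for j in pa)
        p *= cpt[i][cond][x[i]]
    return p

# Sanity check: the factorized joint sums to 1 over all 2^9 configurations.
total = sum(joint(dict(enumerate(bits, start=1)))
            for bits in itertools.product((0, 1), repeat=9))
print(round(total, 10))  # -> 1.0
```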

5
Typical use of Bayesian networks
• to model and explain a domain

• to update beliefs about states of certain variables when some other variables are observed, i.e., computing conditional probability distributions, e.g., P(X23 | X17 = yes, X54 = no)

• to find the most probable configurations of variables

• to support decision making under uncertainty

• to find good strategies for solving tasks in a domain with uncertainty
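As an illustration of belief updating, here is a brute-force sketch on a tiny hypothetical chain X1 → X2 → X3 with made-up CPTs. Real tools such as Hugin use junction-tree propagation rather than enumeration, but the quantity computed is the same.

```python
# P(X3 | X1 = 1) by enumerating the joint and normalising.
p1        = {1: 0.3, 0: 0.7}                             # P(X1)
p2_given1 = {1: {1: 0.9, 0: 0.1}, 0: {1: 0.2, 0: 0.8}}   # P(X2 | X1)
p3_given2 = {1: {1: 0.7, 0: 0.3}, 0: {1: 0.4, 0: 0.6}}   # P(X3 | X2)

def joint(x1, x2, x3):
    return p1[x1] * p2_given1[x1][x2] * p3_given2[x2][x3]

# P(X3, X1 = 1), marginalising out X2, then normalise by P(X1 = 1).
num = {x3: sum(joint(1, x2, x3) for x2 in (0, 1)) for x3 in (0, 1)}
z = sum(num.values())
posterior = {x3: p / z for x3, p in num.items()}
print(posterior)
```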

6
Simplified diagnostic example
We have a patient.
Possible diagnoses: tuberculosis, lung cancer, bronchitis.

7
We don’t know anything about the patient. ... Patient is a smoker.

8
Patient is a smoker. ... and he complains about dyspnoea

9
Patient is a smoker and complains about dyspnoea ... and his X-ray is positive

10
Patient is a smoker and complains about dyspnoea and his X-ray is positive ... and he visited Asia recently

11
Application 2: Decision making
The goal: maximize expected utility

Hugin example: mildew4.net

12
Fixed and Adaptive Test Strategies
[Figure: left, a fixed test strategy, in which the questions are asked in the fixed order Q1, Q2, . . . , Q10; right, an adaptive test strategy, a binary tree that starts with Q5 and branches on whether the answer was wrong or correct (wrong → Q4, correct → Q8, and so on down to the final question).]

13
[Figure: a strategy tree over steps X1, X2, X3.]

For all nodes n of a strategy s we have defined:

• evidence en, i.e., the outcomes of the steps performed to get to node n,

• the probability P(en) of getting to node n, and

• a real-valued utility f(en).

Let L(s) be the set of terminal nodes of strategy s.

The expected utility of a strategy is

E f(s) = ∑ℓ∈L(s) P(eℓ) · f(eℓ).
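In code, once every terminal node ℓ is reduced to the pair (P(eℓ), f(eℓ)), the expected utility is a single weighted sum. All numbers below are made up.

```python
# Each entry is (P(e_l), f(e_l)) for one terminal node of the strategy.
leaves = [
    (0.15, 2.0),   # e.g. e_l = (X1 = wrong,   X2 = wrong)
    (0.25, 5.0),   #      e_l = (X1 = wrong,   X2 = correct)
    (0.35, 4.0),   #      e_l = (X1 = correct, X3 = wrong)
    (0.25, 8.0),   #      e_l = (X1 = correct, X3 = correct)
]

# Leaf probabilities partition the evidence space, so they sum to one.
assert abs(sum(p for p, _ in leaves) - 1.0) < 1e-9

expected_utility = sum(p * f for p, f in leaves)
print(expected_utility)
```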

14
[Figure: a strategy tree over steps X1, X2, X3.]

Strategy s∗ is optimal iff it maximizes its expected utility.

Strategy s is myopically optimal iff each step of strategy s is selected so that it maximizes expected utility after the selected step is performed (one-step look-ahead).
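A sketch of the myopic rule: among the candidate next steps, pick the one with the highest expected utility one outcome ahead. The helpers prob and f stand for quantities the underlying Bayesian network would supply; both are hypothetical here.

```python
def myopic_step(candidate_steps, outcomes, evidence, prob, f):
    """One-step look-ahead: maximize expected utility after one more step.

    prob(step, outcome, evidence) -> P(outcome | evidence) for that step,
    f(evidence)                   -> utility of an evidence list.
    """
    def one_step_eu(step):
        return sum(prob(step, o, evidence) * f(evidence + [(step, o)])
                   for o in outcomes)
    return max(candidate_steps, key=one_step_eu)
```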

15
Application 3: Adaptive test of basic
operations with fractions

Examples of tasks:

T1: 3/4 · 5/6 − 1/8 = 15/24 − 1/8 = 5/8 − 1/8 = 4/8 = 1/2

T2: 1/6 + 1/12 = 2/12 + 1/12 = 3/12 = 1/4

T3: 1/4 · 1½ = 1/4 · 3/2 = 3/8

T4: (1/2 · 1/2) · (1/3 + 1/3) = 1/4 · 2/3 = 2/12 = 1/6
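The four computations can be checked mechanically with exact rational arithmetic, e.g. with Python's fractions module:

```python
from fractions import Fraction as F

assert F(3, 4) * F(5, 6) - F(1, 8) == F(1, 2)                 # T1
assert F(1, 6) + F(1, 12) == F(1, 4)                          # T2
assert F(1, 4) * (1 + F(1, 2)) == F(3, 8)                     # T3
assert (F(1, 2) * F(1, 2)) * (F(1, 3) + F(1, 3)) == F(1, 6)   # T4
```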

16
Elementary and operational skills

CP   Comparison (common numerator or denominator):  1/2 > 1/3,  2/3 > 1/3
AD   Addition (common denominator):                 1/7 + 2/7 = (1+2)/7 = 3/7
SB   Subtraction (common denominator):              2/5 − 1/5 = (2−1)/5 = 1/5
MT   Multiplication:                                1/2 · 3/5 = 3/10
CD   Common denominator:                            (1/2, 2/3) = (3/6, 4/6)
CL   Cancelling out:                                4/6 = (2·2)/(2·3) = 2/3
CIM  Conversion to mixed numbers:                   7/2 = (3·2+1)/2 = 3½
CMI  Conversion to improper fractions:              3½ = (3·2+1)/2 = 7/2

17
Misconceptions

Label   Description                    Occurrence
MAD     a/b + c/d = (a+c)/(b+d)        14.8%
MSB     a/b − c/d = (a−c)/(b−d)         9.4%
MMT1    a/b · c/b = (a·c)/b            14.1%
MMT2    a/b · c/b = (a+c)/(b·b)         8.1%
MMT3    a/b · c/d = (a·d)/(b·c)        15.4%
MMT4    a/b · c/d = (a+c)/(b·d)         8.1%
MC      a b/c = (a·b)/c                 4.0%

18
Student model

[Figure: the student model, a Bayesian network with hidden variables HV1 and HV2 at the top; skill variables AD, SB, CMI, CIM, CL, CD, MT, and CP; aggregated skill variables ACMI, ACIM, ACL, and ACD; and misconception variables MAD, MSB, MC, and MMT1–MMT4.]

19
Evidence model for task T1
 
3/4 · 5/6 − 1/8 = 15/24 − 1/8 = 5/8 − 1/8 = 4/8 = 1/2

T1 ⇔ MT & CL & ACL & SB & ¬MMT3 & ¬MMT4 & ¬MSB

[Figure: the evidence model, in which skill nodes MT, CL, ACL, SB and misconception nodes MSB, MMT3, MMT4 point to the deterministic task node T1; the observed answer X1 is attached through P(X1 | T1).]

Hugin: model-hv-2.net
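The logical part of this evidence model is a deterministic function of the skills and misconceptions. A small sketch (the student's truth values are made up, and the noisy observation P(X1 | T1) is left out):

```python
def t1_correct(s):
    """T1 <=> MT & CL & ACL & SB & not MMT3 & not MMT4 & not MSB."""
    return (s["MT"] and s["CL"] and s["ACL"] and s["SB"]
            and not s["MMT3"] and not s["MMT4"] and not s["MSB"])

# A hypothetical student: has all required skills but holds MMT4.
student = {"MT": True, "CL": True, "ACL": True, "SB": True,
           "MMT3": False, "MMT4": True, "MSB": False}
print(t1_correct(student))  # False: the misconception blocks the task
```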

20
Using information gain as the utility function
“The lower the entropy of a probability distribution the more we know.”

H(P(X)) = − ∑x P(X = x) · log P(X = x)

[Plot: entropy of a binary distribution as a function of the probability of one state; it is 0 at probabilities 0 and 1 and maximal at 0.5.]

Information gain in a node n of a strategy:

IG(en) = H(P(S)) − H(P(S | en))
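A small sketch of both quantities, with made-up distributions standing in for P(S) and P(S | en):

```python
from math import log2

def entropy(dist):
    """H(P) = -sum p * log2(p), with 0*log(0) taken as 0."""
    return -sum(p * log2(p) for p in dist if p > 0)

p_prior     = [0.5, 0.5]   # P(S) before any answers are observed
p_posterior = [0.9, 0.1]   # P(S | e_n) after the answers in e_n

information_gain = entropy(p_prior) - entropy(p_posterior)
print(round(information_gain, 3))  # ~0.531 bits gained
```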

21
Skill Prediction Quality
[Plot: quality of skill predictions (74 to 92) against the number of answered questions (0 to 20), comparing four question orderings: adaptive, average, descending, and ascending.]

22
Application 4: Troubleshooting

Dezide Advisor customized to a specific portal, seen from the user's perspective through a web browser.

23
Application 4: Troubleshooting - Light print problem

[Figure: the troubleshooting model, with the problem node F, fault nodes F1–F4, action nodes A1–A3, and question node Q1.]

• Faults: F1 Distribution problem, F2 Defective toner, F3 Corrupted dataflow, and F4 Wrong driver setting.

• Actions: A1 Remove, shake and reseat toner, A2 Try another toner, and A3 Cycle power.

• Questions: Q1 Is the configuration page printed light?

24
Troubleshooting strategy

[Figure: a troubleshooting strategy: first ask Q1; if Q1 = no, perform A1 and, if it fails, A2; if Q1 = yes, perform A2 and, if it fails, A1.]

The task is to find a strategy s ∈ S minimising the expected cost of repair

ECR(s) = ∑ℓ∈L(s) P(eℓ) · ( t(eℓ) + c(eℓ) ).

25
Expected cost of repair for a given strategy

ECR(s) =
    P(Q1 = no, A1 = yes) · (cQ1 + cA1)
  + P(Q1 = no, A1 = no, A2 = yes) · (cQ1 + cA1 + cA2)
  + P(Q1 = no, A1 = no, A2 = no) · (cQ1 + cA1 + cA2 + cCS)
  + P(Q1 = yes, A2 = yes) · (cQ1 + cA2)
  + P(Q1 = yes, A2 = no, A1 = yes) · (cQ1 + cA2 + cA1)
  + P(Q1 = yes, A2 = no, A1 = no) · (cQ1 + cA2 + cA1 + cCS)
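A sketch of this computation with made-up branch probabilities and costs (cCS is the cost of calling service; all numbers are hypothetical):

```python
c_q1, c_a1, c_a2, c_cs = 1.0, 5.0, 8.0, 50.0

# (P(branch), accumulated cost) for the six terminal nodes, in the
# order of the equation above; the probabilities must sum to one.
branches = [
    (0.30, c_q1 + c_a1),                  # Q1=no,  A1 fixes it
    (0.15, c_q1 + c_a1 + c_a2),           # Q1=no,  A2 fixes it
    (0.05, c_q1 + c_a1 + c_a2 + c_cs),    # Q1=no,  call service
    (0.35, c_q1 + c_a2),                  # Q1=yes, A2 fixes it
    (0.10, c_q1 + c_a2 + c_a1),           # Q1=yes, A1 fixes it
    (0.05, c_q1 + c_a2 + c_a1 + c_cs),    # Q1=yes, call service
]
assert abs(sum(p for p, _ in branches) - 1.0) < 1e-9

ecr = sum(p * cost for p, cost in branches)
print(ecr)
```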

Demo: www.dezide.com Products/Demo/"Try out expert mode"

26
Commercial applications of Bayesian networks in
educational testing and troubleshooting
• Hugin Expert A/S.
  Software product: Hugin, a Bayesian network tool.
  http://www.hugin.com/
• Educational Testing Service (ETS),
  the world's largest private educational testing organization.
  Its research unit works on adaptive tests using Bayesian
  networks: http://www.ets.org/research/
• SACSO Project
  (Systems for Automatic Customer Support Operations),
  a research project of Hewlett Packard and Aalborg University.
  The troubleshooter is offered as DezisionWorks by Dezide Ltd.
  http://www.dezide.com/

27
