
Some applications of Bayesian networks

Jiří Vomlel

Institute of Information Theory and Automation


Academy of Sciences of the Czech Republic

This presentation is available at


http://www.utia.cas.cz/vomlel/

1
Contents
• Brief introduction to Bayesian networks

• Typical tasks that can be solved using Bayesian networks

• 1: Medical diagnosis (a very simple example)

• 2: Decision making maximizing expected utility (another simple example)

• 3: Adaptive testing (a case study)

• 4: Decision-theoretic troubleshooting (a commercial product)

2
Bayesian network
• a directed acyclic graph G = (V, E)

• each node i ∈ V corresponds to a random variable Xi with a finite set 𝒳i of mutually exclusive states

• pa(i) denotes the set of parents of node i in graph G

• to each node i ∈ V corresponds a conditional probability table P(Xi | (Xj)j∈pa(i))

• the DAG implies conditional independence relations between (Xi)i∈V

• d-separation (Pearl, 1986) can be used to read the CI relations from the DAG

3
Assume an ordering of Xi, i ∈ V such that if j ∈ pa(i) then j < i.

Using the chain rule we have that:

P((Xi)i∈V) = ∏i∈V P(Xi | Xi−1, . . . , X1)

From the DAG we can read the conditional independence relations

Xi ⊥⊥ Xk | (Xj)j∈pa(i)   for i ∈ V, k < i, and k ∉ pa(i).

Using these conditional independence relations we get

P((Xi)i∈V) = ∏i∈V P(Xi | (Xj)j∈pa(i)).

This is the joint probability distribution represented by the Bayesian network.

4
Example:

[Figure: a DAG over X1, . . . , X9 with CPTs P(X1), P(X2), P(X3 | X1), P(X4 | X2), P(X5 | X1), P(X6 | X3, X4), P(X7 | X5), P(X8 | X7, X6), P(X9 | X6).]

P(X1, . . . , X9)
= P(X9 | X8, . . . , X1) · P(X8 | X7, . . . , X1) · . . . · P(X2 | X1) · P(X1)
= P(X9 | X6) · P(X8 | X7, X6) · P(X7 | X5) · P(X6 | X4, X3)
  · P(X5 | X1) · P(X4 | X2) · P(X3 | X1) · P(X2) · P(X1)
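To make the factorization concrete, here is a minimal Python sketch: it stores one CPT per node and evaluates the product above. All CPT entries are made-up random numbers, and the variable/parent encoding is illustrative, not from the slides.

```python
import itertools
import random

random.seed(0)

# parents[i] lists pa(i), read off the DAG in the figure above.
parents = {1: (), 2: (), 3: (1,), 4: (2,), 5: (1,),
           6: (3, 4), 7: (5,), 8: (7, 6), 9: (6,)}

# One CPT per node: for every parent configuration, a (made-up)
# distribution over the node's two states 0 and 1.
cpt = {}
for i, pa in parents.items():
    cpt[i] = {}
    for cond in itertools.product((0, 1), repeat=len(pa)):
        p1 = random.random()
        cpt[i][cond] = {1: p1, 0: 1.0 - p1}

def joint(x):
    """P(X1=x[1], ..., X9=x[9]) as the product of the CPT entries."""
    p = 1.0
    for i, pa in parents.items():
        cond = tuple(x[j] for j in pa)
        p *= cpt[i][cond][x[i]]
    return p

# Sanity check: the factorized joint sums to 1 over all 2^9 configurations.
total = sum(joint(dict(enumerate(bits, start=1)))
            for bits in itertools.product((0, 1), repeat=9))
print(round(total, 10))  # -> 1.0
```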

5
Typical use of Bayesian networks
• to model and explain a domain

• to update beliefs about states of certain variables when some other variables are observed, i.e., computing conditional probability distributions, e.g., P(X23 | X17 = yes, X54 = no)

• to find the most probable configurations of variables

• to support decision making under uncertainty

• to find good strategies for solving tasks in a domain with uncertainty
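As an illustration of belief updating, here is a brute-force sketch on a tiny hypothetical chain X1 → X2 → X3 with made-up CPTs. Real tools such as Hugin use junction-tree propagation rather than enumeration, but the quantity computed is the same.

```python
# P(X3 | X1 = 1) by enumerating the joint and normalising.
p1        = {1: 0.3, 0: 0.7}                             # P(X1)
p2_given1 = {1: {1: 0.9, 0: 0.1}, 0: {1: 0.2, 0: 0.8}}   # P(X2 | X1)
p3_given2 = {1: {1: 0.7, 0: 0.3}, 0: {1: 0.4, 0: 0.6}}   # P(X3 | X2)

def joint(x1, x2, x3):
    return p1[x1] * p2_given1[x1][x2] * p3_given2[x2][x3]

# P(X3, X1 = 1), marginalising out X2, then normalise by P(X1 = 1).
num = {x3: sum(joint(1, x2, x3) for x2 in (0, 1)) for x3 in (0, 1)}
z = sum(num.values())
posterior = {x3: p / z for x3, p in num.items()}
print(posterior)
```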

6
Simplified diagnostic example
We have a patient.
Possible diagnoses: tuberculosis, lung cancer, bronchitis.

7
We don’t know anything about the patient. ... Patient is a smoker.

8
Patient is a smoker. ... and he complains about dyspnoea

9
Patient is a smoker and complains about dyspnoea ... and his X-ray is positive

10
Patient is a smoker and complains about dyspnoea and his X-ray is positive ... and he visited Asia recently

11
Application 2: Decision making
The goal: maximize expected utility

Hugin example: mildew4.net

12
Fixed and Adaptive Test Strategies
[Figure: left, a fixed test strategy, in which the questions are asked in the fixed order Q1, Q2, . . . , Q10; right, an adaptive test strategy, a binary tree that starts with Q5 and branches on whether the answer was wrong or correct (wrong → Q4, correct → Q8, and so on down to the final question).]

13
[Figure: a strategy tree over steps X1, X2, X3.]

For all nodes n of a strategy s we have defined:

• evidence en, i.e., the outcomes of the steps performed to get to node n,

• the probability P(en) of getting to node n, and

• a real-valued utility f(en).

Let L(s) be the set of terminal nodes of strategy s.

The expected utility of a strategy is

E f(s) = ∑ℓ∈L(s) P(eℓ) · f(eℓ).
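In code, once every terminal node ℓ is reduced to the pair (P(eℓ), f(eℓ)), the expected utility is a single weighted sum. All numbers below are made up.

```python
# Each entry is (P(e_l), f(e_l)) for one terminal node of the strategy.
leaves = [
    (0.15, 2.0),   # e.g. e_l = (X1 = wrong,   X2 = wrong)
    (0.25, 5.0),   #      e_l = (X1 = wrong,   X2 = correct)
    (0.35, 4.0),   #      e_l = (X1 = correct, X3 = wrong)
    (0.25, 8.0),   #      e_l = (X1 = correct, X3 = correct)
]

# Leaf probabilities partition the evidence space, so they sum to one.
assert abs(sum(p for p, _ in leaves) - 1.0) < 1e-9

expected_utility = sum(p * f for p, f in leaves)
print(expected_utility)
```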

14
[Figure: a strategy tree over steps X1, X2, X3.]

Strategy s∗ is optimal iff it maximizes its expected utility.

Strategy s is myopically optimal iff each step of strategy s is selected so that it maximizes expected utility after the selected step is performed (one-step look-ahead).
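A sketch of the myopic rule: among the candidate next steps, pick the one with the highest expected utility one outcome ahead. The helpers prob and f stand for quantities the underlying Bayesian network would supply; both are hypothetical here.

```python
def myopic_step(candidate_steps, outcomes, evidence, prob, f):
    """One-step look-ahead: maximize expected utility after one more step.

    prob(step, outcome, evidence) -> P(outcome | evidence) for that step,
    f(evidence)                   -> utility of an evidence list.
    """
    def one_step_eu(step):
        return sum(prob(step, o, evidence) * f(evidence + [(step, o)])
                   for o in outcomes)
    return max(candidate_steps, key=one_step_eu)
```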

15
Application 3: Adaptive test of basic
operations with fractions

Examples of tasks:

T1: 3/4 · 5/6 − 1/8 = 15/24 − 1/8 = 5/8 − 1/8 = 4/8 = 1/2

T2: 1/6 + 1/12 = 2/12 + 1/12 = 3/12 = 1/4

T3: 1/4 · 1½ = 1/4 · 3/2 = 3/8

T4: (1/2 · 1/2) · (1/3 + 1/3) = 1/4 · 2/3 = 2/12 = 1/6
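The four computations can be checked mechanically with exact rational arithmetic, e.g. with Python's fractions module:

```python
from fractions import Fraction as F

assert F(3, 4) * F(5, 6) - F(1, 8) == F(1, 2)                 # T1
assert F(1, 6) + F(1, 12) == F(1, 4)                          # T2
assert F(1, 4) * (1 + F(1, 2)) == F(3, 8)                     # T3
assert (F(1, 2) * F(1, 2)) * (F(1, 3) + F(1, 3)) == F(1, 6)   # T4
```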

16
Elementary and operational skills

CP   Comparison (common numerator or denominator):  1/2 > 1/3,  2/3 > 1/3
AD   Addition (common denominator):                 1/7 + 2/7 = (1+2)/7 = 3/7
SB   Subtraction (common denominator):              2/5 − 1/5 = (2−1)/5 = 1/5
MT   Multiplication:                                1/2 · 3/5 = 3/10
CD   Common denominator:                            (1/2, 2/3) = (3/6, 4/6)
CL   Cancelling out:                                4/6 = (2·2)/(2·3) = 2/3
CIM  Conversion to mixed numbers:                   7/2 = (3·2+1)/2 = 3½
CMI  Conversion to improper fractions:              3½ = (3·2+1)/2 = 7/2

17
Misconceptions

Label   Description                    Occurrence
MAD     a/b + c/d = (a+c)/(b+d)        14.8%
MSB     a/b − c/d = (a−c)/(b−d)         9.4%
MMT1    a/b · c/b = (a·c)/b            14.1%
MMT2    a/b · c/b = (a+c)/(b·b)         8.1%
MMT3    a/b · c/d = (a·d)/(b·c)        15.4%
MMT4    a/b · c/d = (a+c)/(b·d)         8.1%
MC      a b/c = (a·b)/c                 4.0%

18
Student model

[Figure: the student model, a Bayesian network with hidden variables HV1 and HV2 at the top; skill variables AD, SB, CMI, CIM, CL, CD, MT, and CP; aggregated skill variables ACMI, ACIM, ACL, and ACD; and misconception variables MAD, MSB, MC, and MMT1–MMT4.]

19
Evidence model for task T1
 
3/4 · 5/6 − 1/8 = 15/24 − 1/8 = 5/8 − 1/8 = 4/8 = 1/2

T1 ⇔ MT & CL & ACL & SB & ¬MMT3 & ¬MMT4 & ¬MSB

[Figure: the evidence model, in which skill nodes MT, CL, ACL, SB and misconception nodes MSB, MMT3, MMT4 point to the deterministic task node T1; the observed answer X1 is attached through P(X1 | T1).]

Hugin: model-hv-2.net
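The logical part of this evidence model is a deterministic function of the skills and misconceptions. A small sketch (the student's truth values are made up, and the noisy observation P(X1 | T1) is left out):

```python
def t1_correct(s):
    """T1 <=> MT & CL & ACL & SB & not MMT3 & not MMT4 & not MSB."""
    return (s["MT"] and s["CL"] and s["ACL"] and s["SB"]
            and not s["MMT3"] and not s["MMT4"] and not s["MSB"])

# A hypothetical student: has all required skills but holds MMT4.
student = {"MT": True, "CL": True, "ACL": True, "SB": True,
           "MMT3": False, "MMT4": True, "MSB": False}
print(t1_correct(student))  # False: the misconception blocks the task
```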

20
Using information gain as the utility function
“The lower the entropy of a probability distribution the more we know.”

H(P(X)) = − ∑x P(X = x) · log P(X = x)

[Plot: entropy of a binary distribution as a function of the probability of one state; it is 0 at probabilities 0 and 1 and maximal at 0.5.]

Information gain in a node n of a strategy:

IG(en) = H(P(S)) − H(P(S | en))
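A small sketch of both quantities, with made-up distributions standing in for P(S) and P(S | en):

```python
from math import log2

def entropy(dist):
    """H(P) = -sum p * log2(p), with 0*log(0) taken as 0."""
    return -sum(p * log2(p) for p in dist if p > 0)

p_prior     = [0.5, 0.5]   # P(S) before any answers are observed
p_posterior = [0.9, 0.1]   # P(S | e_n) after the answers in e_n

information_gain = entropy(p_prior) - entropy(p_posterior)
print(round(information_gain, 3))  # ~0.531 bits gained
```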

21
Skill Prediction Quality
[Plot: quality of skill predictions (74 to 92) against the number of answered questions (0 to 20), comparing four question orderings: adaptive, average, descending, and ascending.]

22
Application 4: Troubleshooting

Dezide Advisor customized to a specific portal, seen from the user's perspective through a web browser.

23
Application 4: Troubleshooting - Light print problem

[Figure: the troubleshooting model, with the problem node F, fault nodes F1–F4, action nodes A1–A3, and question node Q1.]

• Faults: F1 Distribution problem, F2 Defective toner, F3 Corrupted dataflow, and F4 Wrong driver setting.

• Actions: A1 Remove, shake and reseat toner, A2 Try another toner, and A3 Cycle power.

• Questions: Q1 Is the configuration page printed light?

24
Troubleshooting strategy

[Figure: a troubleshooting strategy: first ask Q1; if Q1 = no, perform A1 and, if it fails, A2; if Q1 = yes, perform A2 and, if it fails, A1.]

The task is to find a strategy s ∈ S minimising the expected cost of repair

ECR(s) = ∑ℓ∈L(s) P(eℓ) · ( t(eℓ) + c(eℓ) ).

25
Expected cost of repair for a given strategy

ECR(s) =
    P(Q1 = no, A1 = yes) · (cQ1 + cA1)
  + P(Q1 = no, A1 = no, A2 = yes) · (cQ1 + cA1 + cA2)
  + P(Q1 = no, A1 = no, A2 = no) · (cQ1 + cA1 + cA2 + cCS)
  + P(Q1 = yes, A2 = yes) · (cQ1 + cA2)
  + P(Q1 = yes, A2 = no, A1 = yes) · (cQ1 + cA2 + cA1)
  + P(Q1 = yes, A2 = no, A1 = no) · (cQ1 + cA2 + cA1 + cCS)
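A sketch of this computation with made-up branch probabilities and costs (cCS is the cost of calling service; all numbers are hypothetical):

```python
c_q1, c_a1, c_a2, c_cs = 1.0, 5.0, 8.0, 50.0

# (P(branch), accumulated cost) for the six terminal nodes, in the
# order of the equation above; the probabilities must sum to one.
branches = [
    (0.30, c_q1 + c_a1),                  # Q1=no,  A1 fixes it
    (0.15, c_q1 + c_a1 + c_a2),           # Q1=no,  A2 fixes it
    (0.05, c_q1 + c_a1 + c_a2 + c_cs),    # Q1=no,  call service
    (0.35, c_q1 + c_a2),                  # Q1=yes, A2 fixes it
    (0.10, c_q1 + c_a2 + c_a1),           # Q1=yes, A1 fixes it
    (0.05, c_q1 + c_a2 + c_a1 + c_cs),    # Q1=yes, call service
]
assert abs(sum(p for p, _ in branches) - 1.0) < 1e-9

ecr = sum(p * cost for p, cost in branches)
print(ecr)
```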

Demo: www.dezide.com Products/Demo/"Try out expert mode"

26
Commercial applications of Bayesian networks in
educational testing and troubleshooting
• Hugin Expert A/S.
  Software product: Hugin, a Bayesian network tool.
  http://www.hugin.com/
• Educational Testing Service (ETS),
  the world's largest private educational testing organization.
  Its research unit works on adaptive tests using Bayesian
  networks: http://www.ets.org/research/
• SACSO Project
  (Systems for Automatic Customer Support Operations),
  a research project of Hewlett Packard and Aalborg University.
  The troubleshooter is offered as DezisionWorks by Dezide Ltd.
  http://www.dezide.com/

27
