
$s_0 \in S$

$S_G \subseteq S$

$S_G$

$A$, $A(s)$, $s \in S$

$P(\cdot \mid s, a)$, $S$, $a \in A(s)$, $s \in S$

$c(a, s)$, $a$, $s$

$\pi : S \to A$, $V^{\pi}(s)$

$S$

$$V^{\pi}(s) = c(\pi(s), s) + \sum_{s' \in S} P(s' \mid s, \pi(s))\, V^{\pi}(s')$$

$V^{*}(s)$

$$V^{*}(s) = \min_{a \in A(s)} \Big( c(a, s) + \sum_{s' \in S} P(s' \mid s, a)\, V^{*}(s') \Big)$$
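
The fixed-point equation for $V^{*}$ can be approximated by repeatedly applying its right-hand side as an update. The following is a minimal sketch of that idea (plain value iteration), not necessarily the procedure the source goes on to describe. The three-state problem, the action names, and the names `transitions`, `cost`, `goals`, `bellman_backup` and `value_iteration` are all hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple

State = int
Action = str

# Hypothetical toy problem (not from the source): states 0 and 1, goal state 2.
# transitions[s][a] lists (next_state, probability) pairs, i.e. P(. | s, a).
transitions: Dict[State, Dict[Action, List[Tuple[State, float]]]] = {
    0: {"safe": [(1, 1.0)], "risky": [(2, 0.5), (0, 0.5)]},
    1: {"go": [(2, 1.0)]},
}
# cost[(a, s)] plays the role of c(a, s).
cost: Dict[Tuple[Action, State], float] = {
    ("safe", 0): 1.0,
    ("risky", 0): 0.5,
    ("go", 1): 1.0,
}
goals = {2}


def bellman_backup(s: State, V: Dict[State, float]) -> float:
    """Return min over a in A(s) of c(a, s) + sum over s' of P(s'|s, a) V(s')."""
    return min(
        cost[(a, s)] + sum(p * V[s2] for s2, p in outcomes)
        for a, outcomes in transitions[s].items()
    )


def value_iteration(eps: float = 1e-9, max_sweeps: int = 10_000) -> Dict[State, float]:
    """Sweep over non-goal states until the largest change drops below eps."""
    V = {s: 0.0 for s in list(transitions) + sorted(goals)}
    for _ in range(max_sweeps):
        delta = 0.0
        for s in transitions:  # goal states are never updated and keep V = 0
            new_value = bellman_backup(s, V)
            delta = max(delta, abs(new_value - V[s]))
            V[s] = new_value
        if delta < eps:
            break
    return V


V_star = value_iteration()
```

In this toy problem the "risky" action in state 0 costs 0.5 and reaches the goal with probability 0.5, so its expected cost solves $v = 0.5 + 0.5\,v$, which gives $v = 1$ and beats the "safe" route of cost 2.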

$S_0$, $S$


$\pi^{*}(s)$

$$\pi^{*}(s) = \operatorname*{argmin}_{a \in A(s)} \Big( c(a, s) + \sum_{s' \in S} P(s' \mid s, a)\, V^{*}(s') \Big)$$
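
Given a value function, the argmin above can be read off state by state. The sketch below assumes a hypothetical three-state problem and a value table `V` that is taken as given rather than computed here. The names `q_value` and `greedy_policy` are illustrative and do not come from the source.

```python
from typing import Dict, List, Tuple

# Hypothetical toy problem (not from the source): states 0 and 1, goal state 2.
# transitions[s][a] lists (next_state, probability) pairs, i.e. P(. | s, a).
transitions: Dict[int, Dict[str, List[Tuple[int, float]]]] = {
    0: {"safe": [(1, 1.0)], "risky": [(2, 0.5), (0, 0.5)]},
    1: {"go": [(2, 1.0)]},
}
# cost[(a, s)] plays the role of c(a, s).
cost: Dict[Tuple[str, int], float] = {
    ("safe", 0): 1.0,
    ("risky", 0): 0.5,
    ("go", 1): 1.0,
}


def q_value(s: int, a: str, V: Dict[int, float]) -> float:
    """c(a, s) + sum over s' of P(s'|s, a) V(s')."""
    return cost[(a, s)] + sum(p * V[s2] for s2, p in transitions[s][a])


def greedy_policy(V: Dict[int, float]) -> Dict[int, str]:
    """Pick, in every non-goal state, an action minimising the one-step lookahead."""
    return {s: min(transitions[s], key=lambda a: q_value(s, a, V)) for s in transitions}


# Assumed value table for the toy problem (goal state 2 has value 0).
V = {0: 1.0, 1: 1.0, 2: 0.0}
policy = greedy_policy(V)
```

Ties are broken by dictionary order here, which is an arbitrary choice made for the sketch.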

$V^{\pi}(s) = V^{*}(s) = 0, \quad \forall s \in S_G$

$V^{\pi}(s)$, $\pi^{V}$

$$\pi^{V}(s) := \operatorname*{argmin}_{a \in A(s)} \Big( c(a, s) + \sum_{s' \in S} P(s' \mid s, a)\, V^{\pi}(s') \Big)$$

$P(\cdot \mid s, \pi^{V}(s))$, $V^{\pi}(s)$

$V^{\pi}(s)$, $V^{*}(s)$, $V^{\pi}(s)$

$V^{\pi}(s)$


$V^{\pi}(s) = V^{*}(s)$

$P(\cdot \mid s, a)$, $V^{\pi}(s)$, $V^{*}(s)$

$$E(s) := \Big| V^{\pi}(s) - \min_{a \in A(s)} \Big( c(a, s) + \sum_{s' \in S} P(s' \mid s, a)\, V^{\pi}(s') \Big) \Big|$$
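
The residual $E(s)$ measures how far a value estimate is from satisfying the minimisation equation at $s$. The sketch below computes it for a hypothetical three-state problem. The value tables and the name `residual` are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical toy problem (not from the source): states 0 and 1, goal state 2.
# transitions[s][a] lists (next_state, probability) pairs, i.e. P(. | s, a).
transitions: Dict[int, Dict[str, List[Tuple[int, float]]]] = {
    0: {"safe": [(1, 1.0)], "risky": [(2, 0.5), (0, 0.5)]},
    1: {"go": [(2, 1.0)]},
}
# cost[(a, s)] plays the role of c(a, s).
cost: Dict[Tuple[str, int], float] = {
    ("safe", 0): 1.0,
    ("risky", 0): 0.5,
    ("go", 1): 1.0,
}


def residual(s: int, V: Dict[int, float]) -> float:
    """E(s) = |V(s) - min over a of (c(a, s) + sum over s' of P(s'|s, a) V(s'))|."""
    best = min(
        cost[(a, s)] + sum(p * V[s2] for s2, p in outcomes)
        for a, outcomes in transitions[s].items()
    )
    return abs(V[s] - best)


# An all-zero estimate is inconsistent at both non-goal states.
V_zero = {0: 0.0, 1: 0.0, 2: 0.0}
# A table assumed to be the fixed point for this toy problem has zero residual.
V_fixed = {0: 1.0, 1: 1.0, 2: 0.0}

residuals_zero = {s: residual(s, V_zero) for s in transitions}
residuals_fixed = {s: residual(s, V_fixed) for s in transitions}
```

A state would then count as consistent when its residual does not exceed a chosen positive threshold.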

$\epsilon > 0$

1 sS V (s)h(s) h(s)

$V^{\pi}(s)$

aberto = PilhaVazia

s = aberto.pop()

fechado.push(s)

s.erro(a) > ε

resposta = falso

a = s.acaoGulosa()

s' : P(s' | s, a) > 0

¬s'.Marcado ∧ ¬Em(s', aberto ∪ fechado)

aberto.push(s')

= PilhaVazia

resposta
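
The surviving steps (pop a state from `aberto`, push it onto `fechado`, test its error against a threshold, take the greedy action, and push unmarked successors not yet in either stack) resemble the labeling check used in LRTDP-style algorithms. The sketch below fills in the missing control flow under that assumption. The final phase in particular (mark everything visited when the answer is true, otherwise update the visited states) is assumed rather than recovered from the text. The toy problem and all Python names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy problem (not from the source): states 0 and 1, goal state 2.
transitions: Dict[int, Dict[str, List[Tuple[int, float]]]] = {
    0: {"safe": [(1, 1.0)], "risky": [(2, 0.5), (0, 0.5)]},
    1: {"go": [(2, 1.0)]},
}
cost: Dict[Tuple[str, int], float] = {
    ("safe", 0): 1.0,
    ("risky", 0): 0.5,
    ("go", 1): 1.0,
}
goals = {2}


def q_value(s: int, a: str, V: Dict[int, float]) -> float:
    return cost[(a, s)] + sum(p * V[s2] for s2, p in transitions[s][a])


def greedy_action(s: int, V: Dict[int, float]) -> str:
    """Counterpart of s.acaoGulosa()."""
    return min(transitions[s], key=lambda a: q_value(s, a, V))


def residual(s: int, V: Dict[int, float]) -> float:
    """Counterpart of s.erro(a): distance between V(s) and its one-step backup."""
    return abs(V[s] - min(q_value(s, a, V) for a in transitions[s]))


def check_solved(root: int, V: Dict[int, float], solved: Set[int], eps: float) -> bool:
    answer = True                  # resposta
    open_stack: List[int] = []     # aberto
    closed: List[int] = []         # fechado
    if root not in solved:
        open_stack.append(root)
    while open_stack:
        s = open_stack.pop()
        closed.append(s)
        if s in goals:             # goal states have no actions to expand
            continue
        if residual(s, V) > eps:
            answer = False
            continue
        a = greedy_action(s, V)
        for s2, p in transitions[s][a]:
            unseen = s2 not in open_stack and s2 not in closed
            if p > 0 and s2 not in solved and unseen:
                open_stack.append(s2)
    if answer:
        solved.update(closed)      # assumed: mark every visited state
    else:
        while closed:              # assumed: update visited states, last visited first
            s = closed.pop()
            if s not in goals:
                V[s] = min(q_value(s, a, V) for a in transitions[s])
    return answer
```

Called on state 0 with the table `{0: 1.0, 1: 1.0, 2: 0.0}`, the check succeeds and marks only the states reachable under the greedy action, so state 1 stays unmarked.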


= PilhaVazia

$V^{L}(s) \leq V^{\pi}(s) \leq V^{U}(s)$, $\epsilon > 0$

$P(s' \mid s, a)\,\big(V^{U}(s') - V^{L}(s')\big)$

$V^{U}(s_0) - V^{L}(s_0) < \epsilon$

$V^{U}(s)$, $V^{L}(s)$, $V^{\pi}(s)$


$$V^{U}(s) = \min_{a \in A(s)} \Big( c(a, s) + \sum_{s' \in S} P(s' \mid s, a)\, V^{U}(s') \Big)$$

$$V^{L}(s) = \min_{a \in A(s)} \Big( c(a, s) + \sum_{s' \in S} P(s' \mid s, a)\, V^{L}(s') \Big)$$
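
Applying the same backup to an upper and a lower estimate narrows the gap between them. The sketch below illustrates this on a hypothetical three-state problem, together with successor weights of the form $P(s' \mid s, a)\,(V^{U}(s') - V^{L}(s'))$. For simplicity it sweeps over all states instead of following sampled trajectories, which is a deliberate simplification and not the source's procedure. The initial bounds (0 below, 10 above) and every Python name are assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical toy problem (not from the source): states 0 and 1, goal state 2.
transitions: Dict[int, Dict[str, List[Tuple[int, float]]]] = {
    0: {"safe": [(1, 1.0)], "risky": [(2, 0.5), (0, 0.5)]},
    1: {"go": [(2, 1.0)]},
}
cost: Dict[Tuple[str, int], float] = {
    ("safe", 0): 1.0,
    ("risky", 0): 0.5,
    ("go", 1): 1.0,
}


def backup(s: int, V: Dict[int, float]) -> float:
    """min over a of c(a, s) + sum over s' of P(s'|s, a) V(s')."""
    return min(
        cost[(a, s)] + sum(p * V[s2] for s2, p in outcomes)
        for a, outcomes in transitions[s].items()
    )


def successor_weights(
    s: int, a: str, V_L: Dict[int, float], V_U: Dict[int, float]
) -> Dict[int, float]:
    """Weight each successor by P(s'|s, a) times its current bound gap."""
    return {s2: p * (V_U[s2] - V_L[s2]) for s2, p in transitions[s][a]}


def tighten_bounds(
    V_L: Dict[int, float],
    V_U: Dict[int, float],
    s0: int,
    eps: float,
    max_sweeps: int = 1000,
) -> int:
    """Back up both bounds in place until V_U(s0) - V_L(s0) < eps; return sweep count."""
    sweeps = 0
    while V_U[s0] - V_L[s0] >= eps and sweeps < max_sweeps:
        for s in transitions:  # the goal state keeps both bounds at 0
            V_U[s] = backup(s, V_U)
            V_L[s] = backup(s, V_L)
        sweeps += 1
    return sweeps


# Assumed initial bounds: 0 from below, an arbitrary large value from above.
V_L = {0: 0.0, 1: 0.0, 2: 0.0}
V_U = {0: 10.0, 1: 10.0, 2: 0.0}
initial_weights = successor_weights(0, "risky", V_L, V_U)
sweeps_used = tighten_bounds(V_L, V_U, s0=0, eps=1e-6)
```

Because the backup is monotone, an estimate that starts above (or below) the fixed point stays above (or below) it after every update, so the two tables bracket the true value throughout.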

$S$

$s_0$, $V(s)$, $h(s) = 0$, $h(s) = \min_{a \in A(s)} c(a, s)$
