TigerPOMDP · replay

event-log playback
Contestant
Hidden state: ?
Belief tiger-left: 50%
Listens: 0
Reward: 0
Step: 0
Episode: #2 (clean win) · #34 (the growls lied) · #51 (long deliberation)

Two doors — listen, believe, open

◀ tiger LEFT · belief b = P(tiger-left | growls) · tiger RIGHT ▶
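
The belief b = P(tiger-left | growls) can be tracked with a Bayes update after each listen. The sketch below is a minimal illustration, not the replay's own code: the function name `update_belief` is hypothetical, and the 0.85 listening accuracy is the value commonly used in the textbook Tiger problem, assumed here because this page does not state it.

```python
def update_belief(b_left: float, heard_left: bool, accuracy: float = 0.85) -> float:
    """Return P(tiger-left) after hearing one growl.

    accuracy is the ASSUMED probability that a growl comes from the
    tiger's true side (0.85 is the textbook value, not taken from this page).
    """
    # Likelihood of this observation under each hidden state.
    like_left = accuracy if heard_left else 1.0 - accuracy
    like_right = 1.0 - accuracy if heard_left else accuracy

    # Bayes' rule: posterior is proportional to likelihood times prior.
    numerator = like_left * b_left
    evidence = numerator + like_right * (1.0 - b_left)
    return numerator / evidence


# Start from the 50% prior shown in the panel and listen three times.
belief = 0.5
for heard_left in (True, True, False):
    belief = update_belief(belief, heard_left)
    print(f"growl {'left' if heard_left else 'right'}: b = {belief:.3f}")
```

With a symmetric observation model like this one, a growl from the left and a growl from the right cancel exactly, so the belief depends only on the difference between the two counts.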
ring (history: 2):
listening…

Return distribution — partial observability is bimodal

Legend: treasure +10 · tiger −100 · growl / belief left · growl / belief right
Controls: space play · step