mountain car pomcp oscillating solution #435

Hi @gdaddi,

The likely cause is that POMCP cannot solve this problem well without a decent rollout policy.

The rewards in this problem are very sparse, so with a random rollout policy the tree search probably never encounters a reward at all, and every action looks equally good to the planner. You could try a rollout policy that always adds energy to the car (for example, one that always accelerates in the direction it is already moving) so that the rollouts actually reach the goal.
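To make the difference concrete, here is a minimal sketch in plain Python, not the API of any POMCP library. It assumes the classic mountain car dynamics (position in [-1.2, 0.6], velocity in [-0.07, 0.07], goal at position 0.5), which may differ from the model in your problem. The function names (`step`, `energy_policy`, `random_policy`, `rollout`, `success_count`) are hypothetical and exist only for illustration. It compares how often a random rollout and an energy-adding rollout reach the goal within a fixed number of steps.

```python
import math
import random

# Assumed classic mountain car constants; the actual model may differ.
X_MIN, X_MAX = -1.2, 0.6
V_MAX = 0.07
GOAL_X = 0.5


def step(x, v, a):
    """Advance the car one time step; a is -1 (left), 0 (coast), or 1 (right)."""
    v += 0.001 * a - 0.0025 * math.cos(3 * x)
    v = max(-V_MAX, min(V_MAX, v))
    x += v
    if x < X_MIN:
        # Inelastic collision with the left wall.
        x, v = X_MIN, 0.0
    x = min(x, X_MAX)
    return x, v


def energy_policy(x, v, rng):
    """Always push in the direction of motion, which pumps energy into the car."""
    return 1 if v >= 0 else -1


def random_policy(x, v, rng):
    """Uniformly random action, as a default rollout policy might choose."""
    return rng.choice((-1, 0, 1))


def rollout(policy, x=-0.5, v=0.0, max_steps=200, seed=0):
    """Return the number of steps needed to reach the goal, or None if never reached."""
    rng = random.Random(seed)
    for t in range(1, max_steps + 1):
        x, v = step(x, v, policy(x, v, rng))
        if x >= GOAL_X:
            return t
    return None


def success_count(policy, n=50, max_steps=200):
    """Count how many of n seeded rollouts reach the goal."""
    return sum(
        rollout(policy, max_steps=max_steps, seed=s) is not None for s in range(n)
    )


if __name__ == "__main__":
    print("random rollouts reaching goal:", success_count(random_policy))
    print("energy rollouts reaching goal:", success_count(energy_policy))
```

If the random rollouts rarely or never reach the goal, every leaf of the search tree receives the same value estimate, which would be consistent with the oscillating behavior in the title. Note that this sketch acts on the full state; in a planner the rollout policy would be evaluated on sampled states or beliefs, depending on what the solver supports.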

Let me know if this explanation makes sense.

Answer selected by gdaddi