States with varying applicable actions #338
What's the best way to describe a POMDP with POMDPs.jl where only a subset of the possible actions is valid in each state? For example, in a pick-and-place domain with possible actions {pick, place, push}, once the robot is holding an object, the only valid subset of actions in that state is {place}.
Replies: 1 comment 6 replies
POMDPs.actions has an optional state or belief argument, so you can return only the actions that are valid in a given state:

    # holding(s) is a user-defined predicate that reports whether the robot is holding an object
    function POMDPs.actions(m::PickNPlace, s)
        if holding(s)
            return [:place]
        else
            return [:pick, :place, :push]
        end
    end

In a POMDP rather than an MDP, the second argument will actually be a belief instead of a state, but the robot should know with certainty whether it is holding something, so you should be able to get the relevant information from the belief. Most solvers should work with this, but a few may not respect state-dependent action sets, so you may have to add a penalty to the reward function for invalid actions. Let us know if you have any follow-up questions.
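As a rough sketch of the two points above (reading the holding status from a belief, and penalizing invalid actions as a fallback), something along these lines might work. Everything specific here is an assumption for illustration: the `PNPState` type, the `holding` definition, the use of `SparseCat` from POMDPTools as the belief type, the `INVALID_ACTION_PENALTY` value, and the placeholder task reward. The belief type a solver or updater actually passes may differ, so the belief method would need to dispatch on that type instead.

```julia
using POMDPs
using POMDPTools: SparseCat   # assumed belief representation for this sketch

# Hypothetical state: which object (if any) the gripper currently holds.
struct PNPState
    held::Union{Nothing,Symbol}
end

# Hypothetical predicate used by the action functions below.
holding(s::PNPState) = s.held !== nothing

# Hypothetical problem type; the observation type is a placeholder.
struct PickNPlace <: POMDP{PNPState,Symbol,Symbol} end

# Full action space.
POMDPs.actions(::PickNPlace) = [:pick, :place, :push]

# State-dependent action space.
POMDPs.actions(m::PickNPlace, s::PNPState) = holding(s) ? [:place] : actions(m)

# Belief-dependent action space. This assumes the holding status is known with
# certainty, i.e. every state in the belief's support agrees on it.
function POMDPs.actions(m::PickNPlace, b::SparseCat)
    return all(holding, POMDPs.support(b)) ? [:place] : actions(m)
end

# Fallback for solvers that ignore state-dependent action sets:
# penalize actions that are invalid in the current state.
const INVALID_ACTION_PENALTY = -100.0   # arbitrary illustrative value

function POMDPs.reward(m::PickNPlace, s::PNPState, a::Symbol)
    a in actions(m, s) || return INVALID_ACTION_PENALTY
    return a == :place ? 1.0 : 0.0      # placeholder task reward
end

# Usage: a belief that is certain the robot holds a cup.
m = PickNPlace()
b = SparseCat([PNPState(:cup)], [1.0])
valid = actions(m, b)
```

The penalty should be large relative to the task rewards so that a solver sampling from the full action space still learns to avoid invalid actions; the right magnitude depends on the problem.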