Hello, I am new to this field and I would like to ask two questions that are very important to me right now.

First, is it possible to use state-dependent actions with these solvers (https://github.com/JuliaPOMDP/TabularTDLearning.jl)? Maybe through a function that, given a state, returns a vector of all the possible actions?

Second, how does the generative model really work? If I use gen = function (s, a, rnd), does it take the initial state and then try all the possible actions of the action space according to their keys? I want to know whether actions are selected in order or chosen randomly.

I hope you can help me. Thank you so much. Best regards.
Replies: 1 comment 23 replies
Hi @clrescobar,

Unfortunately, `TabularTDLearning` does not currently support state-dependent actions since that would make the table "non square". That said, it would be possible to make that extension by using another data structure for the Q-values in `TabularTDLearning`. That, however, would require minor changes to the solver code over there.

The `gen` function implements the (PO)MDP transition (and observation) model in a generative representation. That is, rather than having to provide probability densities over states, observations, and rewards via `T`, `Z`, and `R`, you just have to implement `gen` as a function to sample from the joint distribution of the three. Therefore, when you call gen(s…
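
To make the two points above concrete, here is a minimal, self-contained sketch in plain Julia. It is not the `TabularTDLearning` implementation and does not use the POMDPs.jl API. The toy chain problem, `available_actions`, `q_learning`, and all hyperparameters are hypothetical names and values chosen for illustration. The local `gen` only mirrors the `(s, a, rng)` shape from the question. The epsilon-greedy action choice is an assumption of this sketch, not a statement about how the solver explores.

```julia
using Random

# Hypothetical toy problem: states 1..4 on a line, state 4 is terminal.
# Actions are :left and :right, but :left is not available in state 1.
available_actions(s::Int) = s == 1 ? [:right] : [:left, :right]

# Generative model: instead of returning a distribution, sample one outcome.
# The move succeeds with probability 0.8; otherwise the agent stays put.
function gen(s::Int, a::Symbol, rng::AbstractRNG)
    step = a == :right ? 1 : -1
    sp = rand(rng) < 0.8 ? clamp(s + step, 1, 4) : s
    r = sp == 4 ? 1.0 : 0.0
    return (sp = sp, r = r)
end

# Q-values are stored per (state, action) pair in a Dict, so each state can
# have its own action set and no rectangular table is needed.
function q_learning(rng::AbstractRNG; episodes = 500, alpha = 0.1,
                    gamma = 0.95, epsilon = 0.1)
    Q = Dict{Tuple{Int,Symbol},Float64}()
    for s in 1:4, a in available_actions(s)
        Q[(s, a)] = 0.0
    end
    for _ in 1:episodes
        s = 1
        steps = 0
        while s != 4 && steps < 100
            acts = available_actions(s)
            # Epsilon-greedy: a random available action with probability
            # epsilon, otherwise the best action known so far.
            a = if rand(rng) < epsilon
                rand(rng, acts)
            else
                acts[argmax([Q[(s, b)] for b in acts])]
            end
            out = gen(s, a, rng)
            # Bootstrap only over the actions available in the next state.
            best_next = out.sp == 4 ? 0.0 :
                maximum(Q[(out.sp, b)] for b in available_actions(out.sp))
            Q[(s, a)] += alpha * (out.r + gamma * best_next - Q[(s, a)])
            s = out.sp
            steps += 1
        end
    end
    return Q
end

Q = q_learning(MersenneTwister(1))
```

In this sketch the learner never tries actions "in order": at every step it asks `available_actions` for the current state, picks one action, and calls `gen` once to sample a single next state and reward. Because the Q-values are keyed by `(state, action)`, the pair `(1, :left)` simply never appears in `Q`.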