Hi,
I am currently working with "agents/tf_agents/bandits/" and am wondering whether, and where, the classic contextual bandit off-policy evaluation procedures are available in TensorFlow.
Specifically, I mean the evaluation procedures that Vowpal Wabbit already uses, which are described here:
https://vowpalwabbit.org/docs/vowpal_wabbit/python/latest/tutorials/python_Contextual_bandits_and_Vowpal_Wabbit.html
Or, even better, the methods provided by the Open Bandit Pipeline (obp) package:
https://github.com/st-tech/zr-obp
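To make concrete what kind of estimator I mean, here is a minimal sketch of inverse propensity scoring (IPS) and its self-normalized variant in plain NumPy. This is only an illustration: the function names and the toy data are made up for this post and are not taken from TF Agents, obp, or Vowpal Wabbit.

```python
import numpy as np


def ips_estimate(rewards, logging_probs, target_probs):
    """Inverse propensity scoring estimate of a target policy's value.

    rewards:       observed reward for each logged action, shape (n,)
    logging_probs: probability the logging policy assigned to the logged action
    target_probs:  probability the target policy assigns to that same action
    """
    rewards = np.asarray(rewards, dtype=float)
    # Importance weight: how much more (or less) likely the target policy
    # is to take the logged action than the logging policy was.
    weights = np.asarray(target_probs, dtype=float) / np.asarray(
        logging_probs, dtype=float
    )
    return float(np.mean(weights * rewards))


def snips_estimate(rewards, logging_probs, target_probs):
    """Self-normalized IPS: divides by the sum of weights instead of n."""
    rewards = np.asarray(rewards, dtype=float)
    weights = np.asarray(target_probs, dtype=float) / np.asarray(
        logging_probs, dtype=float
    )
    return float(np.sum(weights * rewards) / np.sum(weights))


# Hypothetical logged data: four rounds collected by a uniform logging
# policy over two actions.
rewards = [1.0, 0.0, 1.0, 0.0]
logging_probs = [0.5, 0.5, 0.5, 0.5]
target_probs = [0.8, 0.2, 0.8, 0.2]

print(ips_estimate(rewards, logging_probs, target_probs))
print(snips_estimate(rewards, logging_probs, target_probs))
```

I am asking whether estimators of this kind (and relatives such as the direct method and doubly robust) already exist somewhere in the library.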
Before I start thinking about how to integrate the methods from obp into a TensorFlow environment, I would like to know whether these methods already exist in TF Agents and, if so, where to find them.