Hi,
I am looking to use tf-agents to develop a multi-armed bandit for advertising.
For each observation, I only have the reward for the single arm that was actually shown; the rewards for the other arms are never observed.
Is tf-agents able to handle such situations? I went through all the Environments, and all of them seem to assume that rewards are available for every observation-arm combination. The MovieLens example handles sparsity using SVD.
Will I need to use similar methods to estimate the rewards for the other arms, or is there something in tf-agents that I am missing?
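To make the setup concrete, here is a rough sketch of what my logged data looks like. This is plain Python, not tf-agents code, and every name in it (`Impression`, `reward_vector`, `NUM_ARMS`, the example values) is hypothetical and only there for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

NUM_ARMS = 3  # hypothetical number of ads (arms)


@dataclass
class Impression:
    """One logged interaction: a context, the one ad shown, and its outcome."""
    context: List[float]  # features describing the observation
    arm_shown: int        # index of the single ad that was displayed
    reward: float         # 1.0 for a click, 0.0 otherwise, for that ad only


def reward_vector(imp: Impression, num_arms: int = NUM_ARMS) -> List[Optional[float]]:
    """Per-arm rewards for one impression; None marks arms that were never shown."""
    return [imp.reward if arm == imp.arm_shown else None for arm in range(num_arms)]


# Made-up example log: each row has exactly one observed reward.
log = [
    Impression(context=[0.2, 1.0], arm_shown=0, reward=1.0),
    Impression(context=[0.7, 0.0], arm_shown=2, reward=0.0),
]
rows = [reward_vector(imp) for imp in log]
```

Each row has a value only at the index of the arm that was shown; the `None` entries are the rewards I do not have.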