  File "/x/lib/python3.10/site-packages/tf_agents/policies/greedy_policy.py", line 58, in __init__
    emit_log_probability=policy.emit_log_probability,
  File "/x/python3.10/site-packages/tf_agents/policies/py_tf_eager_policy.py", line 246, in __getattr__
    return getattr(self._policy, name)
AttributeError: '_UserObject' object has no attribute 'emit_log_probability'
In call to configurable 'GreedyPolicy' (<class 'tf_agents.policies.greedy_policy.GreedyPolicy'>)
In call to configurable 'EpsilonGreedyPolicy' (<class 'tf_agents.policies.epsilon_greedy_policy.EpsilonGreedyPolicy'>)
In call to configurable 'collect' (<function collect at 0x2b33a2607490>)
The `DdqnAgent` initializer, like that of `DqnAgent`, accepts an `epsilon_greedy` argument. You can see in the agent code that it uses an `EpsilonGreedyPolicy` for its collect policy when no Boltzmann temperature is given.
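As a rough illustration of how each actor could get its own exploration rate, here is a minimal sketch that derives an epsilon from the actor's task index. The schedule follows the one described in the Ape-X paper (epsilon_i = epsilon^(1 + alpha * i / (N - 1)), with epsilon = 0.4 and alpha = 7). The function name `actor_epsilon` and the `num_actors` parameter are hypothetical and not part of TF-Agents, and the wiring shown in the trailing comments is an untested assumption about how the value might be passed on.

```python
def actor_epsilon(task: int, num_actors: int,
                  base_epsilon: float = 0.4, alpha: float = 7.0) -> float:
    """Returns an exploration rate for actor `task` out of `num_actors`.

    Hypothetical helper: implements the Ape-X style schedule
    base_epsilon ** (1 + alpha * task / (num_actors - 1)).
    """
    if num_actors < 1:
        raise ValueError("num_actors must be at least 1")
    if not 0 <= task < num_actors:
        raise ValueError("task must be in [0, num_actors)")
    if num_actors == 1:
        # A single actor just uses the base exploration rate.
        return base_epsilon
    exponent = 1.0 + alpha * task / (num_actors - 1)
    return base_epsilon ** exponent


# Possible wiring inside each actor process (assumption, not verified against
# the distributed example): compute the value from the task flag and hand it
# to the agent through its `epsilon_greedy` argument.
#
#   epsilon = actor_epsilon(FLAGS.task, num_actors)
#   agent = dqn_agent.DdqnAgent(..., epsilon_greedy=epsilon)
#   collect_policy = agent.collect_policy
```

With this schedule, task 0 explores the most and the highest-numbered task explores the least. Whether a policy built this way can be combined with the variables pulled from the Reverb variable container in the distributed example would still need to be checked.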
Hi,
I am following https://github.com/tensorflow/agents/tree/master/tf_agents/experimental/distributed/examples/sac to implement an Actor-Learner environment for DDQN (similar to Ape-X, page 6). How can I adapt the actor-policies that get their policy variables from the reverb variable container? I would like to use a different epsilon-greedy policy for each actor.
Actors are differentiated by `FLAGS.task` in the example. I tried to pass an epsilon-greedy policy to `actor.Actor.policy`, but that resulted in the following error:
Is there a simple way to implement what I want to do?