Categorical_q_policy's action tensor is missing a dimension compared to policies with the same tensor specs.

For the setup
the resulting action tensor has shape=(1,)
While for random policy

    from tf_agents.policies.random_tf_policy import RandomTFPolicy
    r_policy = RandomTFPolicy(ts_spec, act_spec)
    r_policy.action(ts)

and pure q-policy

    from tf_agents.networks import q_network
    from tf_agents.policies import q_policy
    q_net = q_network.QNetwork(obs_spec, act_spec)
    q_pol = q_policy.QPolicy(ts_spec, act_spec, q_net)
    q_pol.action(ts)

the resulting action tensor has shape=(1,1).
This issue results in an error when running categorical-q-agent's collect policy, where epsilon_greedy needs to switch between categorical_q_policy and random policy:
==> InvalidArgumentError: Inputs to operation Select of type Select must have the same size and shape. Input 0: [1] != input 2: [1,1] [Op:Select]
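
To illustrate why the two shapes cannot be combined, here is a minimal sketch that uses NumPy as a stand-in for the Select op. It is not tf_agents code: `strict_select` is a hypothetical helper that rejects mismatched shapes the way the error above does (plain `np.where` would silently broadcast), and the action values are made up. The final reshape is only one possible workaround, not a confirmed fix for the library.

```python
import numpy as np


def strict_select(cond, a, b):
    """Hypothetical stand-in for a Select op that refuses to broadcast."""
    a, b = np.asarray(a), np.asarray(b)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} != {b.shape}")
    return np.where(cond, a, b)


# Illustrative values only; the shapes match the ones reported in this issue.
greedy_action = np.array([2])    # shape (1,), as from categorical_q_policy
random_action = np.array([[0]])  # shape (1, 1), as from RandomTFPolicy

try:
    strict_select(True, greedy_action, random_action)
except ValueError as err:
    print("select failed:", err)

# Possible workaround (an assumption): give the greedy action the missing
# dimension before the two candidates are combined.
aligned = np.reshape(greedy_action, random_action.shape)
print(strict_select(True, aligned, random_action).shape)
```

With the shapes aligned, the selection goes through, which suggests that the mismatch alone is enough to trigger the failure in epsilon_greedy.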