I think there's an error in the TensorFlow EnvironmentSteps metric.
Let's say we're using a parallel environment (with 10 envs) and setting collect_steps_per_iteration to 5 in a DynamicStepDriver.
I would expect the metric to return 50 after the driver finishes its run function, but it returns 10. To reproduce this easily, take an example file like ddpg and set these two parameters. However, it works fine with a single (non-parallel) env: the metric correctly returns 5.
Could anyone look into this, or explain to me if there's something wrong with my reasoning?
Best,
-Vittorio
I believe the metric only keeps track of train steps, rather than steps collected by the driver. This would make sense: if your initial collect driver ran, for example, 100,000 steps to partially fill your replay buffer, and the EnvironmentSteps metric then displayed 100,000 steps before the agent even began training, that could be misleading.
Perhaps try changing train_steps_per_iteration to 5 as well and see if that leads to the change in the metric value you expected.
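A small simulation may help frame the numbers reported above (10 with ten parallel envs, 5 with a single env). It is only a sketch under one assumption that has not been checked against the TF-Agents source here: that the driver's step budget is compared against the total number of transitions collected across all parallel environments, and that each driver iteration steps every environment once. The function `simulate_driver` and its parameters are hypothetical names for illustration, not part of any library.

```python
def simulate_driver(num_envs, num_steps):
    """Hypothetical model of a step-budgeted driver over batched environments.

    Assumption: `num_steps` is a budget on the TOTAL number of transitions
    collected across all environments, and every driver iteration steps all
    environments at once.
    """
    total_steps = 0  # what a metric summing over the batch would report
    iterations = 0   # number of batched environment steps taken
    while total_steps < num_steps:
        total_steps += num_envs  # one batched step yields num_envs transitions
        iterations += 1
    return total_steps, iterations


if __name__ == "__main__":
    print(simulate_driver(num_envs=10, num_steps=5))   # (10, 1)
    print(simulate_driver(num_envs=1, num_steps=5))    # (5, 5)
    print(simulate_driver(num_envs=10, num_steps=50))  # (50, 5)
```

If that assumption holds, both observations are consistent with a metric that counts collected steps: a budget of 5 is already exceeded after one batched step of 10 envs, and collecting 50 transitions would need a budget of 50 rather than 5.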