EnvironmentSteps tf_metric bug with parallel envs #790

vittorione94 · 2022-10-25T16:30:20Z

I think there's a bug in the TensorFlow EnvironmentSteps metric.

Say we're using a parallel environment with 10 envs and set collect_steps_per_iteration to 5 in a DynamicStepDriver.
I would expect the metric to return 50 after the driver finishes its run function, but it returns 10. To reproduce this easily, take an example file such as ddpg and set these two parameters. With a single (non-parallel) env it works fine and correctly returns 5.
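
The two ways of reading collect_steps_per_iteration can be sketched in plain Python. This does not call TF-Agents: `simulate_driver` and its `per_env` flag are hypothetical names made up for illustration, episode-boundary steps are ignored, and the idea that the driver's step budget might be a total summed across all parallel envs (rather than a per-env count) is an assumption that happens to match the value of 10 I'm seeing, not something I have confirmed in the library code.

```python
def simulate_driver(num_envs, num_steps, per_env):
    """Return the env steps an EnvironmentSteps-style counter would report.

    Hypothetical stand-in for a step driver plus a step-counting metric.
    per_env=True  : num_steps is interpreted as steps taken in each env.
    per_env=False : num_steps is interpreted as a total across all envs
                    (an assumption, not verified against TF-Agents).
    """
    counted = 0      # what the metric accumulates
    budget_used = 0  # what the driver compares against num_steps
    while budget_used < num_steps:
        # One batched step advances every parallel env once,
        # so the metric sees num_envs new transitions.
        counted += num_envs
        budget_used += 1 if per_env else num_envs
    return counted


expected = simulate_driver(num_envs=10, num_steps=5, per_env=True)
observed = simulate_driver(num_envs=10, num_steps=5, per_env=False)
single_env = simulate_driver(num_envs=1, num_steps=5, per_env=False)
print(expected, observed, single_env)
```

Under the per-env reading the counter ends at 50, which is what I expected. Under the summed reading a single batched step already exceeds the budget of 5, so the counter ends at 10, and the single-env case gives 5 under either reading.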

Could anyone look into this, or explain to me if there's something wrong with my reasoning?

Best,
-Vittorio

coreyleveen · 2022-11-05T02:01:23Z

I believe the metric is only keeping track of train steps, rather than steps collected by the driver. This makes sense because if your initial collect driver ran, for example, 100,000 steps to partially fill your replay buffer, and then the EnvironmentSteps metric displayed 100,000 steps before the agent even began training, this could be misleading.

Perhaps try changing train_steps_per_iteration to 5 as well and see if that leads to the change in the metric value you expected.
