Using the `add_batch` method from `PyUniformReplayBuffer` throws an `IndexError: tuple index out of range`. When you do something similar with a `TFEnv`, all the trajectories are batched, but this is not the case with a `PyEnv`. I think that is why the observer `lambda x: buffer.add_batch(batch_nested_array(x))` works and the observer `buffer.add_batch` doesn't. Below are some code examples.
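To make the failure mode concrete, here is a minimal, dependency-free sketch. It is not the TF-Agents implementation: `Step`, `shape`, `add_outer_dim` and `ToyBuffer` are hypothetical stand-ins, and the assumption is only that `add_batch` reads the batch size from the leading dimension of each field. An unbatched scalar reward has shape `()`, so indexing its first dimension raises the same kind of `IndexError`.

```python
from collections import namedtuple

# Hypothetical stand-in for a trajectory; the real one has more fields.
Step = namedtuple("Step", ["observation", "reward"])


def shape(x):
    """Return the shape of a nested list, or () for a scalar."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        if not x:
            break
        x = x[0]
    return tuple(dims)


def add_outer_dim(step):
    """Wrap every field in a list, i.e. add a batch dimension of size 1."""
    return Step(*[[field] for field in step])


class ToyBuffer:
    """Toy buffer that assumes every field has a leading batch dimension."""

    def __init__(self):
        self.items = []

    def add_batch(self, step):
        # Fails for an unbatched scalar reward, whose shape is ().
        batch_size = shape(step.reward)[0]
        for i in range(batch_size):
            self.items.append(Step(step.observation[i], step.reward[i]))


unbatched = Step(observation=[0.1, 0.2], reward=1.0)
buffer = ToyBuffer()

# Passing the unbatched step directly fails.
try:
    buffer.add_batch(unbatched)
    error = None
except IndexError as exc:
    error = str(exc)

# Adding the outer dimension first makes the same call succeed.
buffer.add_batch(add_outer_dim(unbatched))
```

In this sketch `add_outer_dim` plays the role that `batch_nested_array` plays in the working observer above.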
I think the best way to solve this is to batch the trajectory in `PyDriver`, to stay consistent with `TfDriver`. This would require only a single change to the `.run()` method of `PyDriver`.
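As a rough illustration of the proposed change, the sketch below shows a driver loop that adds the outer batch dimension before calling its observers. It makes no claim about the actual `PyDriver.run()` code: `ToyPyDriver`, `batch_nested` and the `batch_trajectories` flag are hypothetical names, and trajectories are modeled as plain dicts.

```python
def batch_nested(nest):
    """Recursively add an outer dimension of size 1 to every leaf of a nest."""
    if isinstance(nest, dict):
        return {key: batch_nested(value) for key, value in nest.items()}
    if isinstance(nest, tuple):
        return tuple(batch_nested(value) for value in nest)
    return [nest]  # leaf: a scalar or a list of values


class ToyPyDriver:
    """Hypothetical driver that batches each trajectory before observing it."""

    def __init__(self, observers, batch_trajectories=True):
        self._observers = observers
        self._batch_trajectories = batch_trajectories

    def run(self, trajectories):
        for traj in trajectories:
            # The single proposed change: batch once, inside the driver.
            if self._batch_trajectories:
                traj = batch_nested(traj)
            for observer in self._observers:
                observer(traj)


received = []
driver = ToyPyDriver(observers=[received.append])
driver.run([{"observation": [0.1, 0.2], "reward": 1.0}])
```

With batching done in the driver, an observer such as `buffer.add_batch` could be passed directly, without wrapping it in a lambda.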
Yes, sure, this would work, but my issue is about the inconsistency in the API. For a `TFEnv` (which batches by default) it is not needed, but for a `PyEnv` (which doesn't batch by default) it is, and that is quite confusing. Personally, it took me quite a while to understand what caused this issue.