
dealing with mjWARN_BADQACC #788

Open
vittorione94 opened this issue Oct 20, 2022 · 1 comment

@vittorione94

Hello,

This is more of a feature enhancement. For environments that are unstable, i.e. that rarely trigger mjWARN_BADQACC, it would be great if, instead of abruptly stopping training, the step returned a termination state.

I'm thinking of wrapping this line in a try/except:

next_time_step = self.env.step(action_step.action)

What do you think? Is this something that would make sense?
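A minimal sketch of the idea, with `PhysicsError`, the environment API, and `run_episode` as illustrative stand-ins for the real dm_control/TF-Agents types (in TF-Agents the real call would be `self.env.step(action_step.action)` inside the driver):

```python
class PhysicsError(RuntimeError):
    """Stand-in for the error raised alongside mjWARN_BADQACC."""


def run_episode(env, policy, max_steps=1000):
    """Roll out one episode, treating a physics blow-up as termination."""
    ts = env.reset()
    for _ in range(max_steps):
        action = policy(ts)
        try:
            ts = env.step(action)
        except PhysicsError:
            # Diverged simulation: end this episode, not the whole training.
            break
        if ts.is_last():
            break
    return ts
```

The point is that the exception is caught at the rollout level, so the collector moves on to the next episode instead of the training process dying.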

Best,
-Vittorio

@vittorione94 (Author) commented Oct 25, 2022

In the end, I'm using this fix:

    try:
      ts = convert_time_step(self._env.step(action))
    except Exception:  # ideally catch dm_control.rl.control.PhysicsError only
      # Physics diverged (mjWARN_BADQACC): end the episode instead of crashing.
      observation = self._env._task.get_observation(self._env.physics)
      reward = 0.
      discount = 0.9
      ts = dm_env.TimeStep(
          dm_env.StepType.LAST, reward, discount, observation)
      ts = convert_time_step(ts)
    return ts

inside dm_control_wrapper; this is the line I modified:

return convert_time_step(self._env.step(action))
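The same fix can also live in a standalone wrapper, so the underlying environment class stays unmodified. Below is a self-contained sketch under stated assumptions: `PhysicsDivergenceError`, `TimeStep`, and `StepType` are illustrative stand-ins for dm_env's types, and `SafeStepWrapper` is a hypothetical name, not an existing API:

```python
import enum
from typing import Any, NamedTuple


class StepType(enum.Enum):
    MID = 1
    LAST = 2


class TimeStep(NamedTuple):
    step_type: StepType
    reward: float
    discount: float
    observation: Any


class PhysicsDivergenceError(RuntimeError):
    """Stand-in for the error dm_control raises on mjWARN_BADQACC."""


class SafeStepWrapper:
    """Wraps an env so a diverged physics step ends the episode cleanly."""

    def __init__(self, env):
        self._env = env
        self._last_observation = None

    def step(self, action) -> TimeStep:
        try:
            ts = self._env.step(action)
            self._last_observation = ts.observation
            return ts
        except PhysicsDivergenceError:
            # The simulation blew up: return a terminal step with zero reward.
            # Discount 0.0 marks a true episode end; the fix above uses 0.9,
            # which treats the cutoff more like a time-limit truncation.
            return TimeStep(StepType.LAST, 0.0, 0.0, self._last_observation)
```

One design note: returning the last valid observation (rather than querying the diverged physics) avoids propagating NaNs into the replay buffer.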
