
dealing with mjWARN_BADQACC #788

Open
vittorione94 opened this issue Oct 20, 2022 · 1 comment

@vittorione94

Hello,

This is more of a feature enhancement. For environments that are unstable, i.e. that rarely trigger mjWARN_BADQACC, it would be great if, instead of abruptly stopping training, the step returned a termination state.

I'm thinking of wrapping this line in a try/except:

next_time_step = self.env.step(action_step.action)

What do you think? Is this something that would make sense?
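A minimal sketch of the idea, with `PhysicsError`, the environment API, and `run_episode` as illustrative stand-ins for the real dm_control/TF-Agents types (in TF-Agents the real call would be `self.env.step(action_step.action)` inside the driver):

```python
class PhysicsError(RuntimeError):
    """Stand-in for the error raised alongside mjWARN_BADQACC."""


def run_episode(env, policy, max_steps=1000):
    """Roll out one episode, treating a physics blow-up as termination."""
    ts = env.reset()
    for _ in range(max_steps):
        action = policy(ts)
        try:
            ts = env.step(action)
        except PhysicsError:
            # Diverged simulation: end this episode, not the whole training.
            break
        if ts.is_last():
            break
    return ts
```

The point is that the exception is caught at the rollout level, so the collector moves on to the next episode instead of the training process dying.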

Best,
-Vittorio

@vittorione94 (Author) commented Oct 25, 2022

In the end, I'm using this fix:

    try:
      ts = convert_time_step(self._env.step(action))
    except Exception:  # ideally catch dm_control.rl.control.PhysicsError only
      # Physics diverged (mjWARN_BADQACC): end the episode instead of crashing.
      observation = self._env._task.get_observation(self._env.physics)
      reward = 0.
      discount = 0.9
      ts = dm_env.TimeStep(
          dm_env.StepType.LAST, reward, discount, observation)
      ts = convert_time_step(ts)
    return ts

inside dm_control_wrapper; this is the line I modified:

return convert_time_step(self._env.step(action))
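The same fix can also live in a standalone wrapper, so the underlying environment class stays unmodified. Below is a self-contained sketch under stated assumptions: `PhysicsDivergenceError`, `TimeStep`, and `StepType` are illustrative stand-ins for dm_env's types, and `SafeStepWrapper` is a hypothetical name, not an existing API:

```python
import enum
from typing import Any, NamedTuple


class StepType(enum.Enum):
    MID = 1
    LAST = 2


class TimeStep(NamedTuple):
    step_type: StepType
    reward: float
    discount: float
    observation: Any


class PhysicsDivergenceError(RuntimeError):
    """Stand-in for the error dm_control raises on mjWARN_BADQACC."""


class SafeStepWrapper:
    """Wraps an env so a diverged physics step ends the episode cleanly."""

    def __init__(self, env):
        self._env = env
        self._last_observation = None

    def step(self, action) -> TimeStep:
        try:
            ts = self._env.step(action)
            self._last_observation = ts.observation
            return ts
        except PhysicsDivergenceError:
            # The simulation blew up: return a terminal step with zero reward.
            # Discount 0.0 marks a true episode end; the fix above uses 0.9,
            # which treats the cutoff more like a time-limit truncation.
            return TimeStep(StepType.LAST, 0.0, 0.0, self._last_observation)
```

One design note: returning the last valid observation (rather than querying the diverged physics) avoids propagating NaNs into the replay buffer.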
