Policy's action with an action mask not respecting its min/max bounds #295

Open
alfoudari opened this issue Jan 31, 2020 · 1 comment
Labels: level:p2, type:bug (Something isn't working)

alfoudari commented Jan 31, 2020

The following unit test fails because the policy produces actions outside the action_spec min/max bounds:

https://github.com/abstractpaper/tf-agents/blob/d06b2fcf4fecb3e052224dd257ad3e1b9ef7d4b0/tf_agents/policies/q_policy_test.py#L234-L269

Running the unit test:

(.venv) aziz@merlin:~/code/agents$ python -m unittest tf_agents.policies.q_policy_test.QPolicyTest.testActionWithinBoundsWithMasking
culprits: [(2, 8), (264, 8), (822, 8), (881, 8)]
F
======================================================================
FAIL: testActionWithinBoundsWithMasking (tf_agents.policies.q_policy_test.QPolicyTest)
testActionWithinBoundsWithMasking (tf_agents.policies.q_policy_test.QPolicyTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/aziz/code/agents/tf_agents/policies/q_policy_test.py", line 269, in testActionWithinBoundsWithMasking
    self.assertTrue(np.all(action >= 0) and np.all(action < num_actions))
AssertionError: False is not true

----------------------------------------------------------------------
Ran 1 test in 0.115s

FAILED (failures=1)
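
For readers reproducing this, the kind of bounds check that yields a "culprits" list like the one above can be sketched in plain NumPy. This is only an illustration: `find_out_of_bounds` is a hypothetical helper, and the sample actions and `num_actions=8` are made-up values, not data from the failing run or code from the linked test.

```python
import numpy as np


def find_out_of_bounds(actions, num_actions):
    """Return (index, action) pairs for actions outside [0, num_actions).

    Hypothetical helper for illustration; not part of TF-Agents.
    """
    actions = np.asarray(actions).reshape(-1)
    bad = np.flatnonzero((actions < 0) | (actions >= num_actions))
    return [(int(i), int(actions[i])) for i in bad]


# Made-up sample data (assumption): five actions, eight valid action indices.
sample = np.array([0, 3, 8, 7, -1])
culprits = find_out_of_bounds(sample, num_actions=8)
print("culprits:", culprits)
```

An empty list corresponds to the passing case, where `np.all(action >= 0) and np.all(action < num_actions)` holds.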

When I remove observation_and_action_constraint_splitter, so that no action mask is specified, the test passes:

policy = q_policy.QPolicy(ts.time_step_spec(input_tensor_spec), action_spec, q_net)

(.venv) aziz@merlin:~/code/agents$ python -m unittest tf_agents.policies.q_policy_test.QPolicyTest.testActionWithinBoundsWithMasking
culprits: []
.
----------------------------------------------------------------------
Ran 1 test in 0.051s

OK

Is this behavior expected?
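
As a point of comparison, here is a minimal NumPy sketch of what masked greedy action selection is generally expected to do: disallowed actions get a very low score, so the argmax lands on an allowed index within the bounds. This is an assumption-laden illustration, not TF-Agents' actual implementation and not a diagnosis of this issue; `masked_greedy_action` and the sample arrays are hypothetical.

```python
import numpy as np


def masked_greedy_action(q_values, mask):
    """Pick the highest-valued action among those the mask allows.

    Hypothetical sketch; assumes every row of `mask` allows at least one action.
    """
    q_values = np.asarray(q_values, dtype=np.float32)
    mask = np.asarray(mask, dtype=bool)
    # Replace disallowed entries with the lowest representable float.
    lowest = np.finfo(q_values.dtype).min
    masked = np.where(mask, q_values, lowest)
    return np.argmax(masked, axis=-1)


# Made-up Q-values and mask (assumption): batch of 2, three actions each.
q = np.array([[0.1, 0.9, 0.5], [0.7, 0.2, 0.3]])
m = np.array([[1, 0, 1], [0, 1, 1]])
actions = masked_greedy_action(q, m)
print(actions)
```

The returned indices are relative to the mask's columns, so under this sketch they always fall in [0, num_actions). If an action spec has a nonzero minimum, an offset would presumably have to be applied on top.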

tfboyd (Member) commented Feb 5, 2020

Ok, I feel silly. This is not a failing unit test. You created a unit test to illustrate the question regarding some behavior. Eugene will have to handle that question.

tfboyd added level:p2 and removed level:p1 labels Feb 5, 2020
tfboyd removed their assignment Feb 5, 2020