Projection Network for more than 1 action with differing action spaces #694

Open
3 tasks done
sidney-tio opened this issue Jan 3, 2022 · 5 comments

Comments

sidney-tio commented Jan 3, 2022

Following the discussion in #37, I developed a MultiCategoricalProjectionNetwork that splits the logits and creates a Categorical distribution for each action. I tried to follow the existing pattern as closely as I could. It can be found here: https://gist.github.com/sidney-tio/66abada949f1b629dd9ee28777d402d5
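
For readers who have not opened the gist, the splitting idea can be sketched in plain Python. This is only an illustration under assumptions, not the code from the gist and not the TF-Agents API: the function names (`split_logits`, `softmax`, `sample_multi_categorical`) and the example sizes are hypothetical.

```python
import math
import random
from typing import List, Sequence


def split_logits(flat_logits: Sequence[float], num_actions: Sequence[int]) -> List[List[float]]:
    """Cut one flat logit vector into one chunk per sub-action."""
    if len(flat_logits) != sum(num_actions):
        raise ValueError(
            "expected %d logits, got %d" % (sum(num_actions), len(flat_logits))
        )
    chunks, start = [], 0
    for n in num_actions:
        chunks.append(list(flat_logits[start:start + n]))
        start += n
    return chunks


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one chunk of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_multi_categorical(
    flat_logits: Sequence[float], num_actions: Sequence[int], rng: random.Random
) -> List[int]:
    """Draw one index per sub-action, each from its own categorical distribution."""
    action = []
    for chunk in split_logits(flat_logits, num_actions):
        probs = softmax(chunk)
        action.append(rng.choices(range(len(probs)), weights=probs, k=1)[0])
    return action


# Hypothetical example: two sub-actions with 4 and 3 choices respectively.
logits = [0.1, 0.2, 0.3, 0.4, 1.0, 2.0, 3.0]
chunks = split_logits(logits, [4, 3])
action = sample_multi_categorical(logits, [4, 3], random.Random(0))
```

The network only has to emit `sum(num_actions)` logits; everything after that is slicing, which is why sub-actions with different numbers of choices fit into a single output layer.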

If the team would like, I could raise a PR based on the gist I developed. From what I see, these are the to-dos to make it PR-worthy:

  • add tests
  • add more detailed docstrings
  • add masks

@sguada
Member

sguada commented Jan 4, 2022

Have you tried using a nested action space instead, in which each action can have a different number of choices?

@sidney-tio
Author

No, I don't think I have tried that. I assume you are referring to something like a gym.spaces.Dict type of nested structure where we could specify {'action1': 4, 'action2': 3}? Could you elaborate further?

@sguada
Member

sguada commented Jan 12, 2022

Yeah, you can use gym.spaces.Dict or nested ArraySpecs directly to define the actions. Each one can then have its own Categorical distribution, and sampling will sample all of them.
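
A rough sketch of what "sampling will sample all of them" could look like, using a plain dict as a stand-in for the nested action structure. The names `sample_categorical`, `sample_nested`, and the logit values are assumptions made for illustration; this is not how TF-Agents implements nested distributions.

```python
import math
import random
from typing import Any, List


def sample_categorical(logits: List[float], rng: random.Random) -> int:
    """Sample one index from unnormalized logits."""
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def sample_nested(logits_tree: Any, rng: random.Random) -> Any:
    """Walk a nested dict of logit lists and sample every leaf."""
    if isinstance(logits_tree, dict):
        return {name: sample_nested(sub, rng) for name, sub in logits_tree.items()}
    return sample_categorical(logits_tree, rng)


# Hypothetical per-action logits: 4 choices for one action, 3 for the other.
logits_tree = {
    "action1": [0.0, 0.0, 0.0, 0.0],
    "action2": [1.0, 0.5, -1.0],
}
nested_action = sample_nested(logits_tree, random.Random(0))
```

The sampled action mirrors the structure of the input, so each entry keeps its own name and its own number of choices.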

@sidney-tio
Author

My current workflow is to generate a spec from a gym.spaces.MultiDiscrete instance before creating the network.

I can see why a nested action space would be useful. I also just tried starting from a gym.spaces.Dict; I would need to loop through the iterable before generating the respective Categorical distributions.

I'll add a function to check for an iterable and extract the relevant information. Let me make the changes and, if it's okay, I will raise a draft PR.
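
One possible shape for such a helper, sketched in plain Python. It assumes the action sizes have already been pulled out of the space into ints, lists, or dicts; the name `extract_action_sizes` and the path-style keys are hypothetical and are not taken from the PR, gym, or TF-Agents.

```python
from collections.abc import Iterable, Mapping
from typing import Any, List, Tuple


def extract_action_sizes(description: Any, prefix: str = "") -> List[Tuple[str, int]]:
    """Flatten an int, a list of ints, or a (nested) dict into (name, size) pairs."""
    if isinstance(description, int) and not isinstance(description, bool):
        if description < 1:
            raise ValueError("an action needs at least one choice")
        return [(prefix or "action", description)]
    if isinstance(description, str):
        raise TypeError("strings are not valid action descriptions")
    if isinstance(description, Mapping):
        children = description.items()
    elif isinstance(description, Iterable):
        # List-like input: use the position as the name of each entry.
        children = enumerate(description)
    else:
        raise TypeError("unsupported description: %r" % (description,))
    sizes = []
    for name, sub in children:
        child_prefix = "%s/%s" % (prefix, name) if prefix else str(name)
        sizes.extend(extract_action_sizes(sub, child_prefix))
    return sizes


# List-like input (sizes as a MultiDiscrete-style list) and dict-like input.
from_list = extract_action_sizes([4, 3])
from_dict = extract_action_sizes({"action1": 4, "action2": 3})
```

Both inputs end up as the same flat list of sizes, so the code that builds the Categorical distributions does not need to know which kind of space it started from.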

@sidney-tio
Author

Hello, not sure if this was missed, but the PR for this issue is up. Could I request a review, please?

2 participants