Feature Request - Hyperparameter Tuning Framework #86

Closed
JaCoderX opened this issue Dec 8, 2018 · 3 comments

Comments

@JaCoderX
Contributor

JaCoderX commented Dec 8, 2018

@Kismuz I've been doing quite a bit of research and experimentation with BTGym lately. You have built a really impressive framework that is very rich and interesting to work with. There are so many possibilities and research directions already present in the library to explore and play with.

The system has a lot of moving parts, so I started looking for an easier way to speed up my experimentation with different architectures and hyperparameters.

DeepMind has proposed a framework to efficiently explore and exploit the hyperparameter space (mentioned in #82 under Population Based Training).

Ray's 'Tune' library already has this framework implemented and ready for integration into RL projects. The general integration steps are as follows:

  1. Run the Tune service
  2. Give Tune access to the hyperparameters you want it to control (example: learn_rate = tune_config['lr'])
  3. Register your training class/function with Tune
  4. Expose evaluation metrics so the hyperparameter search has a result to optimize (loss, accuracy...)
  5. Configure the Tune framework parameters
  6. Start training
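
Steps 2 to 4 amount to a small contract between the training code and the tuner, sketched below in plain Python. This is not Tune's API: `train_fn`, `report`, and `run_trials` are hypothetical names, the quadratic loss is a toy stand-in for a real training run, and the random-search driver only takes the place of whatever the Tune service would do.

```python
import random


def train_fn(tune_config, report):
    """Toy trainable: reads hyperparameters from tune_config, reports a metric."""
    learn_rate = tune_config["lr"]  # step 2: the tuner controls this value
    weight = 5.0
    for step in range(50):
        grad = 2.0 * weight  # gradient of the toy loss weight**2
        weight -= learn_rate * grad
        report(step=step, loss=weight ** 2)  # step 4: expose the metric


def run_trials(trainable, sample_config, num_trials, seed=0):
    """Hypothetical stand-in for the tuner: plain random search over configs."""
    rng = random.Random(seed)
    results = []
    for _ in range(num_trials):
        config = sample_config(rng)
        history = []
        trainable(config, lambda **metrics: history.append(metrics))
        results.append((history[-1]["loss"], config))
    # Return the (final loss, config) pair with the lowest final loss.
    return min(results, key=lambda r: r[0])


best_loss, best_config = run_trials(
    train_fn,
    sample_config=lambda rng: {"lr": 10 ** rng.uniform(-4, -1)},
    num_trials=8,
)
print(best_loss, best_config)
```

The point of the sketch is that the training code never picks its own learning rate and never decides what "good" means; both cross the boundary through `tune_config` and `report`.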

For small projects the integration is straightforward. For BTGym, I tried to examine the code and it seems to be more complex.

Ideally, we could have a tune_config in the launcher that controls hyperparameters for the other configs (env_config, policy_config, trainer_config, cluster_config), so we can dynamically choose which hyperparameters stay fixed and which ones the system explores.
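
One possible shape for such a tune_config, sketched under assumptions: keys are written as `<config_name>.<parameter>` so that a single flat dict can override entries in the other configs. The helper `apply_tune_config`, the key format, and the parameter names (`learn_rate`, `hidden_size`, and so on) are illustrative only and are not taken from BTGym.

```python
import copy


def apply_tune_config(base_configs, tune_config):
    """Return a copy of base_configs with tuned values written over fixed ones.

    Keys of tune_config look like "trainer_config.learn_rate" (an assumed
    convention); anything not named there keeps its fixed value.
    """
    merged = copy.deepcopy(base_configs)
    for dotted_key, value in tune_config.items():
        config_name, param = dotted_key.split(".", 1)
        if config_name not in merged:
            raise KeyError(f"unknown config section: {config_name}")
        merged[config_name][param] = value
    return merged


# Hypothetical fixed defaults for the four configs named in the issue.
base_configs = {
    "env_config": {"episode_length": 1000},
    "policy_config": {"hidden_size": 128},
    "trainer_config": {"learn_rate": 1e-4, "entropy_beta": 0.01},
    "cluster_config": {"num_workers": 4},
}

# Values proposed by the tuner for one trial.
tune_config = {
    "trainer_config.learn_rate": 3e-4,
    "policy_config.hidden_size": 256,
}

configs = apply_tune_config(base_configs, tune_config)
```

Because the defaults are deep-copied, every trial starts from the same fixed values and only the keys listed in tune_config differ between trials.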

We would also need a section to control the Tune trial scheduler parameters:

  1. Which framework to use (Population Based Training or another one offered by Tune)
  2. How hyperparameters are updated (random, Bayesian..., and from what distribution or range)

And finally, we need a way for launcher.run() to properly interact with Tune.run_experiments(...)

An example from Ray Tune can be found here.

@Kismuz, if you think this is worthwhile and feasible to implement, and we can come up with a good design, I can try to implement it.

@Kismuz
Owner

Kismuz commented Dec 9, 2018

@JacobHanouna, smart hyperparameter search is an excellent idea but extremely computationally expensive in the DRL case; note a chilling comment in the example code you pointed at :) :

Note that this requires a cluster with at least 8 GPUs in order for all trials
to run concurrently, 
....

I'll take a closer look to see what can be done here, but no earlier than in 3-5 days; I'm a bit busy developing a combined model-based/model-free approach, which looks very promising.

@JaCoderX
Contributor Author

JaCoderX commented Dec 9, 2018

@Kismuz actually, PBT shouldn't be so computationally expensive. Keeping the cost down was one of the DeepMind team's objectives when they created this search optimization framework.

The idea was to find a balance between random search, which makes choosing hyperparameters trivial and runs in parallel but needs many trials, and Bayesian optimization, which chooses hyperparameters more cleverly but is computationally heavy and largely sequential.

PBT uses random hyperparameter selection, but in a smart way. It compares the performance of the current models and replaces the poorly performing ones with new random hyperparameters that are close to those of a well-performing model. In effect, it is a smart random evolutionary optimizer.
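
The exploit-and-explore step described above can be sketched in a few lines. This is a simplified illustration, not DeepMind's or Tune's implementation: the population is a list of dicts, the bottom quarter copies the weights and hyperparameters of a member from the top quarter, and each copied hyperparameter is perturbed by a factor of 0.8 or 1.2. The fractions, the factors, and all names are assumptions chosen for the example.

```python
import copy
import random


def exploit_and_explore(population, rng, fraction=0.25, factors=(0.8, 1.2)):
    """One PBT-style update; each member is {"score", "hparams", "weights"}.

    Members are modified in place. Scores are left untouched and only become
    meaningful again after the member has been trained and re-evaluated.
    """
    ranked = sorted(population, key=lambda m: m["score"], reverse=True)
    cutoff = max(1, int(len(ranked) * fraction))
    top, bottom = ranked[:cutoff], ranked[-cutoff:]
    for member in bottom:
        donor = rng.choice(top)
        if donor is member:  # only possible for a population of size one
            continue
        # Exploit: take over the state of a well-performing member.
        member["weights"] = copy.deepcopy(donor["weights"])
        # Explore: randomly perturb the donor's hyperparameters.
        member["hparams"] = {
            name: value * rng.choice(factors)
            for name, value in donor["hparams"].items()
        }
    return population


rng = random.Random(0)
population = [
    {"score": score, "hparams": {"lr": lr}, "weights": [score]}
    for score, lr in [(0.9, 1e-3), (0.5, 3e-3), (0.2, 1e-2), (0.1, 3e-2)]
]
exploit_and_explore(population, rng)
print(population[3]["hparams"])
```

Here the worst member (score 0.1) ends up with the best member's weights and a learning rate near 1e-3, while the other three members continue unchanged. All members keep training in parallel, which is why the extra cost over a plain random search of the same size is small.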

@Kismuz
Owner

Kismuz commented Dec 9, 2018

@JacobHanouna, agreed; I've already run through the paper and the idea is captivating indeed; I'll study it more closely and respond.

@Kismuz Kismuz closed this as completed Jul 10, 2019